Whew, that was exhausting. Have I really been working on this post for more than a month? Even if my second attempt at getting my computer repaired is likely to disturb this wishful thinking, expect less posting and more coding in the upcoming weeks, as I miss the smell of fresh code in the morning quite a bit.
Computer user interfaces have always been a matter of heated debate. One recurring theme, for which unscrupulous marketing claims are likely to blame, is the belief that popular new user interface concepts must somehow be ground-breaking, and should eventually replace their older cousins entirely for the greater good. We saw it with mouse-driven GUIs, and now we’re seeing it again with touch interfaces. In this article, I will attempt to challenge this ideology, and show how modern user interfaces can be functionally described using concepts from the 70s like commands and data pipes. I will then go on to explain how this remarkable consistency could be used to build more powerful and versatile user interfaces.
Command, and the computer shall obey
But first, what is it that I call a command? For the purpose of this article, I will use a definition that is fairly close to that of a function in an imperative programming language: a written order to the machine, in an English-like language, that may take a number of parameters so as to define the specifics of said order. Now, how widespread is the command concept inside modern user interfaces?
- Command line interfaces are fairly obviously command-based, and represent the extreme case where users write commands themselves. They are undiscoverable, and thus reserved for expert use, as it’s pretty much impossible to figure out what a CLI-driven machine is capable of and how to do it without the help of extra documentation. However, CLIs scale well (a computer can provide thousands of commands without issue, though at some point namespace collisions can become a problem), and are easily scriptable (just put a bunch of newline-separated commands in a text file).
- Imperative programming languages, which comprise the vast majority of programming languages in use today and are thus the main way software gets written, are command-based by definition. They have very similar properties to command line interfaces, as CLIs essentially use an imperative programming language for man-machine communication.
- Menus are the visual representation of a hierarchical tree of commands, typically accompanied by limited documentation in the form of tooltips, and by keyboard shortcuts for efficient expert use. Extra command parameters, if applicable, are provided either through submenus or pop-up windows. A well-designed menu hierarchy makes the feature set of software discoverable without much pain, but this approach scales less well to large feature sets than CLIs, as even the best hierarchy will start to become confusing at a few hundred entries. Like other standard GUI controls, current-gen menus are not scriptable unless software provides explicit support for it.
- Toolbars contain permanently visible shortcuts for software commands, generally duplicating those available through the menu hierarchy. In most software, they are just a bunch of buttons, organized in a simple geometrical layout and labeled with either textual names or just a small picture if space is lacking (with the hope that people will remember to check tooltips so as to understand what each button does). However, more clever combinations exist, such as the URL bar of modern web browsers, which triggers a search within the web history with a pop-up display as the user types, and gives access to another command, going to a URL, as the user validates a choice with the Return key, all within a single interface element. Toolbars scale much more poorly to large amounts of commands than menus, and a hundred simultaneously visible toolbar controls is already too much.
- Contextual “pop-up” menus are another way to take shortcuts in the command hierarchy, this time by restricting the displayed commands to those that can apply to the right-clicked object, or a frequently used subset of them in complex software. Apart from this, they are fairly similar to menus, with the catch that they must contain far fewer items to be useful, perhaps no more than a dozen top-level items.
- Finger gestures are the latest fancy way we have found to interface with computers. Supposed to enable a whole new range of computer activities like mouse-driven GUIs did in their time, they have so far failed to be anything more than a more portable replacement for the computer mouse, one that uses screen real estate less efficiently but provides a more intuitive way to do fine-grained scrolling and zooming. More than about six gestures on a single screen is a sure recipe for disaster, and no way has been found so far to cleanly document them to users in a non-intrusive fashion. Gestures can be considered command shortcuts, with the gesture amplitude optionally acting as a parameter.
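The common thread in the list above can be sketched in a few lines of code: whether the front-end is a shell, a menu, or a gesture recognizer, each user action ultimately resolves to a named order plus parameters. Here is a minimal, illustrative dispatcher (all command names are made up for the example):

```python
# A minimal sketch of the "command" abstraction: every UI element
# (CLI line, menu entry, toolbar button, gesture) resolves to a
# named order with parameters. Names here are purely illustrative.

def open_document(path):
    return f"opening {path}"

def zoom(factor):
    return f"zooming by {factor}"

# One command table can back a CLI, a menu hierarchy, and gestures alike.
COMMANDS = {
    "open": open_document,
    "zoom": zoom,
}

def run(line):
    """Execute one script line of the form 'name arg1 arg2 ...'."""
    name, *args = line.split()
    return COMMANDS[name](*args)

# Scripting is then just feeding newline-separated commands from a file:
print(run("open report.txt"))  # -> opening report.txt
```

A menu click would call `run("open report.txt")` just as readily as a shell script would, which is the whole point: the front-ends differ, the command table does not.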
So it seems that, after all, most modern computer user interface concepts still relate to giving orders to the machine, either explicitly (as with command lines, menus, and basic “button” toolbar controls) or implicitly (as with more complex toolbar controls and touchscreen gestures): they are all directly or indirectly command-based. Due to this, one could well imagine putting software UI and logic in two separate processes that solely communicate via RPC commands, and this is actually how I would like to build my UI layer, as discussed before on this blog.
To quickly recall this, UI rendering and event handling could be done within a system process, instead of by a shared system library called within a user process as is common nowadays. This, in turn, allows for some significant performance optimizations: applications could start up faster (no need to re-initialize a GUI toolkit all over again), respond more quickly to user input even when the system is under load (since the UI rendering process is a system process, it can be entrusted with the ability to run its basic event handlers at real-time priority), and use up fewer system resources (if all UIs are managed within the boundaries of a single process, that process need only maintain one copy of resources that are shared between applications at run time, whereas library-based toolkits can only do this with static resources without breaking process isolation).
But it does not stop there. One could also envision controlling software with such RPC-based backends from the command line, provided that the shell scripting language is sufficiently advanced to manipulate the required programming concepts (one could imagine an “rpc PID.RPCFunc()” syntax, as an example). Aside from giving more power to people who prefer CLI interfaces, such as those who are physically unable to use a keyboard and a mouse, this would also make GUI software easily scriptable, since authors would only have to standardize and document the RPC interface used by the GUI layer somewhere in order to turn it into an API. It would become possible to separate user interface design work from backend design work more cleanly than is the case today, a bit like Qt Quick does, but even after the backend has been compiled to machine code. I think these are all interesting perspectives to look at.
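To make the idea concrete, here is a hedged sketch of what such an RPC surface could look like. The system-wide registry, the `Backend` class, and the shell-style `rpc()` helper are all hypothetical stand-ins for whatever real IPC mechanism (D-Bus, XML-RPC, and so on) an actual system would use; the point is only that a GUI slider and a shell script would drive the backend through the exact same call:

```python
# Hypothetical sketch: GUI layer and shell both drive one backend
# through a single RPC surface. Registry and names are invented.

class Backend:
    """Application logic, which would live in its own process."""
    def __init__(self):
        self.volume = 50

    def set_volume(self, value):
        # Clamp to a sane range, as any real backend would.
        self.volume = max(0, min(100, value))
        return self.volume

# A stand-in for a system-wide RPC registry, keyed by process id.
RPC_REGISTRY = {}

def rpc(pid, func, *args):
    """What a shell line like 'rpc 1234.set_volume(80)' would resolve to."""
    return getattr(RPC_REGISTRY[pid], func)(*args)

backend = Backend()
RPC_REGISTRY[1234] = backend

# The GUI's volume slider and a user's script issue the same command:
print(rpc(1234, "set_volume", 80))  # -> 80
```

Documenting the `set_volume`-style surface once would simultaneously specify the GUI's behaviour and the scripting API, which is the separation of concerns argued for above.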
SaaP (Software as a Plumber)
Now, commands as an abstraction are enough to represent the way users give computer software work to do, but as part of this work, software often has to exchange data with hardware. Think about loading stuff from a mass storage device, sending audio and video to multimedia I/O peripherals, communicating over a network…
In the 70s, back when the most sophisticated kind of content which computers could deal with was unformatted ASCII text, the UNIX designers tried to standardize all OS/hardware interfaces through the concepts of files (a named container in which text can be stored and/or retrieved) and pipes (an IPC interface used to transfer text between programs). At the time, heavy use of text-based communication protocols helped solve the complex problem of communication between computers of heterogeneous architectures.
But now, although we still communicate with software by giving orders to it, things have changed a lot on this front, as we computer users hardly live in a purely text-based world anymore. Pictures, sound, and video, as an example, cannot be efficiently encoded as ASCII text, which is why no sane person does it. Even text itself, when formatted, is often distributed in a compressed binary form, so as to save bandwidth. And the rare bits of raw text that remain are not encoded using ASCII either, but using some variant of Unicode, so these old byte-based pipes may not even be able to carry modern streams of text in a machine-agnostic way without the help of an extra protocol anymore. As it happens, it seems that the OS cannot impose a One True Structure on the streams of data that flow between processes, and that so-called “binary protocols”, based on arbitrarily structured streams of data, have won.
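The point about Unicode and byte pipes is easy to demonstrate: a raw byte channel only carries modern text safely if both ends have agreed on an encoding, which is precisely the "extra protocol" mentioned above. A quick illustration in Python:

```python
# A byte pipe is encoding-agnostic: the same text round-trips fine
# when both ends agree on UTF-8, and turns to garbage when they don't.

text = "héllo Ünïcode"

# Sender and receiver agree on UTF-8: the round trip is lossless.
assert text.encode("utf-8").decode("utf-8") == text

# Sender emits UTF-16, receiver assumes UTF-8: the bytes no longer decode.
wire = text.encode("utf-16")
try:
    wire.decode("utf-8")
except UnicodeDecodeError:
    print("undecodable bytes: the pipe itself carries no encoding info")
```

Nothing in the pipe abstraction itself signals which case you are in; that metadata has to travel out of band, which is why "just bytes" stopped being a sufficient interchange contract.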
Even if UNIX pipes are now pretty much dead and buried under layers upon layers of protocols and abstractions as far as end-user software is concerned, the idea of a “dumb structured data transfer channel”, or message passing channel for short, is very much alive. Software needs to pass structured data around to other software or hardware, or to store it in files, all the time. So if OS frameworks managed to provide a set of standard data structures that fit the needs of modern software, along with standard typed channels that can efficiently pass all of these objects around – which is very hard work, don’t misunderstand me – then we could envision applying the powerful idea of a re-routable data pipe to more sophisticated work than console computer operation.
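What a "typed channel" could look like in practice can be sketched quickly. The `AudioChunk` type and the channel class below are inventions for the example, standing in for the hypothetical standard OS data structures discussed above; the one idea being illustrated is that the channel checks the message type at its boundary instead of shuttling opaque bytes:

```python
# Sketch of a typed message-passing channel: the pipe carries one
# standard structured type, checked on send. Types are illustrative.

import queue
from dataclasses import dataclass

@dataclass
class AudioChunk:
    """One of the hypothetical standard OS data structures."""
    sample_rate: int
    samples: list

class TypedChannel:
    """A pipe that accepts only one message type."""
    def __init__(self, msg_type):
        self.msg_type = msg_type
        self._queue = queue.Queue()

    def send(self, msg):
        if not isinstance(msg, self.msg_type):
            raise TypeError(f"channel carries {self.msg_type.__name__} only")
        self._queue.put(msg)

    def receive(self):
        return self._queue.get()

ch = TypedChannel(AudioChunk)
ch.send(AudioChunk(sample_rate=44100, samples=[0.0, 0.5, -0.5]))
print(ch.receive().sample_rate)  # -> 44100
```

Because both endpoints know they are handling `AudioChunk` values rather than bytes, such channels can be rerouted between arbitrary programs without each pair negotiating a private wire format.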
The JACK Audio Connection Kit and its proprietary cousin Rewire are two technologies in existence today which showcase how much potential such an idea has, in the specific realm of audio processing. In a beautiful extension of the UNIX spirit to audio data, instead of having audio enthusiasts deal with huge monolithic audio-processing black boxes with bloated, OS-specific and proprietary plug-in standards, JACK proposed to simply use lots of small, independent audio programs together, routing their audio outputs into each other’s inputs. But the potential of reroutable audio streams as a core system abstraction could actually go even further: imagine, for a second, routing the output of different applications to different physical sound card outputs. On a laptop with both line-out and headphone jacks, the line-out could be sent to an amplifier and used as a “public” output for stuff which the computer user wants to share around, while the headphones serve as a “private” output for stuff which he prefers to keep to himself. As a related concept, imagine a scenario in which a computer application could output data on another computer’s speakers without having been explicitly designed for it. Such is the great power of standard data communication channels…
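The routing model behind this, ports patched together at run time by the user rather than by the application author, can be sketched in a few lines. To be clear, this is not the JACK API, just a toy patch bay capturing the shape of the idea; all port names are invented:

```python
# Toy sketch of JACK-style re-routing (not the JACK API): programs
# expose ports, and the user rewires connections at run time.

class Port:
    def __init__(self, name):
        self.name = name
        self.buffer = []    # pending block of samples

# The patch bay: connections are just (source, sink) pairs.
connections = []

def connect(src, dst):
    connections.append((src, dst))

def pump():
    """Move one block of samples along every connection."""
    for src, dst in connections:
        dst.buffer.extend(src.buffer)
        src.buffer.clear()

synth_out = Port("synth:out")
line_out = Port("card:line-out")      # "public" output, to the amplifier
headphones = Port("card:headphones")  # "private" output

# The user, not the synth's author, decides where its audio goes:
connect(synth_out, headphones)
synth_out.buffer = [0.1, 0.2]
pump()
print(headphones.buffer)  # -> [0.1, 0.2]
```

Rerouting the synth to the line-out, or to a port exposed over the network by another machine, would just be a different `connect()` call, which is exactly the "public output / private output" scenario described above.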
What it could be like
Noise in the lecture hall miraculously toned down as the professor entered the room and put his backpack on the desk. He then grabbed his computer from it, connected the room’s various devices to it, and began to speak.
“Good morning everyone, and welcome to the “Programming for Artists” lecture!”, he said. “My name is Ernest Doppler and I am going to show you how computer systems can be used for artistic purposes beyond being a replacement for real-world tools. But first, let me announce that due to popular demand, live audio and video streams for this course are now freely available from my computer, and that you can freely record them for private use.”
The students noisily rejoiced at this news, especially Stephanie Desmond, in the front row, who suffered heavy hearing loss of genetic origin and had a hard time making sense of speech even with the assistance of a hearing aid. She quickly looked up the teacher’s name on the university’s internal network, found his computer, and connected the audio output of his lavalier microphone to her favorite speech-to-text software. This resulted in instant improvements to the quality of the transcript as compared to the one obtained from her computer’s cheap and ill-placed internal mic. She then laid back and enjoyed the combined view of the teacher and the textual transcription of his voice, the transparent text on her holographic display acting like a foreign movie’s subtitles.
Meanwhile, the professor silently whispered admonitions into the open audio channel of the noisiest students’ phone headsets. Then, as everyone calmed down, he resumed his lecture’s introduction.
“For centuries, artistic use of computers has mostly revolved around manipulating digital equivalents of physical creation tools. It is only around the first third of the 21st century, as embedded computers became increasingly cheap and easy to program, and as compact holographic displays and sensors began to enter the consumer market, that computer geeks, industrial designers and artists started to look at each other with mutual professional interest in the eye. What emerged from their cooperation is today known as sensitive objects: small and affordable reprogrammable computers that serve no other purpose than being a host for interactive works of art.
Sensitive objects and the associated content were a huge commercial success, largely eclipsing the sales of other artistic creations for more than 10 years. As usual, many analysts predicted that other forms of art such as music, writing, photography or cinema would completely disappear in favor of sensitive object-based variants. And as usual, they were wrong. What was true, however, was that the demand for artists capable of designing sensitive object “percepts” soon exceeded the – at the time – meager supply of artists who could do more with a computer than use it as a glorified pencil and shout at it when it didn’t work. This is when many cross-disciplinary programs like the one you are following, at the frontier between science and art studies, made their appearance.”
The professor took a breath as note-taking students caught up with him. Here’s yet another area where analysts turned out to be completely wrong, he thought with a smile on his face. Although virtual paper sheets have nowadays largely replaced physical ones as a support for lecture notes, hand-written notes remain the main way through which students learn their courses. Throughout the tumultuous history of teaching in the digital era, no amount of disastrous attempts at “e-lectures” managed to negate the biological fact that many students have a strongly gesture-driven memory, and won’t properly remember something if they don’t make the effort of summarizing it in their head and writing that summary down first.
“My goal is that at the end of this course, all of you will be able to take advantage of your other artistic and scientific courses in order to build wonderful lively things that anyone with a commercial sensitive object can admire and play with. To this end, I will first show you how you can take advantage of the artistic software installed on your computer to quickly build percept prototypes that you can easily show around and use as drafts, then discuss in later lessons the more hard-core programming practices that will be required to turn such a prototype into a finished work of art that even the most ancient and low-end sensitive objects out there can render without issue. Let’s conclude this introduction with a first example of a percept’s design process…”
The professor switched software on his computer, from the course’s slides to a program that filled the classroom’s display area with black.
“This program is a fairly simple example of a sound generator. It detects the position of my hand in a plane, and emits a sound whose pitch and phase – if you remember that part of your acoustics course – are directly linked to its position. As I move my hand from left to right, the pitch flows from low to high, and as I move my hand up and down, the phase is adjusted. I have also added a pretty colorful effect at the position of the finger so as to give the thing a more “magical” feeling that encourages children to toy with it.”
The professor demonstrated that behaviour by waving his hand above his desk in a synaesthesia of sound and sparkles, and a few rather scientifically-minded students routed the audio output of the room’s speakers to a spectrum analyzer on their computer so as to get a better feel of how the sound generation worked.
“Now, developing such a sensitive object percept takes quite a bit of time. In fact, it’s the kind of project that you can expect to work on at the end of this semester. But when I initially came up with the idea, I did not want to spend a few weeks dealing with computer code before knowing how it would look and feel in practice. I wanted to get a version of that program quickly running on my computer, and then to toy around with it a bit until I had a fair idea of what I wanted to create. Fortunately, modern computer operating systems and software let us do exactly that.”
And having pronounced these words, the professor switched back to the course slides, showing a large diagram that described how the program worked.
“The OS will take care of the hard job of locating a hand in the holographic sensor’s active region for us, since it needs that to react to us pointing at things in our software with fingers and pencils. This is done through what is called the “pointer management subsystem” – there are some documents on the lecture’s web page if you want to know more about that one – which mathematically abstracts the hand as a dot in a box, complemented with other details that are of no interest to us right now. For sound generation purposes, I took an organ MIDI virtual instrument that I had around for musical work, and for the sparkles, I used a 3D real-time drawing program.
Since this software which I already had at hand did the most difficult work, all I had left to do was to write a small program that acts as glue between these various programs and makes them work together. For the sound part, I did this by taking a look at the MIDI standard so as to know how musical notes are written in my organ instrument’s language. For the visual part, I used my 3D modeler’s scripting interface – more on that in a minute – to programmatically generate a black scene, add a particle generator to it, and then move it around to follow the hand, showing and hiding it as needed.
Doing all this just required me to read and understand the standard documents describing the input and output of my various programs, so as to know what kind of stuff my own program was supposed to receive from the OS and send to its audio and video slaves. And thus, in half a day, I had a minimalistic program that made use of my other software to achieve the desired effect, and that I could use to toy around with and improve my understanding of what I wanted to do.
Now, this quick-and-dirty version of the program was only useful for making up my mind about what I wanted to achieve, since in order to work, it used a virtual instrument and a 3D drawing program that I have paid for and simply cannot give away to every sensitive object user. So as a second step, I wrote a new, more complex version of the program that relied only on standard OS components that sensitive objects are guaranteed to ship with. I thus had to write my own sound and visual effect generators, which took a lot more time to get perfectly right, which is why I was happy that I had first checked the feasibility of my idea on a small prototype.”
The professor then took off his glasses, rubbed them against his clothes to clean them up, and put them back on his nose.
“Now, just to evaluate your level of programming culture, who among you can tell me what a variable is?”