I’ve just watched a video of the BlackBerry PlayBook’s UI. Among many interesting things, one small detail caught my attention: how the device multitasks is left up to the user, as the tablet can multitask in two distinct ways.
- Only system tasks may run in the background; applications are frozen or closed when they are put there (the iOS/Windows Phone 7 way)
- All applications are permanently running (the desktop OS/Symbian/Android way)
The fact that RIM engineers have left both options around is highly interesting, because it means that they haven’t found a single right answer to the multitasking problem. And indeed, there’s an ongoing debate about how multitasking should be done. Although the discussion is mostly about mobile devices, evolutions of multitasking could benefit the desktop as well. In this post, I take a deeper look at this, and suggest a way multitasking could work in this OS.
Two main approaches and one flawed compromise
At first sight, it looks like a fundamental schism.
Proponents of the “frozen background” approach say that users interact with only one application at a time, and that it’s therefore only natural to pause background tasks in order to save battery, improve performance, and avoid overloading the user’s mind. As an example, what happens on a phone when an SMS arrives while the user is watching a video or playing a game? Should the user have to manually pause the video or the game before reading this SMS, or should he just switch to the SMS application while those are silently paused?
Proponents of the “always on” approach say that having the system violently freeze applications without asking breaks a number of desirable behaviors. They can cite the example of a music streaming application that runs in the background while the user is typing a mail, or of situations where a network connection must be kept alive in order to avoid frustrating repetitive logins, as is the case with some proxies and VPNs. They also claim that one of the main benefits of multitasking is the ability to go and do something else while some calculation or download is running, a benefit that fully disappears with the “frozen background” approach.
It is obvious that in this debate, both sides are right, so compromises have been created. The simplest of these compromises was introduced by iOS 4, and as far as I know will also be used in the upcoming “multitasking” update to Windows Phone 7. It consists of letting applications hand a certain amount of work over to the system while they sleep in the background. As an example, a streaming application could hand the multimedia stream currently being played over to the operating system, so that if the application is put in the background the system can continue playing that stream.
It doesn’t take a genius to figure out that this is just a workaround for the limitations of the “frozen background” approach, and it won’t be sufficient for one simple reason: it’s the OS manufacturer that arbitrarily decides what applications can and can’t do in the background. That manufacturer then faces one of the main challenges known to anyone who’s been working on an OS for some time: it can’t plan for everything.
Let’s take Skype phone calls as an example. Two users communicate through an audio stream that’s encrypted using a proprietary algorithm. For this to work with this model, OS manufacturers would have to implement Skype’s proprietary algorithm in their OS. Every other algorithm that’s not explicitly vetted by them faces the same problem: if our streaming application wants to use a DRM scheme that has not been implemented in the OS, it can’t work in the background. If it uses a royalty-free audio codec that’s not approved by OS manufacturers, it can’t work in the background. And so on.
There are two ways proponents of this approach can address this issue: either meticulously introducing hacks in the system for every single operation that developers want to see happening in the background (like the famous “voip” iOS 4 hack, used by the Skype iOS app), thus creating yet another bloated and inconsistent OS, or putting their fascist hat on and banning every application that is not satisfied with the available set of operations. As neither of these options is desirable in the long run, it appears that this approach is a dead end, and that OS manufacturers have to trust developers a bit if they want to go any further.
Trusting developers… but not blindly
However, letting developers run tasks in the background certainly does not mean that the only way to do decent multitasking on a personal computing device is to simply run all threads from all processes simultaneously in a round-robin fashion. There are three things which the operating system can, and in my opinion should, do:
- On battery-powered devices, apply a stricter power management policy to background tasks, to make sure that they do not empty the battery while the device seems idle.
- Ensure that foreground tasks do not see their performance noticeably affected by background tasks.
- Freeze tasks which permanently require the user’s attention and have nothing else to do, like movie playback or games, when they are put in the background.
The first two items together form an interesting scheduling problem. System designers must start with the concept of a “foreground task”, the one which the user is currently interacting with, and find a way to automatically associate it with a number of threads. Those threads will be labeled as “foreground threads” and will be treated differently by the operating system. A higher priority is a bare minimum, but with developers’ cooperation and/or a well-done and widely used system API, it’s possible to do more. As an example, tasks which are scheduled to run at regular intervals may run less often once they are in the background. Tasks that do not need to run in the background may be completely frozen, as is the case with recent releases of Flash Player where graphics rendering is stopped when a video is in a background tab. And so on.
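One way to picture the “run less often” idea is a helper that stretches the interval a periodic task asked for once that task sits in the background. This is only a sketch: the names (`Placement`, `PowerSource`, `effectiveInterval`) and the stretch factors are assumptions made up for illustration, not part of any existing scheduler.

```cpp
#include <chrono>

// Hypothetical labels: where the owning task sits, and how the device is powered.
enum class Placement { Foreground, Background };
enum class PowerSource { Mains, Battery };

// Returns how often a periodic task actually gets to run.
// The factors 2 and 4 are arbitrary illustration values.
std::chrono::milliseconds effectiveInterval(std::chrono::milliseconds requested,
                                            Placement placement,
                                            PowerSource power) {
    if (placement == Placement::Foreground) {
        return requested;  // foreground tasks get exactly what they asked for
    }
    const int stretch = (power == PowerSource::Battery) ? 4 : 2;
    return requested * stretch;  // background tasks wake up less often
}
```

With these made-up factors, a mail client polling every minute would poll every two minutes in the background on mains power, and every four minutes on battery.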
The trick is that one should not optimize too aggressively. As a simple example, if a thread plays or records an audio stream, even if it belongs to a process in the background, it is in fact a foreground task and it has the requirements of foreground tasks (playback/recording must not skip). Imagine, as an example, a voice call where people are scheduling an appointment while looking at their calendars: the phone call must remain at top priority, with good communication quality and no connection drop.
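This rule can be written down as a small predicate: a thread deserves foreground treatment if its application has the user’s focus, or if it is busy with a live audio stream. A sketch, where `ThreadInfo` and `needsForegroundTreatment` are hypothetical names standing in for whatever the scheduler really knows about a thread:

```cpp
// Hypothetical snapshot of what the scheduler knows about a thread.
struct ThreadInfo {
    bool ownerHasFocus;     // the owning application is the one currently in use
    bool handlesLiveAudio;  // the thread plays or records an audio stream
};

// A thread is treated as foreground when the user is looking at its
// application, or when throttling it would make audio skip.
bool needsForegroundTreatment(const ThreadInfo& t) {
    return t.ownerHasFocus || t.handlesLiveAudio;
}
```

In the appointment example, the calendar has the focus and the call does not, yet the call’s audio thread still passes this test.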
When applications use system APIs, we can easily separate foreground and background tasks by implementing those notions at the core of said APIs. But as the iOS 4 example above shows, the system API can’t handle every single application’s needs, and we can’t expect all applications to use it, although most should. So we have to provide a mechanism for applications which do things their own way, permitting them to inform the system scheduler of their actions and help it decide. Preferably, this mechanism should also address the third concern mentioned above, by allowing highly interactive tasks to be frozen when they go into the background.
And this is the area where developers should be trusted.
Thread scheduling policies
We’re talking about multitasking here, so the core object is not the process but the thread. In the scheme described above, a thread may be in one of three qualitative states (which can exist in several flavors):
- Foreground: full priority, full resource consumption. This is how threads of an application that’s currently being used are treated.
- Daemon: although the thread is still running, it’s not considered important by the OS, and an aggressive power, network access, and CPU time management policy is in place to make sure that it doesn’t hurt foreground tasks’ performance and to reduce its impact on the device’s battery life if there’s a battery around. This is the state which most background services are in.
- Frozen: the thread doesn’t run at all. Suitable for threads which are only useful when they directly interact with the user, like those of games or office suites.
From that, we immediately extract our three main scheduling policies, specifying how threads should be treated when they are in the background: they may be treated as foreground tasks (e.g. audio communication), daemons (e.g. downloads, VPNs, computations, polling services like mail clients or SNS), or frozen tasks (e.g. games, toy apps).
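The relation between a thread’s policy and the treatment it actually receives can be sketched as a tiny lookup. The type and function names below (`BackgroundPolicy`, `Treatment`, `currentTreatment`) are assumptions chosen for this illustration:

```cpp
// How a thread asks to be treated once its application is in the background.
enum class BackgroundPolicy { Foreground, Daemon, Frozen };

// How the scheduler treats the thread right now.
enum class Treatment { Foreground, Daemon, Frozen };

Treatment currentTreatment(BackgroundPolicy policy, bool appInForeground) {
    // While the application is in use, every one of its threads is a foreground thread.
    if (appInForeground) {
        return Treatment::Foreground;
    }
    // Otherwise the policy decides.
    switch (policy) {
        case BackgroundPolicy::Foreground: return Treatment::Foreground;  // e.g. audio call
        case BackgroundPolicy::Daemon:     return Treatment::Daemon;      // e.g. download
        case BackgroundPolicy::Frozen:     return Treatment::Frozen;      // e.g. game
    }
    return Treatment::Daemon;  // not reached for valid policies; silences compiler warnings
}
```

The policy only matters once the application leaves the foreground, which is why a single per-thread setting is enough.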
A developer who knows what he’s doing should be able to select such a scheduling policy by himself. In a C-like language, this could work as follows: if a thread is created using something like this…
Thread* t = createThread();
…then we could offer an optional parameter to override the default scheduling policy, like this…
Thread* t = createThread(schedForeground);
…or offer a way to modify the thread’s scheduling policy afterwards through a dedicated function.
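A minimal sketch of both options, written in C++ so that the optional parameter is expressible. Only `createThread` and `schedForeground` come from the snippets above. `schedDaemon`, `schedFrozen`, `setSchedPolicy`, `destroyThread` and the contents of `Thread` are hypothetical names introduced here, and no real thread is spawned.

```cpp
// Background scheduling policies; only schedForeground appears in the text above,
// the other two names are assumed by analogy.
enum SchedPolicy { schedForeground, schedDaemon, schedFrozen };

// Stand-in for a real thread handle: it only records the requested policy.
struct Thread {
    SchedPolicy backgroundPolicy;
};

// Option 1: optional parameter at creation time.
// The default used here (daemon) is an assumption of this sketch.
Thread* createThread(SchedPolicy policy = schedDaemon) {
    return new Thread{policy};
}

// Option 2: change the policy of an existing thread (hypothetical function name).
void setSchedPolicy(Thread* t, SchedPolicy policy) {
    t->backgroundPolicy = policy;
}

// Releases the handle allocated by createThread.
void destroyThread(Thread* t) {
    delete t;
}
```

A game could then call `setSchedPolicy(t, schedFrozen)` on its rendering thread, while a VoIP client would create its audio thread with `createThread(schedForeground)`.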
And now is the time for the big question: when the system can’t make scheduling decisions based on API calls and the dev hasn’t specified anything, what should the default be? Daemon or frozen? Although we’ve managed to produce something which works well in all situations, we end up going back to a lighter variant of the dilemma above.
My take is to put background tasks in daemon mode by default, because it mirrors the way the world we’re living in works. Moving things don’t stop moving when we stop looking at them, although young babies are pretty much convinced that they do. So why should threads do that? There are some cases where it’s suitable, but it shouldn’t be the default, as otherwise inexperienced developers would make this OS yet another iOS where computations are stopped as soon as you stop watching them.