System design 4 – Inter-Process Communication

A core part of any micro-kernel operating system design is the need to slice it into many independent processes. Independence means that they may work without knowing about their neighbors. It doesn't mean that they can't communicate with each other. In fact, communication between processes is extremely important in such a process-centric design. In the core of this operating system, there will be several different ways for processes to communicate with each other, in order to fit various performance/functionality needs. Even though this seems to stray from the micro-kernel philosophy, it allows cleaner code to be developed and hence favors reliable and efficient code. Let's go into more details about this.

SIGNALS

You've already heard about interrupts, those tiny bells that the processor can ring when it requests attention. Well, signals are the equivalent of interrupts at the process/thread level: they are bells that other processes (including the core) may ring in order to attract that process's attention. Signalling should by its very nature be fast, but does not have to carry much information. It may be combined with other forms of inter-process communication when more data needs to be transmitted.

As we've stated before, there are two ways a process may handle a signal:

-The signal modifies the behavior of an existing thread. This includes a waiting thread being awakened by a signal, or a thread abruptly aborting its current operation because of the signal (often required for error handling purposes).

-A new thread is created when the process receives the signal, running a signal handler function which takes care of the signal by itself, without disturbing the operation of the rest of the process.

Signals will, like interrupts, be assigned a unique integer number: there will be signal 11, signal 124, and so on. Some signals will be reserved by the system, while others are freely usable by applications.
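
To make this more concrete, here is a minimal sketch, in C++, of what the second handling option could look like: a per-process table mapping signal numbers to handler functions, with each handler run in a fresh thread. The names (register_handler, deliver_signal, signal 11) are purely illustrative and not an actual kernel API; the detached thread and the sleep only simulate delivery inside a single test program.

#include <chrono>
#include <functional>
#include <iostream>
#include <map>
#include <thread>

using SignalNumber = int;
using SignalHandler = std::function<void(SignalNumber)>;

std::map<SignalNumber, SignalHandler> signal_table;   // filled by the process itself

void register_handler(SignalNumber sig, SignalHandler handler) {
    signal_table[sig] = std::move(handler);
}

// Called by the (simulated) kernel when another process rings the bell.
void deliver_signal(SignalNumber sig) {
    auto it = signal_table.find(sig);
    if (it == signal_table.end()) return;      // no handler registered: signal ignored here
    std::thread(it->second, sig).detach();     // a new thread runs the handler
}

int main() {
    register_handler(11, [](SignalNumber sig) {
        std::cout << "Handling signal " << sig << " without disturbing the rest\n";
    });
    deliver_signal(11);                         // simulate an incoming signal
    std::this_thread::sleep_for(std::chrono::milliseconds(100));  // let the handler run
}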

DATA SENDING/RECEIVING

A little less primitive and more powerful than signalling is the ability for processes to send each other chunks of data (text, pictures, and so on). Most older OSes only know how to send byte streams between processes. This was fine in the old days of computing when text-only interfaces were king, but nowadays it is an unneeded burden, forcing people to convert their data to bytes all by themselves.

Take, as an example, a webcam driver. It gets a low-res video stream from the USB device. It decodes it and turns it into a stream of images coming at a regular rate. Okay, so now what? Should we encode it back to an array of bytes, send it to somebody, then translate it back into an image, knowing that some byte values (like 0) will be misunderstood by the system due to text-only interface standards? No. Clearly, one should be able to send a stream of data of any type to another process, without thinking about it much more than by writing something like process.send(object).

Actually, for safety and convenience reasons, there will be two kinds of data streams available: typed data streams (only one kind of data may go through them) and untyped data streams (any kind of data may go through them). The former are easier to manage, include error protection, and cover most use cases, so they should be used where possible. The latter are provided for the cases where they are really needed (a common use case being displaying various things on screen, from integers to pictures, through a single consistent interface). Encrypted versions of data streams may be provided later if they prove to be useful for security purposes, but since it's just a matter of layering encryption on top of the usual data streams, it can be handled by shared libraries and hence support for it is not needed in the kernel.
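
As an illustration of the typed variety, here is a small sketch of how a library could enforce the "one kind of data" rule at compile time with a C++ template. The TypedStream class and the Image payload are assumptions made up for the example, and an in-process queue stands in for the real transport.

#include <deque>
#include <iostream>
#include <optional>
#include <string>

template <typename T>
class TypedStream {
public:
    void send(const T& value) { buffer_.push_back(value); }   // only T is accepted
    std::optional<T> receive() {
        if (buffer_.empty()) return std::nullopt;
        T value = buffer_.front();
        buffer_.pop_front();
        return value;
    }
private:
    std::deque<T> buffer_;
};

struct Image { int width, height; };   // placeholder payload type

int main() {
    TypedStream<Image> video;
    video.send(Image{640, 480});
    // video.send(std::string("oops"));   // would not compile: wrong type for this stream
    if (auto frame = video.receive())
        std::cout << frame->width << "x" << frame->height << "\n";
}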

Sent and received data will be buffered. This way, processes won't get blocked while their data is being sent by the OS. Processes will be able to have as many data inputs/outputs as they want, though they will have, by default, 2 text output flows and 1 text input flow, for legacy reasons. Streams will be managed like mailboxes: input streams have some storage space where incoming data is kept until the receiver wishes to take a look at it. If it's full, old data will be deleted in order to make room. As with mailboxes, the receiver has the option of asking for a signal-based notification when data is received. It may also block and wait for a "mail", but for obvious stability reasons this behavior won't be recommended or encouraged.
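
Below is a rough sketch of this mailbox behavior, under the assumption that a bounded queue backs each input stream: when the box is full the oldest item is dropped to make room, and a callback stands in for the signal-based notification. The Mailbox class and its method names are illustrative, not the actual interface.

#include <cstddef>
#include <deque>
#include <functional>
#include <iostream>

template <typename T>
class Mailbox {
public:
    explicit Mailbox(std::size_t capacity) : capacity_(capacity) {}

    void set_notification(std::function<void()> callback) { notify_ = std::move(callback); }

    void deliver(const T& item) {
        if (box_.size() == capacity_) box_.pop_front();   // full: drop the oldest mail
        box_.push_back(item);
        if (notify_) notify_();                           // "you've got mail" signal
    }

    bool take(T& out) {
        if (box_.empty()) return false;
        out = box_.front();
        box_.pop_front();
        return true;
    }

private:
    std::size_t capacity_;
    std::deque<T> box_;
    std::function<void()> notify_;
};

int main() {
    Mailbox<int> input(2);
    input.set_notification([] { std::cout << "data received\n"; });
    input.deliver(1);
    input.deliver(2);
    input.deliver(3);            // 1 is silently dropped to make room
    int value;
    while (input.take(value)) std::cout << value << "\n";   // prints 2 then 3
}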

SHARED MEMORY

With signals and data sending, inter-process communication at a core level is starting to shape up. But there's a drawback to data sending: when a process sends a huge chunk of data to another one, a long and memory-consuming copy of that data has to be made. A copy-on-write system (not actually copying the data unless the second process tries to modify the "copy" it received) may be envisioned, but for small amounts of data it's overkill, so arbitrary thresholds have to be set, which is a bad programming practice because it doesn't track hardware evolution. And then there's the problem of several processes collaborating on a single chunk of data: lots of copies are needed in order to propagate each process's updates, even when using copy-on-write.

All this is unneeded on desktop computers: since the RAM is freely available to all processors, and shared between them, all processes could just make use of a single chunk of memory (provided that they're careful and make sure that they don't all write to the same place at the same time). This violates the principle that a process's memory should remain private if it wants it to be, however. So this should only be done when both processes want it.

The protocol I propose, inspired by the way most IM software shares files, is the following: using some means agreed upon in advance, Process 1 notifies Process 2 of its sharing intentions, and then tells the OS it wants to share some data. The OS sets that data apart from other data for security reasons, then makes it ready to be shared between processes once Process 2 accepts. It then sends a pointer/reference to the not-yet-shared data to Process 2. All Process 2 has to do in order to acknowledge the sharing is to use that pointer, and the OS will then turn memory sharing on automatically. The OS should provide the mechanism; the rest can be done through library functions.
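
The handshake described here is specific to this OS, but POSIX shared memory gives a rough idea of the mechanism the kernel has to provide: a region set apart under a name, then mapped into the address space of whoever uses the reference to it. The sketch below runs in a single process for brevity, error handling is minimal, and the name "/demo_shared_frame" is made up.

#include <cstddef>
#include <cstring>
#include <fcntl.h>      // shm_open, O_* flags
#include <sys/mman.h>   // mmap, munmap, shm_unlink
#include <unistd.h>     // ftruncate, close

int main() {
    const char* name = "/demo_shared_frame";
    const std::size_t size = 4096;

    // "Process 1": ask the OS for a region meant to be shared.
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0) return 1;
    if (ftruncate(fd, size) != 0) return 1;
    void* region = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (region == MAP_FAILED) return 1;
    std::strcpy(static_cast<char*>(region), "hello from process 1");

    // "Process 2" (here the same process, for brevity) maps the same region:
    // using the reference it received is what acknowledges the sharing.
    void* view = mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
    if (view == MAP_FAILED) return 1;

    munmap(view, size);
    munmap(region, size);
    close(fd);
    shm_unlink(name);
    return 0;
}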

REMOTE PROCEDURE CALLS

Shared memory is a way to improve sending/receiving of data in some specific cases. Almost anything it does could be done through send/receive functions, but it gives developers more power and performance. RPC is the same thing, but for signalling. When you send a signal to a process, it's generally because you want it to perform a specific task. But what if accomplishing this task requires some input from the requesting process? Sure, we can send the required input through data streams, but that's complicated and error-prone (what if you send a pointer?). Therefore, a better mechanism has to be implemented in order to run functions of a distant process with parameters.

First, said process has to tell the OS that it actually wants other processes to be able to call some of its private functions, for obvious security and stability reasons. It does so by stating what the function is called and which kinds of parameters it takes (much in the same way it declares it). Then the requesting process may use some operating system facilities to run that function from a distance, with automatic pointer management thrown in. There should also be some way to handle functions that return a result to the calling process, even though this requires the caller to wait while the other process works (and possibly hangs), and hence is only there for the sake of completeness.
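
As a purely illustrative sketch (not the actual kernel interface), here is the shape of such a mechanism reduced to a single process: a registry in which a "server" exports a function under a name, and through which a "client" calls it. Cross-process transport and marshalling of arbitrary parameter types are exactly what the real kernel mechanism would add on top of this. export_function, remote_call and decode_frame are made-up names, and every exported function is narrowed to taking and returning an int for brevity.

#include <functional>
#include <iostream>
#include <map>
#include <string>

using ExportedFunction = std::function<int(int)>;

std::map<std::string, ExportedFunction> exported;

void export_function(const std::string& name, ExportedFunction fn) {
    exported[name] = std::move(fn);        // the owning process opts in explicitly
}

int remote_call(const std::string& name, int argument) {
    return exported.at(name)(argument);    // throws if no such function was exported
}

int main() {
    // The "server" process declares which private function may be called remotely.
    export_function("decode_frame", [](int frame_id) { return frame_id * 2; });

    // The "client" process calls it by name, as if it were local.
    std::cout << remote_call("decode_frame", 21) << "\n";   // prints 42
}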

SYNCHRONIZATION

When several threads or processes work on the same part of memory, it must sometimes be ensured that they're not granted access to it at the same time. It's easier to visualize when talking about hardware rather than memory: do you want two word processors to print on the same printer at the same time? One prints one line, the other prints another line… Got it?

There are several options to solve this problem, but the most primitive one, and often the basis of all the others, is the semaphore. It's about limiting access to a resource (be it whatever you want) to a fixed number of processes at a time (generally one, in which case our semaphore is called a mutex, for MUTual EXclusion). Semaphores sound simple, but actually making them work requires help from the operating system and the hardware, and there is still research going on today about how to make them better.
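
To make the idea concrete, here is a minimal counting semaphore built from standard C++ primitives; a count of one gives the mutex case. A real kernel would implement this with atomic hardware instructions and its scheduler rather than on top of std::mutex, so treat it purely as an illustration of the semantics.

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

class Semaphore {
public:
    explicit Semaphore(int count) : count_(count) {}

    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });   // sleep until a slot is free
        --count_;
    }

    void release() {
        std::lock_guard<std::mutex> lock(m_);
        ++count_;
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

int main() {
    Semaphore printer(1);   // a count of 1: effectively a mutex around the printer
    std::vector<std::thread> jobs;
    for (int i = 0; i < 3; ++i) {
        jobs.emplace_back([&printer, i] {
            printer.acquire();
            std::cout << "document " << i << " printed in one piece\n";
            printer.release();
        });
    }
    for (auto& job : jobs) job.join();
}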

Another way of synchronizing processes and threads is called a barrier. It’s about waiting for all threads/processes to have completed a specific task before continuing the work. Use of it should be pretty obvious.
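
For completeness, here is a short sketch of a barrier using std::barrier from C++20, used only as a convenient stand-in for whatever primitive this OS will offer: each worker finishes its step, then all of them cross the barrier together before the next phase starts.

#include <barrier>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    constexpr int workers = 3;
    std::barrier sync_point(workers, [] () noexcept {
        std::cout << "-- all workers done, next phase --\n";
    });

    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&sync_point, i] {
            std::cout << "worker " << i << " finished its task\n";
            sync_point.arrive_and_wait();   // block until everyone reaches this point
        });
    }
    for (auto& worker : pool) worker.join();
}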

Managing synchronization across processes rather than threads makes the task a little harder, but not that much. It's essentially about making the synchronization variable shared between the processes, which we are now able to do.
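
For instance, POSIX (again only a stand-in for the mechanism this OS will provide) allows exactly that by placing the mutex in a shared mapping and marking it as process-shared:

#include <pthread.h>
#include <sys/mman.h>

int main() {
    // Anonymous shared mapping: both sides of a fork() would see the same mutex.
    void* region = mmap(nullptr, sizeof(pthread_mutex_t), PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) return 1;
    auto* mutex = static_cast<pthread_mutex_t*>(region);

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);  // usable across processes
    pthread_mutex_init(mutex, &attr);

    pthread_mutex_lock(mutex);     // any process mapping this region would block here
    // ... touch the shared data ...
    pthread_mutex_unlock(mutex);

    pthread_mutex_destroy(mutex);
    pthread_mutexattr_destroy(&attr);
    munmap(region, sizeof(pthread_mutex_t));
    return 0;
}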

PROCESS CAPABILITIES

We've now described the bulk of our inter-process communication concepts, but a last refinement should be added to make the concept really complete. It's about how one finds a process to communicate with. The most obvious identifier is the file name of the program, along with the place where it's stored if several programs answer to the same description. It does work indeed, as long as there aren't two programs with the same name around, and as long as the user does not modify it.

The second problem may be avoided by secretly storing the original file name of the program somewhere and/or giving it an "internal name" that doesn't change and is told to the OS as the program is initialized. But that does not solve the first issue, which is less common but should still be examined.

How should we reduce the probability that two processes with the same name conflict with each other? By knowing what the process we look for does. Indeed, having two processes with the same name doing the same thing is unlikely… unless we only identify what a process does by its name. Oops, that's a logical loop, which doesn't sound good.

We may get out of this loop by having each process carry around an array of data, called "capabilities", describing one or more tasks that the process is able to do. That array is filled by the process itself as it starts. Now if we want some task done, we look for a process that's able to do it. If there are several such processes, we ask the user which one we should choose. On further launches, we use that one and don't bother the user with the question again.
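
A toy sketch of such a capability registry follows; all the names in it (declare_capabilities, find_providers, "coolcam_driver", the "video-capture" capability string) are invented for the example, and the user prompt and remembered choice are only simulated.

#include <iostream>
#include <map>
#include <string>
#include <vector>

using ProcessName = std::string;
using Capability = std::string;

std::multimap<Capability, ProcessName> registry;

// Called by a process as it starts, to fill in its capability array.
void declare_capabilities(const ProcessName& process, const std::vector<Capability>& caps) {
    for (const auto& cap : caps) registry.emplace(cap, process);
}

// Called by a client looking for some task to be done.
std::vector<ProcessName> find_providers(const Capability& task) {
    std::vector<ProcessName> providers;
    auto [begin, end] = registry.equal_range(task);
    for (auto it = begin; it != end; ++it) providers.push_back(it->second);
    return providers;
}

int main() {
    declare_capabilities("coolcam_driver", {"video-capture", "webcam"});
    declare_capabilities("other_driver",   {"video-capture"});

    auto candidates = find_providers("video-capture");
    if (candidates.size() > 1)
        std::cout << "ask the user to pick among " << candidates.size() << " providers\n";
    // The answer would be remembered so the user isn't asked again on further launches.
}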

IPC CONCLUSION

So this ends this long discussion about the fascinating subject of inter-process communication. We've tried to be as complete as possible on the subject, in order to favor a multi-process programming model that is more crash-proof. You may have noticed that there is no "user space" section here. That is because IPC is intrinsically part of the notion of a process, and can't be moved to a separate user process or user library (except for some high-level parts, like the encrypted streams) without making that process more limited or more powerful than the others. This would contradict one of our main rules: work should only be moved out of the kernel if all processes remain equal. If not, there's no point in moving work out of the kernel: it does not make development cleaner, it rather makes it dirtier. And dirty programs generally aren't satisfactory.

The next issue to be discussed is security. Having kernel and user parts of the system separated is not useful if the kernel allows user processes to do anything they want, and hence there should be a reliable way to prevent them from doing what they're not supposed to do. We'll see how to achieve this later. For now, thank you for reading!

3 thoughts on “System design 4 – Inter-Process Communication”

  1. renoX May 18, 2010 / 4:22 pm

    A few questions:
    – typed data stream: do you check that the data provided are really of this type?
    Or the type is just a tag associated with the data, but there is no promise from the OS that the data really are of this type?

    – you wrote "All this is unneeded on desktop computers: since the RAM is freely available to all processors, and shared between them"; note that some AMD CPUs have several memory buses, so RAM access can be more or less expensive depending on which CPU accesses which memory bank. If the number of CPUs in desktop systems continues to increase, then it's likely that 'memory sharing' will become more and more artificial in the future (i.e. there will be local and remote memory accesses, and remote access, even if transparent, is costly).

  2. Hadrien May 19, 2010 / 3:48 pm

    On type checking: if you consider the C language family, there's no way I know of to say for sure "this data is of a given type". Cast data is considered by the compiler as being of the cast type, even for dynamic type identification.

    However, on the library side, I think you can, using templates, say “this is a stream of type X”, and make sure that said stream…
    -> Does not send more than sizeof(X) bytes (care must be taken, as usual, when serializing pointers).
    -> Causes compilation errors when used together with data of an improper type without a cast.

    This allows better reliability, even if data is cast. Here I'll show why.

    Let's suppose that data of type 1 is 8 bits long and data of type 2 is 48 bits long. Process C is a process waiting for data of type 2. Processes A and B send it data.

    The scenario is that process A is buggy, and sends type 1 data instead.

    1/Through a typed stream
    ————————
    An error should have occurred when compiling process A, because it cannot send type 1 data through a type 2 stream. Let's suppose that the developer is stupid enough to silence the error with a cast, and compiles and runs his program.

    Process A sends casted type 1 data. So it sends 8 bits of type 1 data, and 40 bits of junk located next to it in memory. Process B sends type 2 data, as usual.

    Process C first reads data sent by process A. It reads garbled type 2 data, but type 2 data nonetheless, so 48 bits of data. Then it reads data sent by process B. 48 more bits of data. The 96 bits of data in the stream have been read, so process C has nothing left to read.

    2/Through an untyped stream
    —————————
    No warning is issued during compilation of program A, so the developer knows nothing about his mistake.

    Process A sends data of type 1, hence it sends 8 bits of data. Process B sends data of type 2, 48 bits of data. Then process C comes along. It reads 48 bits of garbled data (8 bits from process A, 40 bits from process B). Then, depending on the stream implementation, it…
    1/Does nothing, thinking that the remaining 8 bits are some type 2 data that hasn't arrived yet.
    2/Reads the remaining 8 bits from the stream plus 40 bits of junk.
    3/Crashes.

  3. Hadrien May 19, 2010 / 4:17 pm

    About sharing: I don't think that the number of CPU cores will keep growing for long in the desktop area.

    More cores = less power per core or an increased die size.
    1/Reducing core power allows better power management. However, it means that processes which use only one thread (the vast majority of today's desktop software) will suffer from reduced performance. So until a lot more software has turned multicore-friendly, which is not going to happen soon, this solution will not be preferred.
    2/Increasing die size sounds like a more sensible option. However, let's not forget that electrical signals cannot travel faster than some physical limit. Hence we'll more or less end up having to make cores independent for speed reasons, since they can't communicate quickly with each other or with some central memory. The only way to overcome this issue is to have processors behave like independent computers on a slow network, including for system calls. Distributed operating systems working on such architectures are far beyond the acceptable complexity for an average desktop computer.

    So at some point, processor manufacturers will have to stop adding cores, and start introducing some serious, much faster new technology. Or software will have to shrink. I prefer the second option, myself. But anyway, I think that the benefits of sharing (imagine making several copies of an HD video frame!) exceed the drawbacks of its slowness on multiple cores.

    We’ll see, once code is here, which one does perform faster ^^
