The OS-periment’s service model : release candidate

In this system, most system services are provided by processes that serve one another, in a fashion similar to the UNIX “daemon” model. If you want a task to be done, you contact the process responsible for it and ask it to do it. I have already explained earlier why I believe in this design (security, fault impact minimization, etc.). This post is less about the “why should I do that” side of things and more about exposing a potential design and seeing how well it holds up.

In this OS, giving work to a process is done through Remote Procedure Calls (RPCs). This mechanism was chosen because it is close to ordinary API calls and should therefore offer a familiar interface to developers who want to hand work to a daemon. Language constructs or “stub” library code may be used to make it look even closer to a normal function call. Daemons are asked to tell the kernel about each allowed remote call (to avoid arbitrary code execution), along with full prototypes of the associated functions (to prevent stack-based attacks and to manage pointer-based function arguments more easily). Again, language constructs may be used to make this “broadcasting” process as simple as writing “broadcast” next to a function prototype.

A difference with usual procedure calls, though, is that in order to prevent deadlocks, remote calls only exist in a non-blocking form. The calling process just sends a request, optionally providing a “callback” function that is invoked with the results when the remote call completes, and then it may continue its work. Callbacks are nothing but remote calls in the end, so processes must declare their “callback” calls to the kernel in the same way as described above, for the same reasons. It is still possible to emulate blocking behaviour in this model to some extent when there is really nothing to do until a task completes (thread T1 does something, makes remote calls and dies just after; thread T2 is spawned by the callback function and continues the work), but developers are asked to carefully consider the implications of such behaviour whenever they choose to implement it, in order to avoid deadlocks (see the comment thread of this post).

Remote call processing may be done in two ways. The first option is to spawn a new thread in the daemon as soon as the remote call is made, in order to handle the task; this is called the “threaded” behaviour. The second option is to spawn an inactive thread (without creating its stack right away, to save RAM) and put it in a queue. The daemon then runs tasks from the queue sequentially: once a task is done, the next thread is picked from the queue and run. This is called the “asynchronous” behaviour. As a rule of thumb, the threaded behaviour should be embraced for relatively independent operations with a significant amount of processing and little synchronization, as it scales better across multiple cores in such situations, while the asynchronous behaviour should be preferred for remote calls which do nothing but I/O, or which cannot be parallelized efficiently due to heavy synchronization, as it simplifies programming and may in some cases even perform better.

Optional extensions may be added to the remote call mechanism in order to achieve more complex behaviours when they are needed. Currently envisioned extensions are:

  • Timeouts – If a call has not completed after a certain amount of time, the remote call is considered failed. The remote thread is destroyed, the daemon is asked to perform a full clean-up of any internal state that the faulty thread may have modified, and the calling process is informed of the failure, either through a special “failure” callback or through its usual result callback with special parameters indicating failure. Although it may greatly improve the system’s fault resistance, this mechanism is optional because:
    • It requires daemon developers to write a “clean-up” function that may be very cumbersome to create, and it requires state-saving functionality that may greatly hurt performance.
    • In some situations, estimating the amount of time required to perform a given task, even coarsely, may be very difficult. This is particularly the case for processing tasks based on a highly variable amount of data that cannot be known in advance.
  • Cooldown – In some situations, a remote call may be performed very frequently, and it may not matter whether each individual call is actually performed. A typical use case is a “notification” call emitted each time some data is modified, resulting in a user interface update, with no information about which modifications have been made included in the call parameters. In that case, to reduce the amount of processing and avoid accidental or deliberate DoSing of daemons, a “cooldown” delay may be used. It works as follows:
    • If a second call is made before the cooldown delay since the first has elapsed, it is put on hold, and all subsequent calls are ignored.
    • Once the cooldown delay has elapsed, if a call has been put on hold, it is run.
  • Out-of-order execution – For asynchronous operation, the RPC mechanism normally ensures that tasks are performed in the order in which they come in (simultaneous requests being serialized arbitrarily by a mutex), and the daemon has no access to its pending task queue. This enforces a more intuitive task processing behaviour. However, in some situations (a typical use case being an HDD driver), the order in which RPC calls are processed may not actually matter, and processing tasks out of order may bring an appreciable performance gain. In that case, the daemon may enable out-of-order execution for one of its remote calls. This results in a system-wide visible flag for all processes that may make the call, and gives the daemon access to its pending task queue along with the ability to re-order it.
    (Note: the daemon may also leave this access unused and implement its own out-of-order execution mechanism, whenever that is better suited to the problem.)
  • Security checks – This OS’ philosophy is that, as far as reasonably possible, software should only have access to the system capabilities it needs, so checking whether a process holds the right security tokens to do something is a task that will be performed frequently. RPC calls sound like the right place for such checks, because they will be the preferred way to access system services, so it seems natural to add security token checks to the RPC system. It would work as follows: when broadcasting function prototypes, daemons may restrict which kinds of processes have access to the broadcast functionality by asking the kernel to check for the presence of specific security tokens when a remote call is made. Besides this “binary” token functionality (a process either has or doesn’t have the token), some situations will require more fine-grained token checks. As an example, filesystem access tokens may specify that a process only has access to some regions of the filesystem. In that case, daemons may ask the kernel to pass the token as an additional RPC call parameter for further examination.

More extensions may be added as development continues or with future releases of the operating system.

And that’s it. This post is the last viability test of my RPC-based model. If no major issues are found through peer review at this point, I will begin implementation.


4 thoughts on “The OS-periment’s service model : release candidate”

  1. Hadrien May 29, 2011 / 2:57 pm

    Note that this OS is made for local operation only, not for distributed computing, and that I am using nonblocking calls. With this in mind, here are my answers to the PDF’s criticism.

    “Who is the Server and Who is the Client?” => Indeed, for this example an RPC model doesn’t seem fit. Classical pipe-based stream processing would work better. But we’re far away from the usage scenario of a system API…

    “Unexpected Messages” => Solved through nonblocking calls and callbacks. If the client has an “error callback” set up, the server may inform it this way. Cf the “Timeout” extension example above.

    “Single Threaded Servers” => This is a bit UNIX-specific again, but I’ll try to work with it. In a nonblocking architecture like ours, a major difference is that if a client reads from an empty pipe, an exception is triggered. The proper way to wait for I/O in this OS is to request a callback to be sent when data arrives. Let’s see if it solves this problem…
    -Client 1 has file server set up a callback for incoming data events in a pipe (this is quick, no I/O wait involved).
    -Client 2 asks server to read data.
    -Server reads data, notifies Client 2 via callback when it arrives.
    -Hours later, data for Client 1 arrives.
    -Server sends callback to Client 1.
    Obviously, Client 1 waiting for data has not penalized Client 2, yet this works with a single-threaded async design.

    “The Two Army Problem” => Distributed system problem, not an issue on a local OS.

    “Multicast” => Solved through nonblocking RPC, which allows sending lots of remote calls without waiting for replies. The “file server wanting to tell all the processes holding part of a modified file to purge their caches” example is typically handled by sending callbacks to all the processes, without waiting for them to acknowledge the callback or even needing them to.

    “Parameter Marshalling” => A variable number of arguments is a stack-smashing attack waiting to happen, so it’s not allowed. Variable-sized arrays, tuples, etc. will be dynamically allocated, and can as such be managed through shared memory + proper pointer handling (shared memory is not an issue on a local OS, though it would indeed become one if I wanted things to go distributed).

    “Parameter Passing” => Shared memory is the trick again. It is normal that Tanenbaum does not mention it, as it is not suitable for distributed operating systems. But I am writing a local OS, again…

    “Global Variables” => Daemon code has access to the global variables of the daemon. Tanenbaum’s idea sounds like taking client code and shoehorning it into the server without even trying to recompile it (which would fail due to the absence of global variables). I can’t see why this should even be supposed to work.

    “Timing Problems” => Not a problem on a local OS. Again, this shows me that this OS would perform badly as a distributed one, but that’s really not its goal…

    “Exception handling” => I agree with Tanenbaum that the timeout option could be the right one. His considerations about things which can fail remotely but not locally are not a problem here, because code is made to run locally and won’t run remotely.

    “Repeated Execution Semantics” => The suggested timeout mechanism includes a server cleanup procedure. This makes coding servers harder, but solves the problem.

    “Loss of State” => Now this is not about a specific RPC call failing but about the whole daemon failing and losing its state, and I think this problem is certainly not RPC-specific. Anyhow, this is a harder problem to solve, but it can be done, at the cost of some performance, by keeping an up-to-date copy of the server’s state in a safe location and updating it each time an RPC completes successfully. This, in combination with the “thread state cleanup” mechanism, should make it possible to restore server state except for the faulty threads.

    “Orphans” => Dealt with the ugly way: the server performs the computation until it is done, then notices that the client is no longer there to receive the callback and simply drops it. Another option is to have a server cleanup procedure, again, as it allows cleanly killing any thread linked to the faulty client.

    “HETEROGENEOUS MACHINES” => This whole section can be skipped for a local OS.

    “Lack of Parallelism” => Addressed through non-blocking calls.

    “Lack of Streaming” => Streaming can be achieved if the RPC call is of the “void DoOperationAndWriteInThePipe(Pipe& dest);” kind, with an “incoming data” callback set on the pipe.

    “Bad Assumptions” => Again, here I’m talking about code that is written to be run remotely, not client code that is shoehorned into running on a server process.

  2. Alfman June 3, 2011 / 5:33 am


    That’s an interesting paper, but it’s showing its age. I’m surprised at how many assumptions they made which aren’t true today.
    Like: Did they really not have non-blocking RPC in those days?

    If that were a recent publication I’d probably try to rebut its claims.
