
On RPC and its integration in programming languages

June 17, 2013

My current investigation of the Ada programming language has led me to an interesting discovery: there is such a thing as programming languages with integrated InterProcess Communication (IPC) functionality. In the case of Ada, that functionality is based on a combination of Remote Procedure Calls (RPC) and distributed objects, and is designed for network transparency. Apparently, Ada’s designers also reached the conclusion that message-based communication is the Assembly of IPC, in that using it is error-prone, frustrating, and time-consuming. This discovery led me to think a bit about what I’m trying to achieve with RPC, and whether I’m doing it the way I should.

The point of IPC in general

There are multiple situations in which independent software processes must communicate with each other, such as…

  • …when tasks must be spread out across multiple computers due to hardware specialization (e.g. a factory control program directing the work of multiple autonomous embedded systems)
  • …when a task, implemented by a given process, gets input from and/or sends output to an unrelated process (e.g. any modern userspace program communicating with the system’s I/O facilities)
  • …when the functionality of a large software system is broken up across multiple smaller processes, so as to shrink the purpose and liability of each process, increasing security and reducing crash impact (e.g. traditional UNIX programming, microkernel operating systems)

The family of technologies enabling controlled communication between processes in such settings is called “InterProcess Communication”, or IPC for short. Since they partially break process isolation, implementing them on the local scale requires some assistance from the operating system. The OS can, however, only provide a set of simple low-level communication primitives, and leave higher-level communication protocols to user-mode libraries.

The minimal functionality required for IPC is a way for processes to exchange data packets. Such functionality can be provided through a simple “pipe” mechanism allowing applications to exchange streams of bytes, or through more advanced “message passing” mechanisms like D-Bus which allow data to be handled in a more structured way. Since constantly polling input buffers for new data is inefficient, such data links are generally coupled with a signalling mechanism allowing the OS to notify processes when they receive data. With these two mechanisms, one has already covered pretty much the whole POSIX IPC API, setting aside some optimizations for local IPC based on the existence of shared hardware between processes.
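To make the distinction concrete, here is a minimal Python sketch (not tied to any particular OS API) of how a structured message layer can be built on top of a raw byte-stream pipe, using a length prefix as the framing convention:

```python
import os

# A byte-stream pipe carries no message boundaries, so a framing convention
# (here, a 4-byte big-endian length prefix) must be layered on top of it.
# A robust version would loop until all requested bytes have arrived.
def send_message(fd: int, payload: bytes) -> None:
    os.write(fd, len(payload).to_bytes(4, "big") + payload)

def recv_message(fd: int) -> bytes:
    length = int.from_bytes(os.read(fd, 4), "big")
    return os.read(fd, length)

r, w = os.pipe()
send_message(w, b"hello")
send_message(w, b"world")
assert recv_message(r) == b"hello"   # boundaries recovered from the stream
assert recv_message(r) == b"world"
```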

When RPC comes into play

Not everything is cleanly expressed in terms of data transfers between processes, however. Returning to the examples above, the factory control program doesn’t really want to transmit data to the embedded systems, it wants to make them perform some action. The same goes for user-space programs calling system software, which not only want to send data to the system’s I/O facilities, but also seek to have them perform some task with that data (such as printing text on a console). Message passing mechanisms alone cannot express such “call this function of that process” imperative intents, called Remote Procedure Calls or RPC, which essentially leaves two possible options:

  • Implement a protocol for RPC across a message-passing link, and leave it up to the communicating processes to make sure that the data being sent and received is actually valid
  • Directly integrate mechanisms for RPC into the operating system’s IPC primitives, alongside or in place of traditional message passing
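The first option can be sketched as follows in Python. The function names and the JSON encoding are illustrative choices, not a prescribed protocol; the point is that call validity is checked by the communicating processes themselves, not by the OS:

```python
import json

# An RPC convention layered over a message-passing link: a call is encoded as
# a message naming the target function, and the receiving side is responsible
# for validating it before dispatching.
def encode_call(func: str, args: list) -> bytes:
    return json.dumps({"func": func, "args": args}).encode()

def dispatch(message: bytes, handlers: dict) -> bytes:
    request = json.loads(message)
    if request["func"] not in handlers:    # validity checked in user space
        return json.dumps({"error": "unknown function"}).encode()
    result = handlers[request["func"]](*request["args"])
    return json.dumps({"result": result}).encode()

handlers = {"add": lambda a, b: a + b}
reply = dispatch(encode_call("add", [2, 3]), handlers)
assert json.loads(reply) == {"result": 5}
```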

The trade-off here is one of implementation simplicity, for either the operating system or user-space programs. RPC is a relatively high-level IPC primitive, and supporting it at the OS level requires a more complex kernel, which in turn can be a source of security flaws or reliability issues. However, if there is no standard OS solution for RPC, user-space programs have to implement it in an ad-hoc fashion, which means either dealing with uncanny “generic” library constructs, lots of computer-generated wrapper code, or tons of duplicated effort as every communicating process wraps its own RPC mechanism in its own way.

If, on the other hand, programming languages themselves supported standard semantics for RPC over message passing channels, then this trade-off might be avoided, at least for single-language projects. Which leads me to write a bit about Ada…

Language-provided RPC constructs: the Ada 95 way

When the 1995 revision of the Ada programming language standard was devised, one of its design goals was to ease the development of distributed systems. More precisely, a goal was to be able to start from a non-distributed program, make minor changes to its modules and adjust compiler configuration so as to separate it into isolated “partitions” (similar in spirit to OS processes), and then transparently deploy those partitions on multiple machines, without losing core local development assets such as language constructs or type safety in the process.

For this scenario to work, code from a given Ada package had to be able to call functions of another Ada package, even when said packages are implemented in two different processes running on two different machines. Consequently, the Ada designers needed to support a form of RPC, and more specifically one able to work not only on a local machine but also over a network. Since networked machines in a distributed system may be linked together in a wide variety of ways, the chosen RPC solution also had to be link-agnostic. Tasks such as locating the various components of a distributed system, establishing connections between them, and keeping these connections alive should therefore not be part of the final standard, but rather be left to third-party library implementations.

The end result of this effort is Annex E “Distributed Systems” of the Ada95 reference manual. People interested in the design process behind it, which I quickly alluded to above, can also refer to the appropriate section of the Ada95 rationale. Effectively, what this annex does is define a set of compiler pragmas specifying the public interface to each partition of the distributed system, so that the compiler may automatically generate RPC stubs and other communication primitives, and raise errors when unauthorized communication is attempted between partitions. It also specifies an interface to the aforementioned link layer, called the “Partition Communication Subsystem”, which the generated RPC stubs call to perform the actual cross-partition communication, in a message-passing fashion. The actual link layer is implementation-dependent, a commonly used implementation being AdaCore et al.’s PolyORB.
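As a loose analogy in Python (the class and method names are mine, not Ada’s), the key idea is that generated stubs talk only to an abstract communication subsystem, so the actual link layer can be swapped out without touching application code:

```python
from abc import ABC, abstractmethod

# Rough analogue of the "Partition Communication Subsystem" idea: stubs only
# ever see this abstract interface, never the concrete link layer.
class CommunicationSubsystem(ABC):
    @abstractmethod
    def send(self, partition: str, message: bytes) -> bytes: ...

class LoopbackSubsystem(CommunicationSubsystem):
    """Trivial link layer: delivers messages to handlers registered in-process."""
    def __init__(self):
        self.partitions = {}
    def register(self, name, handler):
        self.partitions[name] = handler
    def send(self, partition, message):
        return self.partitions[partition](message)

pcs = LoopbackSubsystem()
pcs.register("server", lambda msg: msg.upper())  # stand-in for an RPC dispatcher
assert pcs.send("server", b"ping") == b"PING"
```

Swapping `LoopbackSubsystem` for a socket-backed implementation would change nothing on the caller’s side, which is the property the Ada design is after.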

All in all, I think what the Ada language designers achieved here is pretty neat. They managed to abstract away much of the complexity of distributed system development, and to cleanly separate communication concerns (which are up to the Partition Communication Subsystem) from system design concerns (which are handled through the set of compiler pragmas specifying the remote interface to each partition). They also tackled more advanced issues along the way, such as dispatching calls (when an RPC function pointer is dereferenced or an abstract object method is invoked) and easy partition layout reconfiguration (packages may be reassigned from one partition to another without recompilation).

As designed, their work is admittedly specific to the realm of distributed systems, in that the code for both endpoints of an RPC call must be available at compile time. But there is no theoretical reason why such a system could not be extended into a full RPC implementation that is also capable of dynamic binding to one or several foreign targets. In my opinion, this showcases the power of language-integrated RPC in single-language software projects. To put this in perspective, I will now discuss how language-agnostic RPC mechanisms work…

CORBA and other language-agnostic RPC mechanisms

Let’s first state the obvious: programming languages can be very different from each other. There is, for example, little in common between Assembly and Python, even though both languages can be translated to machine code and executed on a computer. This raises the question of how a language-agnostic RPC system should be designed. As a first step, since different languages have different feature sets, one has to wonder how to handle features which are present in one programming language, but missing in another. From this issue, two extreme design approaches emerge:

  • A “minimalist” design, which tries to follow a subset of all languages’ feature set so that no wrappers are required
  • An “all-inclusive” design, which tries to follow a superset of all languages’ feature set and produces wrappers for missing features in each language

Of course, neither of these options is very practical in the real world. A truly universal minimalist design would be extremely limiting, since it would force the extremely restricted feature set of very low-level languages like Assembly upon developers of higher-level languages. Conversely, a truly all-inclusive RPC design would completely eliminate the hassle of adapting to other languages’ constructs, but at the cost of a design so complex that it could not effectively be implemented.

Consequently, most language-agnostic RPC designs end up somewhere in the middle: they define a “sufficiently complete” abstract feature set for the RPC interface, provide wrappers for popular languages which do not implement this full feature set, and ignore less popular languages altogether. Said feature set typically includes synchronous and/or asynchronous function calls, a type-safe parameter marshalling mechanism, and some object-orientation glitter such as a type system allowing for dispatching function calls.

After that comes the issue of type incompatibilities between languages. As a trivial example, array and string implementations vary tremendously from one language to another, such that the associated data structures cannot be safely passed from one language to another, and have to be converted. To simplify this conversion, language-agnostic RPC libraries generally define their own type system, along with a mechanism for converting data to and from it. Needless to say, this double conversion to a “language-agnostic” type is horribly inefficient, making these RPC mechanisms useless for any kind of high-performance task, even though single-language RPC on a local machine could theoretically work with extremely few data conversions.
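A small Python illustration of that double conversion, using JSON as a stand-in for the language-agnostic type system: a packed native array is widened into a generic interchange form and rebuilt on the other side, paying two conversions that same-language local RPC could skip entirely:

```python
import array
import json

# A packed native structure (C-style doubles)...
native = array.array("d", [1.0, 2.0, 3.0])

# ...is widened into the "language-agnostic" interchange form for the wire...
wire = json.dumps(list(native))

# ...and rebuilt into a native structure on the receiving side.
received = array.array("d", json.loads(wire))
assert received == native   # same values, but two full conversions were paid
```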

Reconsidering the TOSP approach to RPC

So far, I had not planned to address the issue of communication between programs written in different languages in TOSP’s RPC system. I provided an abstraction for type-safe single-language function calls across process boundaries, which allowed for a minimal level of compatibility between service version N and service version N+1, and that was about it. Looking back at it now, I feel there is a need to go back to the drawing board and work more on this RPC abstraction.

What, exactly, do I want to achieve with RPC? Here are the main criteria which I see for RPC between client applications and system services and between multiple parts of a large client application:

  • Programs which are written in the same programming language can transparently call functions of each other, with very good performance (no data conversion, as few syscalls as possible), across process boundaries
  • Programs which are written in different supported programming languages can also communicate in a relatively seamless way, although reduced performance and some type conversion hurdles are acceptable
  • I’m only interested in local RPC operation and want to take advantage of all the optimizations that come with it
  • It should be easy to write RPC bindings for other languages, so that users of obscure languages don’t come yelling at me for favoring the small bunch of languages which I am comfortable with
  • If a language already has an existing framework for RPC or communication with functions written in other language, there should be a way to leverage this functionality instead of reinventing the wheel

Asynchronous calls are very important, but can potentially be built using synchronous calls that return instantly and behave in a standard way (for errors, completion monitoring, results…).
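This idea can be sketched in Python: a call that returns instantly, handing back a handle object that standardizes completion monitoring, results, and errors. The names are illustrative, not a committed TOSP design:

```python
import threading

# A handle standardizes completion monitoring, result retrieval, and error
# propagation for a call that returned immediately.
class Handle:
    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None
    def wait(self):
        self.done.wait()             # completion monitoring
        if self.error is not None:
            raise self.error         # standard error behaviour
        return self.result           # standard result retrieval

def call_async(func, *args):
    handle = Handle()
    def run():
        try:
            handle.result = func(*args)
        except Exception as exc:
            handle.error = exc
        finally:
            handle.done.set()
    threading.Thread(target=run).start()
    return handle                    # "synchronous call that returns instantly"

h = call_async(lambda a, b: a + b, 2, 3)
assert h.wait() == 5
```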

With this in mind, here are some thoughts on how I should tweak the existing TOSP RPC design, or dramatically change it altogether:

  • Provide message passing IPC primitives at the kernel level, so as to leverage the built-in message-based RPC primitives of certain languages and RPC libraries
  • Decide on a list of programming languages that will be officially supported by TOSP, for which RPC primitives are to be provided
  • Precisely define the desired feature set of a TOSP RPC primitive for every supported language, discuss the pros and cons of having it as a kernel primitive at all
  • Build wrappers for communication between languages which are not designed to communicate with each other. These libraries are to be used on the receiving end of an RPC call, i.e. the sender shouldn’t have to care which programming language the receiver is written in unless it explicitly wants to.
  • Ponder using Ada for the bulk of the system codebase, as it provides better interfacing with other languages than C/C++ and has built-in RPC primitives (among other niceties)

From → OS development

18 Comments
  1. Brendan permalink

    One of the things we’ve learnt about distributed systems is that it’s a bad idea for communication to be implicit; as this makes it excessively difficult for programmers to reason about performance trade-offs. For explicit communication, the programmer knows which things are fast (e.g. function calls) and which things are slow/high latency (e.g. involving network latency) and can design software for the differences; and for implicit communication the programmer’s hands are tied (there’s little they can do to mitigate communication overheads). The other problem with implicit communication is sane error handling (e.g. someone unplugging a network cable at the wrong time) – it takes special care to handle this correctly; and this “special care” can’t exist for implicit communication.

    The other thing we’ve learnt about distributed systems is that synchronous IPC (including RPC, which really is “synchronous IPC pretending to be function/procedure calls”) is bad. It’s excessively difficult to get good concurrency from it without over-complicating things (e.g. without creating a local thread to perform the synchronous IPC and dealing with the problem/s of threads and locking).

    Also note that in theory you can use synchronous IPC to emulate asynchronous IPC; but in practice it’s a huge disaster. It means that transport layers (the sending and receiving kernels, the networking stack, etc) have to be aware of the expected behaviour. For example; if a synchronous message from one machine to another expects a reply or not, for each possible type of message; such that the transport layer/s can know how to handle things like delivery failures and if a “failed to deliver” response is required (synchronous) or not (asynchronous). The reverse (using asynchronous IPC to emulate synchronous IPC) is also a bad idea, for different reasons (it completely destroys all the advantages of using asynchronous IPC).

    In my opinion the “least imperfect” solution is the actor model (see http://en.wikipedia.org/wiki/Actor_model ). More specifically: a collection of “actors” (threads, processes) communicating via asynchronous messaging, with explicit “send_message()” and “get_message()” for asynchronous messaging with no constraints on message delivery order. This leads to a different programming style; where each “actor” ends up being a message handling loop and state machine. This makes it much easier to develop software, because all the problems of concurrency disappear (e.g. there’s no need for any locking, no locality issues, etc). In addition; it’s very easy to do things like unit testing (e.g. test actors in isolation), or to provide redundancy (e.g. multiple actors doing identical work in case one fails; including comparing the output of 3 or more identical actors to detect if/when one is providing wrong results), or to find bottlenecks (find the actor/thread that is consuming the most CPU time). It can also be easily added to any language – it’s just a few extra send/receive functions.

    The only real downside to this is that software needs to be designed for it – it’s not like RPC where you can retro-fit it to existing “single-task” design (e.g. by hiding it in libraries). Of course this can also be seen as an advantage (e.g. forcing software developers to make software that doesn’t suck for the distributed case).

    – Brendan

  2. Alfman permalink

    Brendan,

    “One of the things we’ve learnt about distributed systems is that it’s a bad idea for communication to be implicit”

    Do you have a specific example in mind to help clarify what you mean by implicit communication?

    “The other thing we’ve learnt about distributed systems is that synchronous IPC (including RPC, which really is “synchronous IPC pretending to be function/procedure calls”) is bad. It’s excessively difficult to get good concurrency from it without over-complicating things”

    Well, there’s no reason to say it’s “pretending to be function/procedure calls”, since some languages make RPC identical to local functions. The consumer shouldn’t really care whether a function is local or not, and this is generally good for RPC. I do agree with the point about concurrency; it sucks having to spawn unwanted threads just to work around blocking calls, but again this would be true whether the blocking call is a local one or RPC. For example, the vast majority of system library calls (i.e. gethostbyname) are unfortunately already synchronous. Our desire to call these functions asynchronously would exist regardless of whether they were implemented in the local process or involved IPC to another specialized process.

    .net has some good ideas with how it incorporated SOAP web services. Every SOAP service gets 2 versions of the functions so that all SOAP functions can be called via either synchronous or asynchronous mechanisms, giving the caller the flexibility to choose either one.

    “In my opinion the “least imperfect” solution is the actor mode”

    I’m a big fan of the asynchronous actor model too. I don’t know exactly why, but it’s much more prevalent in the Windows world, and it’s been the norm in languages like Visual Basic since its inception. On the unix side, we see a lot more code that was built with blocking calls instead.

  3. One of the things we’ve learnt about distributed systems is that it’s a bad idea for communication to be implicit; as this makes it excessively difficult for programmers to reason about performance trade-offs. For explicit communication, the programmer knows which things are fast (e.g. function calls) and which things are slow/high latency (e.g. involving network latency) and can design software for the differences; and for implicit communication the programmer’s hands are tied (there’s little they can do to mitigate communication overheads). The other problem with implicit communication is sane error handling (e.g. someone unplugging a network cable at the wrong time) – it takes special care to handle this correctly; and this “special care” can’t exist for implicit communication.

    First of all, please note that my goal here is not to build a networked distributed system, but to use the RPC mechanism to enable more seamless IPC on the local scale. A core target is communication between user applications and system services, with better communication between user applications as a natural side-effect.

    It seems to me that this situation is quite different from the networked scenario regarding those key points which you mention:

    • As for IPC latency, you will have it whether you use RPC or a custom library-based interface, since sandboxing system services involves putting them in a separate process anyway. In this scenario, RPC can actually make it easier to tell which call is remote and which is local, since remote calls now have standard semantics, unlike with custom libraries
    • Communication error handling can easily be done by having the RPC wrapper throw an exception, as in Ada. In languages like C which do not support exceptions, they can be emulated through the specification of an “error handler” procedure at call time.
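The second bullet can be sketched in Python (all names here are illustrative, not a real TOSP or Ada API): instead of catching an exception, the caller supplies a procedure to be invoked on communication failure.

```python
# Emulating exception-style error reporting with an error handler procedure
# supplied at call time, for callers in exception-less languages.
def remote_call(func, args, on_error):
    try:
        return func(*args)        # stand-in for the actual RPC transport
    except ConnectionError as exc:
        on_error(exc)             # error handler specified at call time
        return None

def failing_transport():
    raise ConnectionError("link down")

errors = []
assert remote_call(failing_transport, (), errors.append) is None
assert isinstance(errors[0], ConnectionError)
```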

    The other thing we’ve learnt about distributed systems is that synchronous IPC (including RPC, which really is “synchronous IPC pretending to be function/procedure calls”) is bad. It’s excessively difficult to get good concurrency from it without over-complicating things (e.g. without creating a local thread to perform the synchronous IPC and dealing with the problem/s of threads and locking).

    Also note that in theory you can use synchronous IPC to emulate asynchronous IPC; but in practice it’s a huge disaster. It means that transport layers (the sending and receiving kernels, the networking stack, etc) have to be aware of the expected behaviour. For example; if a synchronous message from one machine to another expects a reply or not, for each possible type of message; such that the transport layer/s can know how to handle things like delivery failures and if a “failed to deliver” response is required (synchronous) or not (asynchronous). The reverse (using asynchronous IPC to emulate synchronous IPC) is also a bad idea, for different reasons (it completely destroys all the advantages of using asynchronous IPC).

    On the local scale, synchronous RPC should cause blocking of a fraction of a millisecond; any extra latency would be added by processing within the body of the remote call, and could be alleviated by making it “half-synchronous” (i.e. returning immediately while doing processing in a separate thread).

    For any local IPC application which I can think of, such low latencies would be enough, since the most stringent latency constraints that are typically found on a desktop machine are user-facing latencies which should stay around 10ms. This makes me think that the use of synchronous RPC primitives is only a problem for networked systems, which again I do not target.

    “Pure” asynchronous RPC sure can be done (it has, again, been done in the Ada RPC subsystem), but it has a number of undesirable properties, such as poor communication error locality, which make it undesirable unless one is absolutely forced to use it, in my opinion. So I would prefer not to go in that direction if I can avoid it.

    But if the need does emerge anyway for some specific application, it’s not so hard to start with synchronous calls and add true asynchronous RPC capabilities later for those applications that actually need it.

    In my opinion the “least imperfect” solution is the actor model (see http://en.wikipedia.org/wiki/Actor_model ). More specifically: a collection of “actors” (threads, processes) communicating via asynchronous messaging, with explicit “send_message()” and “get_message()” for asynchronous messaging with no constraints on message delivery order. This leads to a different programming style; where each “actor” ends up being a message handling loop and state machine.

    That such explicit messaging makes development easier is highly debatable. In fact, I’ve recently seen someone call it the Assembly of interprocess communication, in the sense that unless you absolutely need extreme performance, you should stay as far away from it as possible.

    The argument was that as with every low-level programming methodology, explicit messaging will force you to reimplement everything which a higher-level primitive can do for you. While an RPC primitive can easily encompass such important concerns as type safety, exceptions, synchronization, pointer handling, and data serialization and interchange in a standard way, explicit messaging forces one to reimplement most of this by hand, much like an Assembly developer can enjoy none of the useful primitives of a modern programming language and has to reinvent everything from scratch.

    Of course, high-level IPC primitives can themselves be built on top of a message-based communication paradigm like the actor model, by simply having the RPC senders and handlers be designed like actors. This way, you get most of the advantages, without the hurdle of designing and implementing your own communication protocol on top of low-level messaging.

    As an example, Ada’s RPC design is based on an underlying implementation-defined “communication subsystem”, which can transmit streams of ordered data across processes. The compiler then transparently produces stubs which send calls and their parameters across the stream on the “client” side, and an event loop which listens to the stream for incoming RPC calls and runs them on the “server” side.
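The shape of that arrangement can be sketched in Python; the stub and event loop below are illustrative stand-ins for what the Ada compiler generates, with thread-safe queues playing the role of the communication subsystem:

```python
import json
import queue
import threading

# The "communication subsystem": ordered message channels between processes
# (modeled here as in-process queues for brevity).
requests, replies = queue.Queue(), queue.Queue()

def server_loop(handlers):
    """Server-side event loop: listens for incoming RPC calls and runs them."""
    while True:
        msg = requests.get()
        if msg is None:                  # shutdown sentinel
            break
        call = json.loads(msg)
        result = handlers[call["func"]](*call["args"])
        replies.put(json.dumps({"result": result}))

def rpc_stub(func, *args):
    """Client-side stub: looks like a function call, sends a message under the hood."""
    requests.put(json.dumps({"func": func, "args": list(args)}))
    return json.loads(replies.get())["result"]

t = threading.Thread(target=server_loop, args=({"mul": lambda a, b: a * b},))
t.start()
assert rpc_stub("mul", 6, 7) == 42
requests.put(None)
t.join()
```

Both endpoints are structured as actors exchanging messages, yet the caller only ever sees an ordinary function call.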

    In my opinion, direct use of message-based IPC is best left to situations where all a program does is take raw data as input, and always perform the same kind of processing on it before sending the result as output. People who have worked with LabVIEW can attest that such dataflow-oriented programming is not very suitable for general-purpose software development, however, and is best left to niches such as hardware emulation.

    This makes it much easier to develop software, because all the problems of concurrency disappear (e.g. there’s no need for any locking, no locality issues, etc). In addition; it’s very easy to do things like unit testing (e.g. test actors in isolation), or to provide redundancy (e.g. multiple actors doing identical work in case one fails; including comparing the output of 3 or more identical actors to detect if/when one is providing wrong results), or to find bottlenecks (find the actor/thread that is consuming the most CPU time). It can also be easily added to any language – it’s just a few extra send/receive functions.

    It’s just as easy to do unit testing with well-designed RPC as it is within a single program, and easier in this context than when explicitly implementing the actor model, since you don’t have a communication layer to develop and test.

    You are right that in an RPC setting, redundancy must be explicitly supported by the RPC subsystem rather than being available to any developer who wants it. However, I’d also argue that this can be a strength, since a single implementation of redundancy checks can be shared across thousands of programs, unlike with explicit messaging, where everyone has to do their own thing.

    Regarding bottleneck lookup, I’d need an example of how you would do this with explicit messaging before I can tell whether this is truly an advantage of this communication method over higher-level IPC primitives.

    The only real downside to this is that software needs to be designed for it – it’s not like RPC where you can retro-fit it to existing “single-task” design (e.g. by hiding it in libraries). Of course this can also be seen as an advantage (e.g. forcing software developers to make software that doesn’t suck for the distributed case).

    So in effect, you want every software developer to suffer the hurdles of HPC networked application development, even when all they want to write is an image editor or a word processor?

    Sounds like overkill to me, in the context of TOSP, when considering that I can’t see a use case for networked distributed systems that matches this project’s goals.

  4. Alfman permalink

    Hadrien,

    “Pure” asynchronous RPC sure can be done (it has, again, been done in the RPC Ada subsystem), but it has a number of undesirable properties, such as poor communication error locality, which would make it undesirable unless one is absolutely forced to use it, in my opinion. So I would prefer not to go in that direction if I can avoid to.

    In general, aren’t exceptions normally thrown at the returning async function call? Is this different from how Ada does it?
    hnd_a = function_a_begin(x,y,z); // async request
    hnd_b = function_b_begin(x,y,z); // async request

    try {
        a = function_a_end(hnd_a);
        b = function_b_end(hnd_b);
    } catch(...) {}

    The two functions are called in parallel, and the “end” functions deterministically throw errors at their respective positions in the code.
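The begin/end pattern above maps directly onto futures; as a runnable Python equivalent (function names mirror the pseudocode, not any real API), “begin” submits the work and returns a handle, and “end” blocks on it and re-raises any error at that exact point in the code:

```python
from concurrent.futures import ThreadPoolExecutor

def function_a(x, y, z): return x + y + z
def function_b(x, y, z): return x * y * z

with ThreadPoolExecutor() as pool:
    hnd_a = pool.submit(function_a, 1, 2, 3)   # "begin": async request
    hnd_b = pool.submit(function_b, 1, 2, 3)   # "begin": async request
    try:
        a = hnd_a.result()   # "end": exceptions surface here, deterministically
        b = hnd_b.result()
    except Exception:
        pass

assert (a, b) == (6, 6)
```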

    Also, C# introduced a powerful async construct allowing to write async code as though it were synchronous. This feature is surely going to make you wish more languages had it.

    http://www.interact-sw.co.uk/iangblog/2010/11/01/csharp5-async-exceptions

    But if the need does emerge anyway for some specific application, it’s not so hard to start with synchronous calls and add true asychronous RPC capabilities later for those applications that actually need it.

    I’m not clear exactly what you’re saying above. In your post you wrote:

    Asynchronous calls are very important, but can potentially be built using synchronous calls that return instantly and behave in a standard way (for errors, completion monitoring, results…).

    Are you merely reiterating that you can always implement asynchronous functions in the future using the existing synchronous calls? Or are you suggesting that you might add asynchronous calling in the future? The reason I ask is because I think there is a distinction to be made. In the first case, the service provider would be responsible for explicitly creating an asynchronous version of the function. In the second case the caller would be able to choose to call any function synchronously or asynchronously without any explicit support from the service provider.

  5. In general, aren’t exceptions normally thrown at the returning async function call? Is this different from how Ada does it?
    hnd_a = function_a_begin(x,y,z); // async request
    hnd_b = function_b_begin(x,y,z); // async request

    try {
        a = function_a_end(hnd_a);
        b = function_b_end(hnd_b);
    } catch(...) {}

    The two functions are called in parallel, and the “end” functions deterministically throw errors at their respective positions in the code.

    Essentially, Ada provides no way for asynchronous calls to return results or throw exceptions back to the caller. Ada’s asynchronous procedures are a pure fire-and-forget mechanism: give a procedure some work to do, and forget about it forever. If you want some kind of handle-based way to manage asynchronous results, as described above, it has to be implemented explicitly on the client or server side.

    Also, C# introduced a powerful async construct allowing to write async code as though it were synchronous. This feature is surely going to make you wish more languages had it.

    http://www.interact-sw.co.uk/iangblog/2010/11/01/csharp5-async-exceptions

    Indeed, this is quite a beautiful, clean, and concise way to express the complexities of asynchronous programming.

    It does leave me wondering, however, about the programmer’s control over when callbacks are executed. Does the programmer explicitly start an event loop in some thread, or are callbacks liable to interrupt the code which started an asynchronous wait at any time, without warning?

    The latter option (which is, if memory serves me well, how C# delegates work) has the potential to cause serious priority inversion problems if the caller code has a performance-constrained region that must run as quickly as possible without interruption. In this scenario, firing up asynchronous calls, doing the heavy processing, and only then handling incoming events sounds like the most sensible option, but excessive abstraction can prevent one from doing things this way.

    But if the need does emerge anyway for some specific application, it’s not so hard to start with synchronous calls and add true asynchronous RPC capabilities later for those applications that actually need it.

    I’m not clear exactly what you’re saying above. In your post you wrote:

    Asynchronous calls are very important, but can potentially be built using synchronous calls that return instantly and behave in a standard way (for errors, completion monitoring, results…).

    Are you merely reiterating that you can always implement asynchronous functions in the future using the existing synchronous calls? Or are you suggesting that you might add asynchronous calling in the future? The reason I ask is that I think there is a distinction to be made. In the first case, the service provider would be responsible for explicitly creating an asynchronous version of the function. In the second case, the caller would be able to choose to call any function synchronously or asynchronously without any explicit support from the service provider.

    My point is to make a distinction between asynchronous processing (RPC call synchronously waits for the server to generate and return an asynchronous handle, then returns it) and asynchronous calling (RPC call generates a handle on client side, sends instructions to the server, and returns the handle without waiting for server acknowledgement).

    Both ways of doing things can be expressed with similar syntactic sugar and automatically generated code stubs, since from the client’s point of view, you always have an RPC call that returns a handle, and differences in where the handle is generated and whether the server acknowledged the call or not are largely irrelevant.

    However, asynchronous processing improves error locality at the cost of extra call latency, because by the time a handle is returned, the RPC subsystem already knows whether the server has actually received the call and started processing it. Consequently, the RPC subsystem does not lie to the client by successfully returning from an RPC call that actually never got anywhere.

    Thus, I would argue that asynchronous processing is more desirable than asynchronous calling in local scenarios, where call latency is negligible. Conversely, when call latency becomes very high as in networked systems, Brendan is right that the performance vs correctness tradeoff will probably reverse in favor of asynchronous calling.
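    To make the distinction above concrete, here is a minimal C sketch of the two strategies. All names (call_async_processing, call_async_calling) and the server-state flag are invented for illustration; they are not part of any real RPC subsystem.

```c
/* Hypothetical sketch of the two handle-return strategies discussed above.
   All names are made up for illustration purposes. */

typedef int rpc_handle;

static int server_alive = 1;       /* stand-in for real server state */
static rpc_handle next_handle = 1;

/* Asynchronous processing: block until the server acknowledges the call,
   then return a server-generated handle. Dispatch errors surface here. */
rpc_handle call_async_processing(const char *fn, int *error)
{
    (void)fn;                      /* marshalling elided in this sketch */
    if (!server_alive) {           /* no server ack: error at call time */
        *error = 1;
        return 0;
    }
    *error = 0;
    return next_handle++;          /* handle generated on the server side */
}

/* Asynchronous calling: generate the handle locally and return at once.
   A dead server is only discovered later, when results are fetched. */
rpc_handle call_async_calling(const char *fn)
{
    (void)fn;
    return next_handle++;          /* no round trip, no dispatch error */
}
```

    The only behavioral difference visible to the client is where a dead server surfaces: as an error at call time in the first variant, or deferred until result retrieval in the second.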

  6. Alfman permalink

    Hadrien,

    My point is to make a distinction between asynchronous processing (RPC call synchronously waits for the server to generate and return an asynchronous handle, then returns it) and asynchronous calling (RPC call generates a handle on client side, sends instructions to the server, and returns the handle without waiting for server acknowledgement).

    Both ways of doing things can be expressed with similar syntactic sugar and automatically generated code stubs, since from the client’s point of view, you always have an RPC call that returns a handle, and differences in where the handle is generated and whether the server acknowledged the call or not are largely irrelevant.

    I think I understand.

    To me it makes the most sense to implement the async RPC “hand off” in the kernel rather than in the client or server. If you allow the client to specify the handle, it may become necessary for the kernel to use a less optimal handle indexing mechanism than if it returned its own handles. Having the kernel return its own handles can also eliminate possible errors in the client (duplicate handles, etc.).

    However, asynchronous processing improves error locality at the cost of extra call latency, because by the time a handle is returned, the RPC subsystem already knows whether the server has actually received the call and started processing it. Consequently, the RPC subsystem does not lie to the client by successfully returning from an RPC call that actually never got anywhere.

    In some sense ALL function calls are synchronous, and *arguably* the normal async approach with “_begin” and “_end” is really two synchronous calls. However, I’m not sure there is any benefit to throwing callee-originated exceptions at the point of invocation. This implies it needs to return two statuses instead of one: one after invocation, and another after completion. Doesn’t this create unnecessary extra work for RPC users, since they already have to handle errors in the completion call anyway?

    I realize that you don’t care about network RPC for yourself, but it worries me that your RPC implementation would be semantically incompatible with other RPC systems like SOAP with regards to where exceptions get thrown. In theory, someone using your OS might want to replace a built-in OS RPC service with one that performs a SOAP RPC call. This developer is at an impasse: either perform the SOAP RPC synchronously in order to throw exceptions at the right place in your model, or perform the SOAP RPC asynchronously and hope that the client doesn’t break when exceptions are thrown at the wrong place, at the completion call.

  7. My point is to make a distinction between asynchronous processing (RPC call synchronously waits for the server to generate and return an asynchronous handle, then returns it) and asynchronous calling (RPC call generates a handle on client side, sends instructions to the server, and returns the handle without waiting for server acknowledgement).

    Both ways of doing things can be expressed with similar syntactic sugar and automatically generated code stubs, since from the client’s point of view, you always have an RPC call that returns a handle, and differences in where the handle is generated and whether the server acknowledged the call or not are largely irrelevant.

    I think I understand.

    To me it makes the most sense to implement the async RPC “hand off” in the kernel rather than in the client or server. If you allow the client to specify the handle, it may become necessary for the kernel to use a less optimal handle indexing mechanism than if it returned its own handles. Having the kernel return its own handles can also eliminate possible errors in the client (duplicate handles, etc.).

    That is true: if RPC is implemented at the kernel level, then the notion of extra client-side and server-side call processing makes little sense. The kernel should handle things like async handles on its own.

    However, after seeing proof, through Ada’s distributed systems annex and its implementations, that a clean and seamless RPC abstraction can be implemented using only ordered message passing as an underlying IPC primitive, I’m starting to question the relevance of implementing RPC in the kernel at all.

    The question which I’m currently asking myself is, would there truly be any major drawback to only implementing message-passing IPC in the kernel (which will be required anyway for efficient dataflow programming) and building RPC primitives through a combination of libraries and programming language extensions on top of that basic communication layer?

    This does not mean extra complexity for application developers, as the handle generation mechanism would still be offloaded to the library-based RPC implementation. However, in this case, the question of where RPC-related processing is to be done has to be asked.

    However, asynchronous processing improves error locality at the cost of extra call latency, because by the time a handle is returned, the RPC subsystem already knows whether the server has actually received the call and started processing it. Consequently, the RPC subsystem does not lie to the client by successfully returning from an RPC call that actually never got anywhere.

    In some sense ALL function calls are synchronous, and *arguably* the normal async approach with “_begin” and “_end” is really two synchronous calls. However, I’m not sure there is any benefit to throwing callee-originated exceptions at the point of invocation. This implies it needs to return two statuses instead of one: one after invocation, and another after completion. Doesn’t this create unnecessary extra work for RPC users, since they already have to handle errors in the completion call anyway?

    This is not truly a callee-originated exception, since the callee has no reason to throw an exception right when the asynchronous RPC call has been successfully dispatched. Here, I’m mostly interested in the question of what should happen if the client invokes an asynchronous call, but the function can never actually be executed on the server side because the server process is dead or otherwise unable to process calls.

    In this case, I question the usefulness of having the client continue to do its own thing, do some extra processing and invoke more RPC calls, only to find out when trying to fetch the failed call’s results that it was never started. If this information were made available right away, the client would be able to fail cleanly or engage in some kind of fallback procedure without performing (and subsequently cancelling!) extra steps before doing so.

    I believe this is worth dealing with two exception handlers. But I do admit that the aforementioned issue will be raised again in the case where a failure occurs during server-side processing, and that in the case of asynchronous calls, it is unavoidable to have to handle this class of failures at return time. So this approach of handling errors as soon as possible still has its limits.

    I realize that you don’t care about network RPC for yourself, but it worries me that your RPC implementation would be semantically incompatible with other RPC systems like SOAP with regards to where exceptions get thrown. In theory, someone using your OS might want to replace a built-in OS RPC service with one that performs a SOAP RPC call. This developer is at an impasse: either perform the SOAP RPC synchronously in order to throw exceptions at the right place in your model, or perform the SOAP RPC asynchronously and hope that the client doesn’t break when exceptions are thrown at the wrong place, at the completion call.

    My understanding of SOAP is that it is a link-agnostic protocol, which assumes a perfectly reliable underlying messaging layer and does nothing to handle communication errors. Consequently, if the underlying network connection fails for some reason (annoying firewall, unplugged wire…), a caller could still have to deal with exceptions occurring at call time. But I have little experience with SOAP implementations, so if you know more, feel free to correct me on this front.

  8. Alfman permalink

    Hadrien,

    The question which I’m currently asking myself is, would there truly be any major drawback to only implementing message-passing IPC in the kernel (which will be required anyway for efficient dataflow programming) and building RPC primitives through a combination of libraries and programming language extensions on top of that basic communication layer?

    Should the kernel export its own services via RPC?

    I’m actually having a somewhat difficult time making a meaningful distinction at all. All an RPC function is, fundamentally speaking, is a bunch of parameters plus a targeted event notification (and then something similar in reverse). Regardless of whether we visualize it as “message passing” or “async kernel RPC”, it’s almost always going to take the form of passing a bunch of parameters on the heap or stack, along with an “RPC” or “IPC” syscall telling the kernel to pass them along to the target. What distinction is there to be made between these two abstractions when we boil them down to their essence? Can they be two sides of the same coin?

    This does not mean extra complexity for application developers, as the handle generation mechanism would still be offloaded to the library-based RPC implementation. However, in this case, the question of where RPC-related processing is to be done has to be asked.

    In my experience with async code, it’s much easier to work with void* callback parameters rather than int handles, because in practice you’ll eventually need to convert the handle to a specific object somehow. When you get an inbound async event notification with a void*, you can go *directly* to the data structure that you passed in when you invoked the async request, without having to look it up from the int handle.

    Conceptual example:

    struct SESSION {
        int handle;
        const char *ip;
        const char *name;
    };
    struct SESSION session[10000];

    // Spawn off 10k async tcp connections
    for(int i = 0; i < 10000; i++) {
        session[i].handle = newasynctcpsocket(…, &session[i]); // pass in the structure's own address to be returned on completion
        session[i].ip = strdup(…);
        session[i].name = strdup(…); // irrelevant
    }

    while( int handle = geteventhandle() ) {
        SESSION *s = NULL;
        // the need to look up our own structure in the client process sucks
        for(int i = 0; i < 10000; i++) {
            if(session[i].handle == handle) { s = &session[i]; break; }
        }
        printf("Event on %s\n", s->name);
    }

    // Alternative approach using void* callback parameter
    while( SESSION *s = geteventptr() ) {
        // handle event
        printf("Event on %s\n", s->name);
    }

    The thing is, this approach can only work if the caller trusts that the void* callback parameter hasn’t been changed, but in order for it to be trustworthy, it has to be the kernel rather than the remote service that is responsible for handling it. I hope I made this clear enough, because I know it’s a very subtle point.

    In this case, I question the usefulness of having the client continue to do its own thing, do some extra processing and invoke more RPC calls, only to find out when trying to fetch the failed call’s results that it has never been started. If this information was made available right away, the client would be able to fail cleanly or engage into some kind of fallback procedure without performing (and subsequently cancelling !) extra steps before doing so.

    On the flip side, I question the usefulness of optimizing code paths which are clearly exceptional in nature, at the expense of briefly blocking all normal code paths :)

    My understanding of SOAP is that it is a link-agnostic protocol, which assumes a perfectly reliable underlying messaging layer and does nothing to handle communication errors. Consequently, if the underlying network connection fails for some reason (annoying firewall, unplugged wire…), a caller could still have to deal with exceptions occurring at call time. But I have little experience with SOAP implementations, so if you know more, feel free to correct me on this front.

    The reason to use async to begin with is to avoid blocking. Wouldn’t this be impossible in the network case if we needed transport errors to bubble up to the point of invocation instead of the point of return?

  9. The question which I’m currently asking myself is, would there truly be any major drawback to only implementing message-passing IPC in the kernel (which will be required anyway for efficient dataflow programming) and building RPC primitives through a combination of libraries and programming language extensions on top of that basic communication layer.

    Should the kernel export its own services via RPC?

    Elegant design would call for it, but that is possible whether RPC is implemented in the kernel or in userspace libraries working on top of a kernel-level message passing layer.

    I’m actually having a somewhat difficult time making a meaningful distinction at all. All an RPC function is, fundamentally speaking, is a bunch of parameters plus a targeted event notification (and then something similar in reverse). Regardless of whether we visualize it as “message passing” or “async kernel RPC”, it’s almost always going to take the form of passing a bunch of parameters on the heap or stack, along with an “RPC” or “IPC” syscall telling the kernel to pass them along to the target. What distinction is there to be made between these two abstractions when we boil them down to their essence? Can they be two sides of the same coin?

    Well, in essence, if RPC is implemented over a dumb message-passing primitive, then a data packet comprising the call signature, the serialized parameters, and metadata such as parameter types has to be built on the client side and passed along to the server side. If the kernel is RPC-aware, however, some of these steps can be avoided when it is safe to do so based on the call context. On the flip side, a more complex kernel is required.

    Finally, note that it is possible to start with a simple message-passing-based implementation, then move to kernel-mode RPC if performance considerations call for it.
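    As a rough illustration of the client-side marshalling step described above, the stub might pack the call signature, a call identifier and the serialized parameters into a flat packet that a dumb message-passing primitive can carry. The layout and names below are entirely hypothetical:

```c
/* Hypothetical sketch of the RPC-over-message-passing marshalling step:
   the client-side stub flattens a call into one message for the kernel's
   message-passing syscall to deliver. Layout and names are invented. */
#include <stdint.h>
#include <string.h>

typedef struct {
    char     signature[32];   /* e.g. "add(int,int)" */
    uint32_t call_id;         /* identifies the pending async call */
    uint32_t param_bytes;     /* size of the serialized parameters */
    uint8_t  params[64];      /* serialized parameter data */
} rpc_message;

/* Build the message the stub would hand to the message-passing layer.
   Returns 0 on success, -1 if the parameters do not fit. */
int marshal_call(rpc_message *msg, const char *sig, uint32_t call_id,
                 const void *params, uint32_t param_bytes)
{
    if (param_bytes > sizeof msg->params)
        return -1;
    strncpy(msg->signature, sig, sizeof msg->signature - 1);
    msg->signature[sizeof msg->signature - 1] = '\0';
    msg->call_id = call_id;
    msg->param_bytes = param_bytes;
    memcpy(msg->params, params, param_bytes);
    return 0;
}
```

    A kernel-level RPC implementation could skip some of this copying and metadata when the call context makes it safe, which is exactly the tradeoff discussed above.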

    In my experience with async code, it’s much easier to work with void* callback parameters rather than int handles, because in practice you’ll eventually need to convert the handle to a specific object somehow. When you get an inbound async event notification with a void*, you can go *directly* to the data structure that you passed in when you invoked the async request, without having to look it up from the int handle.

    Conceptual example:

    (…)

    The thing is, this approach can only work if the caller trusts the void* callback parameter hasn’t been changed, but in order for it to be trustworthy it has to be the kernel rather than the remote service who’s responsible for handling it. I hope I made this clear enough because I know it’s a very subtle point.

    In my view, the example which you propose for handle-based call identification is overly convoluted.

    Instead, a client-side wrapper could just store the void* pointer in an “active call” lookup table, and send the associated table index to the server as a call identifier. When the server returns, the lookup table index would be checked for validity, and if everything is good, the client-side wrapper returns the void* pointer.

    What issue would you see with such an approach, if it happens transparently in the wrapper without the client code needing to know about it? In practice, that’s also probably how I would do it in the kernel, if RPC were implemented there…
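    Such a wrapper-side table could be as simple as the following C sketch, where the names and the fixed table size are illustrative only: register_call() would run when the stub dispatches an asynchronous call, and complete_call() would validate the identifier the server echoes back before handing the stored pointer to the client.

```c
/* Minimal sketch of the "active call" lookup table described above.
   All names are hypothetical; error handling is reduced to NULL/-1. */
#include <stddef.h>

#define MAX_CALLS 64

static void *active_calls[MAX_CALLS];   /* index -> caller's void* context */

/* Dispatch side: store the caller's pointer, return the index that is
   sent to the server as the call identifier (-1 if the table is full). */
int register_call(void *ctx)
{
    for (int i = 0; i < MAX_CALLS; i++) {
        if (active_calls[i] == NULL) {
            active_calls[i] = ctx;
            return i;
        }
    }
    return -1;
}

/* Return side: validate the index the server echoed back, recover the
   original pointer and free the slot. NULL means a bogus identifier. */
void *complete_call(int idx)
{
    if (idx < 0 || idx >= MAX_CALLS || active_calls[idx] == NULL)
        return NULL;
    void *ctx = active_calls[idx];
    active_calls[idx] = NULL;
    return ctx;
}
```

    The validity check on return is what keeps a misbehaving server from steering the client to an arbitrary pointer, which is the trust concern raised earlier in the thread.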

    In this case, I question the usefulness of having the client continue to do its own thing, do some extra processing and invoke more RPC calls, only to find out when trying to fetch the failed call’s results that it was never started. If this information were made available right away, the client would be able to fail cleanly or engage in some kind of fallback procedure without performing (and subsequently cancelling!) extra steps before doing so.

    On the flip side, I question the usefulness of optimizing code paths which are clearly exceptional in nature, at the expense of briefly blocking all normal code paths :)

    My understanding of SOAP is that it is a link-agnostic protocol, which assumes a perfectly reliable underlying messaging layer and does nothing to handle communication errors. Consequently, if the underlying network connection fails for some reason (annoying firewall, unplugged wire…), a caller could still have to deal with exceptions occurring at call time. But I have little experience with SOAP implementations, so if you know more, feel free to correct me on this front.

    The reason to use async to begin with is to avoid blocking. Wouldn’t this be impossible in the network case if we needed transport errors to bubble up to the point of invocation instead of the point of return?

    In both cases, fair enough :)

  10. Alfman permalink

    Hadrien,

    In my view, the example which you propose for handle-based call identification is overly convoluted.

    Actually, this was kind of the point of the example. When you are using a handle, it becomes necessary to “look up” the object(s) associated with that handle on the client side in some kind of structure. What made my example convoluted WAS the lookup. (The rest of the cruft was just there to communicate the context of what this lookup piece was doing.) When using a pointer, no lookup is necessary at all, since it already points to the record you wanted to look up. I was hoping the example would illustrate the difference, but I see that WordPress botched up that code.

    Instead, a client-side wrapper could just store the void* pointer in an “active call” lookup table, and send the associated table index to the server as a call identifier. When the server returns, the lookup table index would be checked for validity, and if everything is good, the client-side wrapper returns the void* pointer.

    If the client were to use an array as a lookup table, that’s a pretty fast lookup. The issue with the array is that we’ve increased the complexity of finding an empty slot: you need an O(n) scan to choose a free slot to use. You’d need yet another structure to make finding a free slot an O(1) operation. You’ll probably need more complexity to make the array resizable. On top of all this, having any structures at all might make it necessary to add mutexes to keep them safe in the multithreaded case (assuming you want the RPC library to be safe in multithreaded code).

    Of course it’s all doable (people have been doing it with select/poll syscalls on Linux forever). However, do you see how the void* callback parameter can eliminate the handle lookup table? Personally, I think it makes async code less convoluted.

  11. Alfman permalink

    Hadrien,
    I think the way I set up the example may have been a bit confusing in the exact context of what you are trying to do. In my example the handle was a property of the session rather than an index into the session table. Maybe you can ignore that aspect of the example since it’s not really relevant to the differences I am trying to point out.

  12. In my view, the example which you propose for handle-based call identification is overly convoluted.

    Actually, this was kind of the point of the example. When you are using a handle, it becomes necessary to “look up” the object(s) associated with that handle on the client side in some kind of structure. What made my example convoluted WAS the lookup. (The rest of the cruft was just there to communicate the context of what this lookup piece was doing.) When using a pointer, no lookup is necessary at all, since it already points to the record you wanted to look up. I was hoping the example would illustrate the difference, but I see that WordPress botched up that code.

    Instead, a client-side wrapper could just store the void* pointer in an “active call” lookup table, and send the associated table index to the server as a call identifier. When the server returns, the lookup table index would be checked for validity, and if everything is good, the client-side wrapper returns the void* pointer.

    If the client were to use an array as a lookup table, that’s a pretty fast lookup. The issue with the array is that we’ve increased the complexity of finding an empty slot: you need an O(n) scan to choose a free slot to use. You’d need yet another structure to make finding a free slot an O(1) operation. You’ll probably need more complexity to make the array resizable. On top of all this, having any structures at all might make it necessary to add mutexes to keep them safe in the multithreaded case (assuming you want the RPC library to be safe in multithreaded code).

    Of course it’s all doable (people have been doing it with select/poll syscalls on Linux forever). However, do you see how the void* callback parameter can eliminate the handle lookup table? Personally, I think it makes async code less convoluted.

    I think I understand your point about the complexity of the lookup procedure on asynchronous returns. However, I think that such a lookup procedure will be needed regardless of where RPC is implemented, be it in a client-side library or in the kernel.

    Let us picture ourselves the asynchronous remote call procedure for a kernel-side RPC implementation.

    1. Client code prepares for RPC and performs a system call to wake up kernel code
    2. Kernel code gets data about the remote call to be performed from the client and checks it
    3. Kernel code starts the remote asynchronous procedure on the server side
    4. Once the server is done, it performs another system call for the asynchronous return
    5. Kernel code fetches back data about the asynchronous call from which the server claims to return
    6. Kernel code sends the returned results and other optional data (such as the call metadata you propose to add) to the client
    7. When client code asks for the returned value, it gets the data from where the kernel put it at the previous step

    It is my understanding that in this scheme, step 5 will necessarily require some kind of lookup table structure. Even if the kernel tried to bypass the need for it by keeping the call context alive in a blocked kernel thread (ignoring for the moment all the issues with that blocking approach), that thread would still be blocked waiting for an event to be triggered on the server side. And since there would be as many possible events as there are pending asynchronous calls, you would still need a lookup table on the event management subsystem’s side.

    Now, perhaps the scheme which I propose for kernel-side RPC is not general enough, and there is a way to get away without a lookup table by using a completely different algorithm. But I can’t think of a suitable alternative. Would you have one in mind?

  13. Alfman permalink

    Hadrien,

    It is my understanding that in this scheme, step 5 will necessarily require some kind of lookup table structure. Even if the kernel tried to bypass the need for it by keeping the call context alive in a blocked kernel thread (ignoring for the moment all the issues with that blocking approach), that thread would still be blocked waiting for an event to be triggered on the server side.

    Ah, let me step back a bit. The IPC mechanism used by the kernel would still use handles to identify the descriptors in kernel syscalls. While in theory a kernel pointer could be used, that just wouldn’t be secure at all. So the kernel will always have to look up the process handles in order to perform the IPC between processes. But userspace processes don’t have to do any lookups.

    So, long story short, the idea behind the callback pointers is to avoid having lookup tables in userspace in addition to the one in the kernel. The userspace callback pointers would actually be stored in the kernel’s handle table. When an async notification is triggered, it would be passed into the userspace process along with the rest of the message. Does all this make sense?

    Now returning to the RPC over IPC discussion that started all this…recall:

    The question which I’m currently asking myself is, would there truly be any major drawback to only implementing message-passing IPC in the kernel (which will be required anyway for efficient dataflow programming) and building RPC primitives through a combination of libraries and programming language extensions on top of that basic communication layer?

    The RPC protocol will need to identify async requests between the client and server. You suggested using handles, which implies userspace RPC lookup tables. I suggested that callback pointers could avoid the overhead of handle lookups, however pointers are not “safe” from manipulation in the remote process. Therefore, I assert that a drawback of building userspace RPC over IPC is the inability to safely use callback pointers in order to eliminate RPC lookup tables. In retrospect it may not have been worth bringing any of this up; I don’t want to bog you down with premature optimization.

    Here are some examples showing async code that uses callback pointers and no userspace tables/lists anywhere. The server is a threaded/async hybrid and the client is async. They were just written quickly to test some things and don’t do much, but maybe you are interested in seeing the pointer technique at work. They also make use of epoll’s event batching to reduce syscalls.

    http://geedmedia.com/neolander/

    It is my understanding that in this scheme, step 5 will necessarily require some kind of lookup table structure. Even if the kernel tried to bypass the need for it by keeping the call context alive in a blocked kernel thread (ignoring for the moment all the issues with that blocking approach), that thread would still be blocked waiting for an event to be triggered on the server side.

    Ah, let me step back a bit. The IPC mechanism used by the kernel would still use handles to identify the descriptors in kernel syscalls. While in theory a kernel pointer could be used, that just wouldn’t be secure at all. So the kernel will always have to look up the process handles in order to perform the IPC between processes. But userspace processes don’t have to do any lookups.

    So, long story short, the idea behind the callback pointers is to avoid having lookup tables in userspace in addition to the one in the kernel. The userspace callback pointers would actually be stored in the kernel’s handle table. When an async notification is triggered, it would be passed into the userspace process along with the rest of the message. Does all this make sense?

    If I understand correctly, the issue which you’re raising is that in the kernel-side implementation case, one lookup has to be performed, whereas in the client-side implementation case, two lookups have to be performed. The underlying assumption is that some lookup work would then be performed twice, which is inefficient. However, I am not sure that this is actually true.

    In the kernel-side scenario, when the server returns, a lookup is performed to locate the pending async RPC call. This could be done in one step if there were a large global list of pending RPC calls, but since maintaining large lists of wildly varying length is difficult, it will perhaps prove a better idea to do it in two steps: first looking up a client process or client-server RPC connection within a primary lookup table, then looking up the pending asynchronous call within that secondary lookup table. These are two different ways of doing the same thing, and their CPU/memory overhead should be similar.

    In the client-side scenario, the kernel is not aware of the concept of separate asynchronous calls circulating across a single message-passing connection. All it will do, when receiving an asynchronous return message, is look up the associated message buffer on the client side and send the message there. This is equivalent to the first lookup of the two-step lookup process described above.

    Then, on the client side, the RPC packet will be decoded, and an identifier will be extracted, which can be used to look up the specific pending asynchronous call that is being referred to. This is equivalent to the second step of the two-step lookup process described above.

    So if I’m not mistaken, the same lookup work is being performed in both cases; it is only spread differently between kernel code and client code. Would you agree with that conclusion?

  15. Alfman permalink

    Hadrien,

    If I understand correctly, the issue which you’re raising is that in the kernel-side implementation case, one lookup has to be performed, whereas in the client-side implementation case, two lookups have to be performed. The underlying assumption is that some lookup work would then be performed twice, which is inefficient. However, I am not sure that this is actually true.

    Look at the client and server closely and observe that they don’t need any lookup structures to facilitate the async events. In fact, to add new clients one can simply allocate them on the heap and then tell the kernel their memory pointer, and that’s it: no lookup tables at all. The theory behind my async library was that everything that could possibly act on an object would do so asynchronously through pointers (rather than handles), including such things as timers, sockets, threads, etc. I think the Linux kernel mechanism carries legacy baggage and could be improved upon, but I hope it convinces you of the feasibility of the pointer method.

    > In the kernel-side scenario, when the server returns, a lookup is performed to locate a pending async RPC call. This could be done in one step if there is a large global list of pending RPC calls, but since maintaining large lists of wildly varying length is difficult, it will perhaps prove to be a better idea to do it in two steps, by first looking up a client process or client-server RPC connection within a primary lookup table, then looking up the pending asynchronous call within that secondary lookup table. These are two different ways of doing the same thing, and their CPU/memory overhead should be similar.

    To me the problem with big arrays is that their size may cause the active state to grow bigger than the CPU caches can hold, which would cause system-wide performance degradation. In your case I don’t foresee concurrent RPC tables becoming too big, simply because you’re talking about fixed local processes rather than scalable network ones. A further scalability issue might arise if the table needs to be protected with a mutex, creating a serial chokepoint.

    > Then, on the client side, the RPC packet will be decoded, and an identifier will be extracted, which can be used to look up the specific pending asynchronous call that is being referred to. This is equivalent to the second step of the two-step lookup process described above.

    > So unless I have misunderstood, the same lookup work is being performed in both cases; it is only spread differently between kernel code and client code. Would you agree with that conclusion?

    I understand what you are saying, but don’t agree. Consider that if the RPC packet contained a pointer, the process could use that to pull up the pending async RPC records instead of looking up ids in a structure. The shortcoming being that it requires the pointer to be trustworthy. IMHO the pointer method is actually easier to implement once you get used to it since it avoids the need for managing additional structures. I’d like you to understand the pointer approach, even if you end up not choosing it.

  16. > Look at the client and server closely and observe that they don’t need any lookup structures to facilitate the async events. In fact, to add new clients one can simply allocate them on the heap and then tell the kernel their memory pointer, and that’s it: no lookup tables at all. The theory behind my async library was that everything that could possibly act on an object would do so asynchronously through pointers (rather than handles), including such things as timers, sockets, threads, etc. I think the Linux kernel mechanism carries legacy baggage and could be improved upon, but I hope it convinces you of the feasibility of the pointer method.

    > (…)

    > I understand what you are saying, but don’t agree. Consider that if the RPC packet contained a pointer, the process could use that to pull up the pending async RPC records instead of looking up ids in a structure. The shortcoming being that it requires the pointer to be trustworthy. IMHO the pointer method is actually easier to implement once you get used to it since it avoids the need for managing additional structures. I’d like you to understand the pointer approach, even if you end up not choosing it.

    Make no mistake, I certainly see the merit in having library or kernel functionality that can associate a pointer with a remote function/procedure call. The only thing I question here is whether such functionality can be implemented without the assistance of an underlying lookup table mechanism, be it based on a centralized kernel table or a number of smaller user-mode tables.

    In the specific code which you propose, the place where I would look for such a hidden lookup table is the epoll and sockets implementation. When the client code sends data to a socket, the kernel obviously cannot trust the client to provide it with a pointer to the associated kernel structures, so it has to consult some kind of lookup table to check whether the “handle” provided by the client is valid and, if so, which data structures are associated with it. Only then may it fetch back the data_ptr provided by the server and send the whole thing to the server for server-side event handling.

    > To me the problem with big arrays is that their size may cause the active state to grow bigger than the CPU caches can hold, which would cause system-wide performance degradation. In your case I don’t foresee concurrent RPC tables becoming too big, simply because you’re talking about fixed local processes rather than scalable network ones. A further scalability issue might arise if the table needs to be protected with a mutex, creating a serial chokepoint.

    Well, surely the table could be read concurrently, and synchronization would only be needed when writes are to be performed, as the writer thread would need to wait for read completion and make sure that no other writer or reader is active during table modification. Actually, when implementing a memory manager in kernel code, I ended up in a similar situation and wondered whether there was a standard synchronization primitive appropriate for that kind of purpose, considering how widespread the concurrent-reads/serial-writes scenario is.

  17. Alfman permalink

    Hadrien,

    > Make no mistake, I certainly see the merit in having library or kernel functionality that can associate a pointer with a remote function/procedure call. The only thing I question here is whether such functionality can be implemented without the assistance of an underlying lookup table mechanism, be it based on a centralized kernel table or a number of smaller user-mode tables.

    I thought I had already agreed to this earlier, but in any case I agree now. The kernel needs to have its own handle table, in the name of security, to implement IPC.

    Your earlier description, as I read it, seems to be describing the need for another user-space lookup table to implement RPC on top of IPC, but I assert it’s possible to do without a userspace lookup table.

    I can think of two variations:

    Variant A. One to one IPC per RPC
    Step 1. Initialize IPC (either on demand as needed, or keep in pool)
    – open IPC handles via syscall
    – initialize userspace RPC structure
    – RPC structure contains at minimum IPC handle plus RPC function callback
    – syscall to setup IPC handle for async callbacks and associate with RPC structure address

    Step 2. Invocation
    – userspace invokes RPC request, passing in RPC function callback.
    – get RPC structure from linked queue (or generate on demand)
    – build RPC packet
    – save function callback in RPC structure.
    – write RPC packet to kernel via IPC
    – return asynchronously

    Step 3. Return.
    – userspace receives IPC notification along with callback pointer.
    – userspace casts the pointer into an RPC structure.
    – retrieve RPC packet from kernel, extract return information.
    – take the RPC function callback from the RPC structure
    – call RPC function callback, passing return value and other context useful to the caller.
    – free/return RPC structure to pool

    Pros:
    – no userspace lookups
    – secure: the RPC server is not responsible for returning pointers to the caller

    Cons:
    – need to have as many IPC handles as the expected RPC concurrency
    – need a pool to avoid rapidly opening/closing IPC handles

    Variant B. Many RPC calls per IPC
    Step 1. Setup IPC
    – Open one IPC to the RPC server

    Step 2. Invocation
    – userspace invokes RPC request, passing in RPC function callback.
    – build RPC packet, saving RPC function callback to packet
    – write RPC packet to kernel via IPC
    – return asynchronously

    Step 3. Return.
    – userspace receives IPC notification
    – retrieve RPC packet from kernel, extract return information and RPC function callback
    – call RPC function callback, passing return value and other context useful to the caller.

    Pros:
    – no userspace lookups
    – no limit on the number of concurrent RPC calls per single IPC handle

    Cons:
    – the RPC function callback can be manipulated by the RPC server

    In both cases, I acknowledge that the kernel will need to have its own lookups for dealing with the RPC processes’ handles.

    > Well, surely the table could be read concurrently, and synchronization would only be needed when writes are to be performed, as the writer thread would need to wait for read completion and make sure that no other writer or reader is active during table modification.

    If you’re willing to forgo portability, CPU atomics often enable very clever synchronization without any mutex at all. However, in this case I don’t see a way to avoid a mutex, due to the fact that we need to block all accessors while the table is being resized.

Trackbacks & Pingbacks

  1. TOSP Quarterly, issue 2 | The OS|periment
