While last time’s post was supposed to stand alone, a bit more thought over the past few days made me realize that it would need at least one little brother, because its implementation part was insufficiently developed, in the sense that one could not easily derive code from it without a great deal of interpretation. In this “part 2”, I will begin to introduce the kernel components which manage each resource that was introduced last time. The point is to give relatively precise explanations of what they will do in the final design, and from that define the interface of the final code, from the point of view of kernel code (the system call interface that exposes this functionality to user-mode processes will probably be done last).
Process manager (class name : ProcessManager, instance name : process_manager)
This kernel component is the core of the process abstraction. It associates all the characteristics of a process (file name, access permissions…) with a unique identifier, the Process IDentifier or PID. It also keeps track of and communicates with all components of process isolation (in and out of kernel) each time a process is to be created, updated, or destroyed. I currently consider calling the entities that manage a part of process isolation “insulators”, because they isolate processes from their environment, and this terminology will be used in the remainder of this document.
As the number of processes that will be managed is not predictable, the process manager directly depends on dynamic memory allocation to work.
It would have the following interface to start with :
- As part of its initialization, the process manager runs the init_process() methods of the memory management components, so that their process-related functionality can be initialized.
- bool init_rpc(RPCManager&) : This function initializes the RPC-based functionality of the process manager, essentially communication with user-mode insulators. It is called by the RPC manager after it is initialized, see below for more details.
- PID add_process(ProcessDescriptor) : This function is typically used when a program is loaded, in order to create an empty process to store it. It is provided with a description of the process that is to be created (file name, isolation properties…). It adds a small entry about that process in the process manager’s internal database, then contacts all insulators and notifies them about the creation of the process so that they update their own internals. Once everything is initialized, this function returns a unique unsigned integer PID, which may then be used to refer to the process among all system services. There is also a special return value, PID_INVALID, which is used when an error has occurred.
- void remove_process(PID) : This function is used to destroy a process. It is provided with the process’ PID, and destroys everything about this PID in the databases of both the process manager and registered insulators.
- PID update_process(PID, PID) : Live updating is a feature that allows software, especially system software, to be updated without reboots or service disruptions. The new version of the process is first spawned and initialized in an invisible fashion, then at some point the new process takes over the work of the old process. This is when this function is run. It makes the new and old processes swap PIDs, and transfers some resources of the old process (such as a subset of its RPC connections) to the new process, so that all upcoming work is given to the new process instead of the old one. It also politely asks the old process to terminate its current tasks and close, with a timeout in case said process hangs. The first parameter is the PID of the old process, the second parameter is the PID of the new process, and the result is the altered PID of the old process (which is the former PID of the new process) if successful, PID_INVALID otherwise.
- PID find_pid(KString) : This function looks up a specific file name in the process database and returns the first matching PID, or PID_INVALID if there is no match. Keeping around multiple running copies of a given process can still be done by usual PID-based organization, but filename lookup is the preferred means of finding a system service. Processes can be made invisible to filename lookup using a flag in their ProcessDescriptor.
- InsulatorID add_insulator(InsulatorDescriptor) : This function adds an insulator to the insulator database. The provided descriptor should provide enough information to contact the responsible services each time a process is created, destroyed, or updated. It should also give the process manager enough information to discriminate what is handled by which insulator each time a new process is created, and to send that information to the appropriate services. The kernel should also be prepared for insulators of a given name to be initialized multiple times as part of live updating. What is returned is a unique integer identifier, which represents the insulator in the database, called the Insulator IDentifier or InsulatorID. A special return value, INSULATORID_INVALID, is used if an error has occurred.
- void remove_insulator(InsulatorID) : This function removes an insulator from the insulator database. It is provided with the insulator’s ID and removes everything about this InsulatorID in the process manager’s database.
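To make the interface above a bit more tangible, here is a minimal C++ sketch of how the process manager could look as an in-memory toy model. It is only an illustration under several assumptions: std::string stands in for KString, the error values are encoded as 0, insulators are reached through a hypothetical Insulator base class instead of RPC, and the "visible" flag, the "handler" pointer and the CountingInsulator example are invented names. init_rpc(), resource transfer during live updating and the shutdown timeout are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>

using PID = std::uint32_t;
using InsulatorID = std::uint32_t;
constexpr PID PID_INVALID = 0;                  // assumed encoding of the error value
constexpr InsulatorID INSULATORID_INVALID = 0;  // assumed encoding of the error value

struct ProcessDescriptor {
    std::string file_name;  // std::string stands in for KString
    bool visible;           // hypothetical flag: false hides the process from find_pid()
};

// Hypothetical in-kernel form of the standard insulator interface.
struct Insulator {
    virtual ~Insulator() = default;
    virtual PID add_process(PID pid, const ProcessDescriptor& desc) = 0;
    virtual void remove_process(PID pid) = 0;
    virtual PID update_process(PID old_pid, PID new_pid) = 0;
};

struct InsulatorDescriptor {
    std::string name;
    Insulator* handler;  // user-mode insulators would be reached through RPC instead
};

// Trivial example insulator: it only remembers which PIDs it was told about.
struct CountingInsulator : Insulator {
    std::set<PID> known;
    PID add_process(PID pid, const ProcessDescriptor&) override { known.insert(pid); return pid; }
    void remove_process(PID pid) override { known.erase(pid); }
    PID update_process(PID old_pid, PID new_pid) override {
        return (known.count(old_pid) && known.count(new_pid)) ? new_pid : PID_INVALID;
    }
};

class ProcessManager {
public:
    PID add_process(const ProcessDescriptor& desc) {
        const PID pid = next_pid_++;
        assert(pid != PID_INVALID);  // the toy PID counter must not wrap around
        for (auto it = insulators_.begin(); it != insulators_.end(); ++it) {
            if (it->second.handler->add_process(pid, desc) == PID_INVALID) {
                // Roll back the insulators that had already accepted the process.
                for (auto done = insulators_.begin(); done != it; ++done)
                    done->second.handler->remove_process(pid);
                return PID_INVALID;
            }
        }
        processes_[pid] = desc;
        return pid;
    }

    void remove_process(PID pid) {
        if (processes_.erase(pid) == 0) return;  // unknown PID: nothing to do
        for (auto& entry : insulators_) entry.second.handler->remove_process(pid);
    }

    // Swaps the database entries of both processes; resource transfer and the
    // polite shutdown of the old process are left out of this sketch.
    PID update_process(PID old_pid, PID new_pid) {
        auto old_it = processes_.find(old_pid);
        auto new_it = processes_.find(new_pid);
        if (old_pid == new_pid || old_it == processes_.end() || new_it == processes_.end())
            return PID_INVALID;
        for (auto& entry : insulators_)
            if (entry.second.handler->update_process(old_pid, new_pid) == PID_INVALID)
                return PID_INVALID;  // no rollback here, a real design would need one
        std::swap(old_it->second, new_it->second);
        return new_pid;  // the old process now lives under the new process' former PID
    }

    // Returns the lowest matching PID, since std::map iterates in key order.
    PID find_pid(const std::string& file_name) const {
        for (const auto& entry : processes_)
            if (entry.second.visible && entry.second.file_name == file_name) return entry.first;
        return PID_INVALID;
    }

    InsulatorID add_insulator(const InsulatorDescriptor& desc) {
        if (desc.handler == nullptr) return INSULATORID_INVALID;
        const InsulatorID id = next_insulator_id_++;
        insulators_[id] = desc;
        return id;
    }

    void remove_insulator(InsulatorID id) { insulators_.erase(id); }

private:
    std::map<PID, ProcessDescriptor> processes_;
    std::map<InsulatorID, InsulatorDescriptor> insulators_;
    PID next_pid_ = 1;
    InsulatorID next_insulator_id_ = 1;
};
```

In this toy model, registering a CountingInsulator with add_insulator() and then calling add_process() twice leaves two PIDs in the insulator's set, and update_process(old, new) makes a later find_pid() on the new version's file name return the old PID, which is the swap described above.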
RPC manager (class name : RPCManager, instance name : rpc_manager)
This kernel component manages communication between processes, through the controlled abstraction that is Remote Procedure Call, or RPC for short. In this abstraction, a “client” process is able to easily give orders to a “server” process in a server-controlled manner. This implementation is based on two concepts : RPC entry points on the server side, or “services”, and connections to services on the client side, or “connections”. It should be noted that the “client”/“server” terminology does not necessarily represent the role of each process in the operating system, as when the “server” process must return results to the “client” process, the roles are actually reversed.
The RPC manager manages a part of the process abstraction, so it directly depends on the process manager to work. Since the number of RPC services, connections and calls is not known in advance, it also directly depends on dynamic memory allocation. Also, RPC spawns threads within processes, so the scheduler must be initialized too.
The RPC manager would have the following interface :
- As part of its initialization procedure, the RPC manager calls the process manager’s and the scheduler’s init_rpc() methods so that their RPC-based functionality can be activated.
- PID add_process(PID, ProcessProperties) : Every insulator must present a method of this form. It is called by the process manager when a process is created, so that the associated insulator (in this case the RPC manager) may set up management structures for this process that reflect its initial state. It returns the process’ PID if the operation is successful, PID_INVALID otherwise.
- void remove_process(PID) : Again, this is part of the standard interface which every insulator must comply with. It is called when a process is destroyed, and allows the RPC manager to clean up its internal data about this PID.
- PID update_process(PID, PID) : The final part of the standard insulator interface, this function is called when a process undergoes live updating (see above). It takes the PIDs of the old and the new process as its parameters, makes these processes switch PIDs, and transfers any “transferable” RPC connection from the old process to the new one. It returns the new PID of the old process if successful, PID_INVALID otherwise.
- SID add_service(PID, ServiceDescriptor) : This method adds a service to a PID’s entry points. It takes as arguments the server’s PID and a description of the RPC service that is being provided (number and type of parameters, default values, pop-up threads or asynchronous queue…). It returns an integer Service IDentifier (SID) if the operation is successful, and the special value SID_INVALID otherwise.
- void remove_service(PID, SID) : This method removes a service with a specified SID from a process’ entry points, and closes all connections to this service from client processes.
- CID add_connection(PID, ServiceRequest, PID) : This method is used to create an RPC connection between a client process and a server-side entry point. It takes the client PID as its first argument, a “service request” which specifies what RPC service the client is looking for as the second argument, and the server PID as its third argument. The RPC manager then checks that the service request matches some entry point on the server side, does some compatibility magic if the client-side description of the entry point is outdated or too new with respect to the current server-side entry point, and creates a “connection” structure which can from then on quickly be used when client code wants to call server code through RPC. The result of this function is an integer Connection IDentifier (CID) which can be used to refer to the connection in the future, or the special value CID_INVALID if something has gone wrong.
- void remove_connection(PID, CID) : This method destroys an RPC connection. It takes the client’s PID and the connection’s ID as arguments.
- void make_RPC_call(PID, CID, ParametersStack) : This method is used to make an RPC call. If one part of the RPC manager has to receive extreme performance optimization, it is this one. As parameters, this function takes the client’s PID, the connection’s ID, and a structure of type ParametersStack which specifies where the parameters to the remote function are located. On some CPU architectures, it can just be the client’s stack pointer after a stub function call has been made, along with a length measurement if variable numbers of function arguments are allowed. On other architectures, it will be an unholy mess of CPU registers and memory arguments. Anyway, an interesting thing about this function is that it returns nothing. This RPC mechanism is deliberately asynchronous, and any request results would be sent back through RPC itself, with the server calling the client. It is my hope that this will favor the creation of asynchronous APIs, which are more stable and multicore-friendly than synchronous ones, although with a different workflow.
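The service and connection methods listed above can be sketched as a small in-memory C++ model. This is only a sketch under assumptions, not the final design: the fields of ServiceDescriptor and ServiceRequest, the use of a vector of integers as ParametersStack, the 0 encoding of SID_INVALID and CID_INVALID, and the dispatch_pending() helper (which stands in for the scheduler running pop-up threads or an asynchronous queue) are all invented for illustration. The standard insulator methods and the compatibility checks between client and server descriptions are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <vector>

using PID = std::uint32_t;
using SID = std::uint32_t;
using CID = std::uint32_t;
constexpr SID SID_INVALID = 0;  // assumed encoding of the error value
constexpr CID CID_INVALID = 0;  // assumed encoding of the error value

using ParametersStack = std::vector<std::int64_t>;  // stand-in for the real structure

struct ServiceDescriptor {  // hypothetical fields
    std::string name;
    std::size_t param_count;
    std::function<void(const ParametersStack&)> handler;  // server-side entry point
};

struct ServiceRequest {  // hypothetical fields
    std::string name;
    std::size_t param_count;
};

class RPCManager {
public:
    SID add_service(PID server, const ServiceDescriptor& desc) {
        const SID sid = next_sid_++;
        assert(sid != SID_INVALID);  // the toy SID counter must not wrap around
        services_[server][sid] = desc;
        return sid;
    }

    void remove_service(PID server, SID sid) {
        auto proc = services_.find(server);
        if (proc == services_.end() || proc->second.erase(sid) == 0) return;
        // Close every client connection that targeted this service.
        for (auto& client : connections_) {
            for (auto it = client.second.begin(); it != client.second.end();) {
                if (it->second.server == server && it->second.sid == sid) it = client.second.erase(it);
                else ++it;
            }
        }
    }

    CID add_connection(PID client, const ServiceRequest& req, PID server) {
        auto proc = services_.find(server);
        if (proc == services_.end()) return CID_INVALID;
        for (const auto& svc : proc->second) {
            // Exact match only; the "compatibility magic" is not modeled.
            if (svc.second.name == req.name && svc.second.param_count == req.param_count) {
                const CID cid = next_cid_++;
                connections_[client][cid] = Connection{server, svc.first};
                return cid;
            }
        }
        return CID_INVALID;
    }

    void remove_connection(PID client, CID cid) {
        auto proc = connections_.find(client);
        if (proc != connections_.end()) proc->second.erase(cid);
    }

    // Returns nothing: the call is only queued, the server code runs later.
    void make_RPC_call(PID client, CID cid, const ParametersStack& params) {
        auto proc = connections_.find(client);
        if (proc == connections_.end()) return;
        auto conn = proc->second.find(cid);
        if (conn == proc->second.end()) return;
        pending_.push_back(PendingCall{conn->second.server, conn->second.sid, params});
    }

    // Hypothetical helper: runs queued calls and reports how many were delivered.
    std::size_t dispatch_pending() {
        std::size_t delivered = 0;
        while (!pending_.empty()) {
            PendingCall call = pending_.front();
            pending_.pop_front();
            auto proc = services_.find(call.server);
            if (proc == services_.end()) continue;
            auto svc = proc->second.find(call.sid);
            if (svc == proc->second.end() || !svc->second.handler) continue;
            svc->second.handler(call.params);
            ++delivered;
        }
        return delivered;
    }

private:
    struct Connection { PID server; SID sid; };
    struct PendingCall { PID server; SID sid; ParametersStack params; };
    std::map<PID, std::map<SID, ServiceDescriptor>> services_;
    std::map<PID, std::map<CID, Connection>> connections_;
    std::deque<PendingCall> pending_;
    SID next_sid_ = 1;
    CID next_cid_ = 1;
};
```

The point worth noticing in this model is that make_RPC_call() has no way to hand a result back to the caller: the handler only runs once dispatch_pending() is invoked, so a server that wants to answer would have to open its own connection to a service exposed by the client.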
Questions and answers
Here are some random design thoughts which may be of interest to you :
- Why do you not provide a way to create multiple processes/insulators/services/etc… at the same time, which could be nicely optimized ? Because most of the nice optimization that can be introduced in this case, such as pooled memory allocation, can only be applied if it is guaranteed that objects which have been created at the same time will also be destroyed/updated at the same time. If we allow dynamically adding and removing, say, RPC entry points and connections (which is quite useful for interpreted code and asynchronous job completion notifications respectively), then grouping the creation and removal of those becomes much closer to a hidden for loop, while giving users of those functions a false sense of improved performance. This doesn’t seem worthwhile enough to justify adding complexity to the interfaces with plural versions of the add_x, remove_x, update_x, or use_x functions.
- Why is the process’ PID specified in, say, remove_service ? Aren’t SIDs and CIDs unique ? I don’t think it is a bright idea to separate the management of services from the processes which they belong to. First, RPC entry point alterations will come from the server process itself, so the PID will be known at all times and there’s no need for a slow hierarchy-less flat service heap. Second, specifying (PID, SID) instead of just a PID-agnostic SID offers a simple protection against buggy code which tries to alter the wrong SID and ends up messing with another process.
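The (PID, SID) argument from the last answer can be illustrated with a tiny C++ sketch of a two-level service table. It assumes, purely for illustration, that SIDs are numbered per process and that services are described by a plain name; the ServiceTable class, has_service() and the bool return value of remove_service() are hypothetical and only there to make the effect observable.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using PID = std::uint32_t;
using SID = std::uint32_t;

// Two-level table: first the owning process, then its services.
class ServiceTable {
public:
    SID add_service(PID pid, const std::string& name) {
        PerProcess& proc = table_[pid];
        const SID sid = proc.next_sid++;
        const bool inserted = proc.services.emplace(sid, name).second;
        assert(inserted);  // a fresh per-process SID cannot already be in use
        (void)inserted;
        return sid;
    }

    // Only ever looks inside the caller's own process entry.
    bool remove_service(PID pid, SID sid) {
        auto proc = table_.find(pid);
        return proc != table_.end() && proc->second.services.erase(sid) == 1;
    }

    bool has_service(PID pid, SID sid) const {
        auto proc = table_.find(pid);
        return proc != table_.end() && proc->second.services.count(sid) == 1;
    }

private:
    struct PerProcess {
        SID next_sid = 1;
        std::map<SID, std::string> services;
    };
    std::map<PID, PerProcess> table_;
};
```

With this layout, a buggy process that passes a SID it does not own simply gets a failed lookup inside its own entry, and the services of other processes are never touched.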
This is it for the first two kernel components that take care of process isolation, namely the process manager and the RPC manager. I still have to talk about threads and their scheduling, I/O ports, interrupts, a possible redesign of the memory management interface to make it more consistent with the other kernel components, and the way all this kernel stuff will communicate with user processes. But I believe that it is a good idea to publish this now, before this article gets too long and too late for people to read it :) Also, it will allow me to get early feedback on the work that is already done, and the interface conventions that are part of it.