Hello again everyone! While dealing with the Unicode issues described in my previous post, it occurred to me that I probably needed a better module management mechanism than passing structs around or turning them into global variables. In this post, I am going to describe the design of a new kernel component, ModuleManager, whose goal is to do just that.
Clarifying what modules are
Before going further, I should clarify that what I call “modules” here are not modules in the Linux sense, that is, shared libraries implementing kernel functionality. As far as I’m concerned, kernel modules are just files, which are loaded into RAM alongside the kernel by the bootloader because they are required at an OS initialization stage where filesystem and mass storage management are not initialized yet. In fact, even calling them “kernel modules” is kind of misleading, since they probably won’t be restricted to kernel use, but will rather be used to store every executable file and resource that is needed before mass storage is ready.
It is also worth noting that although the isolated files are currently kept separate on the OS storage media and loaded as separate entities by the bootloader, this does not have to be the case. If performance considerations end up dictating it, I am also ready to use some kind of ramdisk to store all modules in one big flat file that is loaded all at once by GRUB, as done with Linux initrds. In that case, extra processing would be needed to extract a single module from the ramdisk, so I would need an abstraction that leaves me such an option.
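To make the "abstraction that leaves me such an option" a bit more concrete, here is a minimal sketch of what it could look like. Everything in it is an assumption made for illustration: the names ModuleSource, SeparateFilesSource and ModuleImage do not exist in the kernel, and the standard library is used for brevity even though a freestanding kernel would rely on its own containers and strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one module image sitting in RAM.
struct ModuleImage {
    const void* location;
    std::size_t size;
};

// Hypothetical interface: the rest of the module manager only talks to this,
// so it does not care how the bootloader delivered the modules.
class ModuleSource {
public:
    virtual ~ModuleSource() = default;
    // Fills `out` and returns true if a module with this name exists.
    virtual bool locate(const std::string& name, ModuleImage& out) const = 0;
};

// Current situation: one bootloader-provided image per file.
class SeparateFilesSource : public ModuleSource {
public:
    void add(std::string name, ModuleImage image) {
        assert(image.location != nullptr);  // a registered image needs a real address
        entries.emplace_back(std::move(name), image);
    }

    bool locate(const std::string& name, ModuleImage& out) const override {
        for (const auto& entry : entries) {
            if (entry.first == name) {
                out = entry.second;
                return true;
            }
        }
        return false;
    }

private:
    std::vector<std::pair<std::string, ModuleImage>> entries;
};
```

A ramdisk-backed source would implement the same locate() method by parsing the flat file, so switching between the two would not affect callers.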
From this, it appears that the number of OS modules is likely to vary significantly over time, and that the kernel alone does not fully determine it. Thus, managing these cleanly would require the use of dynamic memory allocation, and we can consider this to be a first, hardly avoidable dependency of a module manager.
Modules also bear names, which in the general case will be filenames written in Unicode, so ideally we should also wait for Unicode string handling to be ready before starting the module manager. But that cannot be done, since initializing Unicode strings requires, in itself, the assistance of a couple of modules. So there needs to be an ASCII-restricted subset of either the module manager or string handling routines which can be started first, before Unicode support can be initialized shortly thereafter. I prefer to put such a “restricted mode” in the module manager’s code, since that is consistent with the way other kernel components work, whereas strings, on their side, are implemented as a library.
As far as I can tell, nothing else will be needed to manage kernel modules properly.
Tasks to be performed
However it is done on the inside, a module manager should be able to perform two tasks: locate modules in RAM by their filename, and liberate the RAM of modules which won’t be needed anymore. Locating a module means indicating its RAM location and size to the requester if it exists, and failing cleanly if it doesn’t, which should in itself be a simple task.
Liberating modules, however, is trickier as soon as a given file is to be shared by two or more unrelated OS services, because there is no easy way to know if all processes which needed access to a given module have gotten it yet. The most elegant solution I have found so far is that as soon as mass storage management is available, the module manager should start redirecting all module access requests to it, and free up in-RAM images of modules as their remaining users are done with them.
In this scheme, there should also be a way to ensure that the filesystem manager won’t reload modules which have already been loaded into RAM by the bootloader. Ideally, there should be a way for the module manager to feed the mass storage subsystem’s caches with its module images, but such a scenario might also add more complexity to the mass storage management design than it’s worth. At this point, without having worked on filesystem and mass storage matters yet, it’s hard to tell.
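The liberation rule described above can be sketched as a small piece of bookkeeping. This is only an illustration under my own assumptions: BootImageTable, BootImage and their methods are made-up names, modules are identified by a plain index, and "freeing" an image is reduced to flipping a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-module bookkeeping.
struct BootImage {
    bool resident;   // true while the bootloader-provided copy is still in RAM
    unsigned users;  // requests not yet matched by a liberate call
};

class BootImageTable {
public:
    explicit BootImageTable(std::size_t count) : images(count, BootImage{true, 0}) {}

    // Returns false when the caller must be redirected to mass storage instead.
    bool acquire(std::size_t index) {
        if (storage_ready || !images[index].resident) return false;
        ++images[index].users;
        return true;
    }

    void release(std::size_t index) {
        assert(images[index].users > 0);
        --images[index].users;
        discard_if_unused(index);
    }

    // Called once when filesystem and mass storage management are up.
    void storage_became_ready() {
        storage_ready = true;
        for (std::size_t i = 0; i < images.size(); ++i) discard_if_unused(i);
    }

    bool is_resident(std::size_t index) const { return images[index].resident; }

private:
    void discard_if_unused(std::size_t index) {
        // Before mass storage is up, an unused image must stay in RAM,
        // because another service may still ask for it later.
        if (storage_ready && images[index].users == 0) images[index].resident = false;
    }

    std::vector<BootImage> images;
    bool storage_ready = false;
};
```

The point of the sketch is the asymmetry: before mass storage is ready, a user count of zero proves nothing, so images are only discarded once later requests can be served from disk.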
Module management implementation
Like most other kernel components, module management would be implemented in the form of a class, called ModuleManager. An instance of it would be created during kernel initialization, after memory management, and would initialize Unicode string handling as part of its own initialization.
On the inside, said class would maintain a table of per-module descriptors (such as filenames, RAM location, size, number of users and associated handles, filesystem data…), while on the outside it could be controlled using the following set of methods:
- ModuleID request_module(PID, KUTF32String&): This function is used to let a process request access to a module, labeled using its Unicode filename. It returns a unique nonzero integer identifier, which can be used in further interactions with the module manager, or zero if the requested module is nowhere to be found.
- ModuleID request_module_ascii(PID, char*): This is a variant of the previous function, which accepts ASCII filenames. In practice, it will most likely be implemented as a wrapper that works using hand-made Unicode strings, and is to be used during ModuleManager initialization only.
- ModuleDescriptor get_module_descriptor(PID, ModuleID): Given a module identifier, this function provides a description of the module’s properties, such as its RAM location and size. If provided with an invalid module identifier, this function falls back gracefully by providing a module descriptor with a NULL module location, partly so as to ensure that a careless caller crashes promptly without further damage.
- void liberate_module(PID, ModuleID): This function is used to notify the module manager that a given module user doesn’t need it anymore.
- void file_system_ready(<unknown parameters>): Once filesystem and mass storage management are ready, they should call back the module manager, so that it silently switches to a different module management method, which loads missing modules from disk and discards modules which have no remaining users. Before doing the latter, the module manager should also send back its remaining module images to the mass storage management services for caching purposes. Since I have not started work on mass storage yet, I can’t give many more details on this method for now, but the parameter would most likely be an RPC interface allowing the module manager to send back its data.
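As a rough illustration of the interface above, here is a minimal sketch of the first four methods. Several things in it are assumptions rather than the actual kernel code: PID, ModuleID, KUTF32String and ModuleDescriptor are stand-in definitions, add_boot_module() is a hypothetical helper for registering bootloader images, the string parameter is taken by const reference, the PID is ignored, and the standard library replaces the kernel's own facilities. file_system_ready() is left out since its parameters are still unknown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-ins for illustration only; the kernel's real types will differ.
using PID = std::uint32_t;
using ModuleID = std::uint32_t;       // 0 means "no such module"
using KUTF32String = std::u32string;  // stand-in for the kernel's Unicode string class

struct ModuleDescriptor {
    const void* location;  // NULL for an invalid identifier
    std::size_t size;
};

class ModuleManager {
public:
    // Hypothetical helper: registers an image handed over by the bootloader.
    void add_boot_module(const KUTF32String& name, const void* location, std::size_t size) {
        assert(location != nullptr);  // NULL is reserved for invalid descriptors
        entries.push_back(Entry{name, ModuleDescriptor{location, size}, 0});
    }

    ModuleID request_module(PID, const KUTF32String& name) {
        for (std::size_t i = 0; i < entries.size(); ++i) {
            if (entries[i].name == name) {
                ++entries[i].users;
                return static_cast<ModuleID>(i + 1);  // identifiers start at 1
            }
        }
        return 0;
    }

    ModuleID request_module_ascii(PID pid, const char* name) {
        // "Hand-made" Unicode string: each ASCII byte becomes one code point.
        KUTF32String wide;
        for (; *name != '\0'; ++name) {
            wide.push_back(static_cast<char32_t>(static_cast<unsigned char>(*name)));
        }
        return request_module(pid, wide);
    }

    ModuleDescriptor get_module_descriptor(PID, ModuleID id) const {
        if (id == 0 || id > entries.size()) return ModuleDescriptor{nullptr, 0};
        return entries[id - 1].descriptor;
    }

    void liberate_module(PID, ModuleID id) {
        if (id == 0 || id > entries.size()) return;
        if (entries[id - 1].users > 0) --entries[id - 1].users;
    }

private:
    struct Entry {
        KUTF32String name;
        ModuleDescriptor descriptor;
        unsigned users;
    };
    std::vector<Entry> entries;
};
```

In this simplified version a ModuleID is just a table index plus one, which keeps zero free as the "not found" value. A real implementation would probably hand out per-process handles instead.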
Clarifying module & Unicode management interaction
Here is how I propose to initialize ModuleManager in the absence of proper Unicode string manipulation facilities:
- ModuleManager’s constructor is called, initializes most internal structures properly but interprets filenames as pure ASCII for now (one byte = one code point, no string normalization)
- The constructor then calls the Unicode management initialization procedure (currently called InitializeKString())
- Unicode initialization uses request_module_ascii() and get_module_descriptor() to access the relevant modules, initializes Unicode support, then calls liberate_module() on the modules which aren’t needed anymore
- Control returns to the ModuleManager constructor, which reinitializes module filenames to their proper values and possibly disables the request_module_ascii() method
I think that would be a fairly clean way to work around the missing string management resources in early booting stages.
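The four steps can be sketched as follows. This is a simplified model, not the real code: the signature of InitializeKString() is assumed, the module name "unicode_tables" is made up, PIDs are dropped, and filenames are stored as plain std::string for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class ModuleManager;

// Stand-in for the real Unicode initialization routine; its signature is assumed.
bool InitializeKString(ModuleManager& modules);

class ModuleManager {
public:
    explicit ModuleManager(std::vector<std::string> boot_names)
        : names(std::move(boot_names)), users(names.size(), 0) {
        // Step 1: internal structures are usable, filenames are read as pure ASCII.
        // Steps 2 and 3: Unicode initialization runs on top of this restricted mode.
        unicode_ready = InitializeKString(*this);
        // Step 4: filenames would be re-read as proper Unicode here (omitted),
        // and the ASCII entry point is switched off.
        ascii_mode = false;
    }

    // Returns a nonzero identifier, or 0 if the module is missing or ASCII mode is over.
    unsigned request_module_ascii(const char* name) {
        if (!ascii_mode) return 0;
        for (std::size_t i = 0; i < names.size(); ++i) {
            if (names[i] == name) {
                ++users[i];
                return static_cast<unsigned>(i + 1);
            }
        }
        return 0;
    }

    void liberate_module(unsigned id) {
        assert(id != 0 && id <= names.size() && users[id - 1] > 0);
        --users[id - 1];
    }

    bool unicode_is_ready() const { return unicode_ready; }
    unsigned user_count(unsigned id) const { return users[id - 1]; }

private:
    std::vector<std::string> names;
    std::vector<unsigned> users;
    bool ascii_mode = true;
    bool unicode_ready = false;
};

bool InitializeKString(ModuleManager& modules) {
    // "unicode_tables" is a made-up module name used for illustration.
    unsigned id = modules.request_module_ascii("unicode_tables");
    if (id == 0) return false;
    // The real routine would read the module's data through its descriptor here.
    modules.liberate_module(id);
    return true;
}
```

Once the constructor returns, request_module_ascii() always yields zero in this sketch, which is one possible way to "disable" it.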