So, after discussing a long, and sadly non-exhaustive, list of caveats to C as a language for OS development, many of which were passed on to other languages that tried to mimic C’s design and behavior, an important question remains unanswered: if not C, which languages would I recommend for developing operating systems?
It’s hard to answer this question in general, as the number of programming language designs in existence is truly mind-boggling. But as you will see, applying a few reasonable criteria dramatically narrows the number of languages that fit realistic OSdever constraints well.
Mature and well-tested implementation
Operating systems directly deal with hardware, which is notoriously finicky when it comes to following standards, including its own spec. Consequently, to preserve their sanity, operating system developers must be perfectly sure that when they’ve isolated an incorrect hardware response to a correct software stimulus, the hardware, and not the software implementation, is to blame.
In practice, this means that you don’t want to develop an OS using an obscure or, worse, homemade programming language. The technologies you adopt should have gained enough traction elsewhere that many people have tested them, documented their oddities, and hopefully fixed most of their implementation bugs.
So if you want a shorter list of potential OSdeving languages to parse through, “most popular language” lists like TIOBE’s index would probably be a good place to start.
Proper specifications and no proprietary idiosyncrasies
Some programming languages do not have a public specification managed by a standards organization like ISO or ANSI. Instead, they have a reference implementation, which is mostly documented by a few hundred badly written tutorials scattered across the web, and their creators make it clear at some point that implementation bugs are part of the standard.
Other languages make it worse yet, by burdening that reference implementation with heaps of patents and ridiculously restrictive licensing agreements.
For some applications, this may be fine. But for an OS, again, you want to know what you are doing, which involves having at hand a full description of how your language of choice works. And you’re going to do very weird things with that implementation in the end anyhow, so you don’t want licenses to get in the way at that time.
Stable like a rock
Operating system software is very hard to write, and consequently tends to be long-lived. This means that language compatibility considerations are more important than they would be for your average business application.
So when a language has a known history of massively breaking compatibility with itself, like Python and D did, it should be considered a major red flag as far as OSdeving is concerned. Only proceed with such languages if you have strong assurance that the people designing the language are well aware of the methodological issues that caused such a breach of compatibility, and have taken every measure necessary to prevent such an event in the future.
For an example of how to handle a major language redesign without breaking compatibility, consider Fortran. The language underwent a huge redesign from its 77 edition, designed for punch card systems, to its 95 edition, which removed (or, more exactly, made optional) many of the idiosyncrasies of its older cousin, and added TONS of useful array manipulation features for its HPC niche to boot. Yet Fortran 77 code can be compiled by a Fortran 95 compiler without any modification, and gradually modified to use the new constructs of the improved language.
The need for compatibility is also one more reason why you should be wary of languages that only have a reference implementation and no spec: time passes, people come and go, and language runtimes wither and get replaced. Whereas specs are (almost) eternal.
Low-level and predictable
As I mentioned before, OS software interacts a lot with hardware, and hardware has some weird ways to express itself. In practice, this means that unless you like to mix programming languages gratuitously, your programming language will need to have native support for hardware speak.
Hardware speak mostly involves directly accessing arbitrary RAM addresses, being able to pack data following a specific layout in RAM, and treating everything like a stream of bits to be decoded. If your language candidate has no support for these kinds of tasks, you’re in for a hard time.
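To make these three ingredients concrete, here is a minimal sketch in C, the language this series has been criticizing but which most readers will know. Everything device-specific in it is an assumption made up for illustration: the register bit layout, the descriptor fields, and the names `decode_status`, `read_reg32` and `dma_descriptor` do not describe any real hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit status register layout, invented for illustration:
   bit 0 = ready, bits 1-3 = error code, bits 8-15 = byte count. */
#define STATUS_READY_MASK   0x1u
#define STATUS_ERROR_SHIFT  1
#define STATUS_ERROR_MASK   0x7u
#define STATUS_COUNT_SHIFT  8
#define STATUS_COUNT_MASK   0xFFu

struct device_status {
    int      ready;
    unsigned error_code;
    unsigned byte_count;
};

/* Treat a raw register value as a stream of bits and decode it. */
struct device_status decode_status(uint32_t raw)
{
    struct device_status s;
    s.ready      = (raw & STATUS_READY_MASK) != 0;
    s.error_code = (raw >> STATUS_ERROR_SHIFT) & STATUS_ERROR_MASK;
    s.byte_count = (raw >> STATUS_COUNT_SHIFT) & STATUS_COUNT_MASK;
    return s;
}

/* Read a 32-bit value at an arbitrary address. On real hardware the
   address would come from the device documentation. */
uint32_t read_reg32(uintptr_t address)
{
    return *(volatile uint32_t *)address;
}

/* Hypothetical descriptor that a device expects to find in RAM with an
   exact layout. Fixed-width fields plus compile-time checks make sure the
   compiler did not insert padding behind our back. */
struct dma_descriptor {
    uint32_t buffer_address;
    uint16_t length;
    uint16_t flags;
};

_Static_assert(sizeof(struct dma_descriptor) == 8,
               "descriptor must be exactly 8 bytes");
_Static_assert(offsetof(struct dma_descriptor, length) == 4,
               "length must start at byte 4");
_Static_assert(offsetof(struct dma_descriptor, flags) == 6,
               "flags must start at byte 6");
```

Note that the layout checks use C11’s `_Static_assert`. Forcing a layout that the compiler would not pick naturally requires compiler-specific extensions in C, which is exactly the kind of gap a better OSdeving language should close.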
Another, more subtle aspect of hardware-software communication is that it typically follows very strict protocols, where the right action must be carried out at the right time, or else Strange Things will happen. This means that your language must offer ways to tell your favorite implementation of the day that it should not do whatever it wants with the instructions of the protocol in an attempt to optimize their performance.
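The simplest example of this is the “volatile” keyword that is used, in many OSdeving-friendly languages, to declare variables whose every read and write must actually be performed, in program order, instead of being kept in a register, merged, or optimized away by the compiler. Note that this only constrains the compiler: keeping processor caches and RAM in sync is a separate problem, handled by memory barriers or uncached memory mappings.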
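As a rough illustration, here is how polling a device register might look in C. The UART, its “transmitter ready” bit and the function name `uart_send` are hypothetical; only the role played by `volatile` is the point.

```c
#include <stdint.h>

/* Hypothetical "transmitter ready" bit in the status register. */
#define UART_TX_READY 0x20u

/* Send one byte through a hypothetical memory-mapped UART.
   Both registers are accessed through pointers to volatile data. */
void uart_send(volatile uint8_t *status_reg,
               volatile uint8_t *data_reg,
               uint8_t byte)
{
    /* Because *status_reg is volatile, the compiler must re-read it on
       every iteration. Without the qualifier, it would be allowed to read
       the register once and spin forever on a stale value. */
    while ((*status_reg & UART_TX_READY) == 0) {
        /* busy-wait until the device reports it is ready */
    }

    /* This write must be emitted, and must stay after the reads above,
       even though nothing in the program ever reads *data_reg back. */
    *data_reg = byte;
}
```

On real hardware, the two pointers would be built from addresses given in the device documentation. The function takes them as parameters here so that it can also be exercised against ordinary variables.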
Keep the runtime lean
Nowadays, there is a growing tendency for programming languages to require bloated runtimes to run. Runtimes which have large processing overheads for seemingly simple operations, use up lots of memory, and most of all, LOVE to call the standard library functions of another language like C.
Obviously, for OSdeving, inefficient runtimes are a no-go. Many common OS tasks like interrupt handling and task switching must be done in a fraction of a millisecond, which may prove hard to achieve if even accessing numerical variables goes through one layer of indirection.
And as for standard libraries, keep in mind that they usually rely precisely on the kind of core OS functionality that you are building to begin with. So they won’t be available for a long time during your OSdeving process.
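To give an idea of what living without a standard library means, here is a sketch of two memory helpers that a C kernel typically has to provide for itself. The names `kmemset` and `kmemcpy` are hypothetical; the code relies only on `<stddef.h>` and `<stdint.h>`, which are among the few headers that the C standard guarantees even in a freestanding (OS-less) environment.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill `count` bytes at `dest` with `value`, one byte at a time. */
void *kmemset(void *dest, int value, size_t count)
{
    unsigned char *d = dest;
    for (size_t i = 0; i < count; i++) {
        d[i] = (unsigned char)value;
    }
    return dest;
}

/* Copy `count` bytes from `src` to `dest`.
   As with the standard memcpy, the two regions must not overlap. */
void *kmemcpy(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    for (size_t i = 0; i < count; i++) {
        d[i] = s[i];
    }
    return dest;
}
```

These byte-by-byte loops are deliberately naive. A language whose runtime silently depends on such functions, or on much heavier ones like a memory allocator, forces you to write them before anything else can run.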
Abstract and modular
Since operating systems are complex software, special care must be taken to keep their code to a very high standard of cleanliness. In computer programming, cleanliness is achieved through a mixture of abstraction and modularization. So when picking a programming language for OS development, you should keep an eye on its suitability for large-scale development: code module system, functions, structures, classes, interfaces, metaprogramming…
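As an illustration of what an “interface” buys you in an OS, here is a sketch of a driver abstraction emulated by hand in C with a table of function pointers. The `block_device` interface and the toy RAM disk behind it are hypothetical. Languages with proper interfaces or classes express the same idea with far less boilerplate and more compile-time checking.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface that every block device driver implements. */
struct block_device_ops {
    int (*read_block)(void *driver_state, uint32_t block,
                      uint8_t *out, size_t size);
};

/* A device is an implementation of the interface plus its private state. */
struct block_device {
    const struct block_device_ops *ops;
    void *driver_state;
};

/* Generic code only talks to the interface, never to a specific driver. */
int block_device_read(const struct block_device *dev, uint32_t block,
                      uint8_t *out, size_t size)
{
    return dev->ops->read_block(dev->driver_state, block, out, size);
}

/* One possible implementation: a toy RAM disk. */
struct ramdisk {
    uint8_t *storage;
    size_t   block_size;
    uint32_t block_count;
};

int ramdisk_read_block(void *driver_state, uint32_t block,
                       uint8_t *out, size_t size)
{
    struct ramdisk *disk = driver_state;
    if (block >= disk->block_count || size < disk->block_size) {
        return -1; /* invalid block number or output buffer too small */
    }
    for (size_t i = 0; i < disk->block_size; i++) {
        out[i] = disk->storage[(size_t)block * disk->block_size + i];
    }
    return 0;
}

const struct block_device_ops ramdisk_ops = { ramdisk_read_block };
```

A file system written against `block_device_read` would then work unchanged with any other driver that fills in the same table.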
Safe for uncomfortable use
Developing an OS is also pretty difficult, at every stage of the process, so any help from the language is appreciated. Features that offer protection against common programming mistakes like off-by-one array manipulation, language-provided guarantees on some pieces of code (e.g. “if you execute this code several times, it will always yield the same result” in functional languages), and formal analysis of developer-stated intents are worth their weight in gold to developers of an OS’ most critical parts.
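To show what protection against off-by-one mistakes amounts to, here is a sketch of a bounds check written by hand in C. The `checked_buffer` type and `checked_write` function are hypothetical. In a language like Ada, an equivalent check is performed for you on array accesses, so it cannot be forgotten.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A buffer that carries its own length, so that accesses can be checked. */
struct checked_buffer {
    uint8_t *data;
    size_t   length;
};

/* Write `value` at `index`, refusing any out-of-bounds access.
   Returns true on success and false if the index is invalid. */
bool checked_write(struct checked_buffer buf, size_t index, uint8_t value)
{
    if (index >= buf.length) {
        return false; /* catches the classic "index == length" off-by-one */
    }
    buf.data[index] = value;
    return true;
}
```

The weakness of the manual approach is that nothing stops other code from bypassing `checked_write` and indexing `data` directly, which is why having the language enforce the check is so valuable.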
Wrapping it up…
Let’s take a quick tour of TIOBE’s 100 most popular programming languages with these criteria in mind…
- Very high-level, runtime-heavy imperative languages like Python, Java or C# don’t stand a chance against an OS’ efficiency, low-level access, and predictability requirements. Specialized dialects are sometimes designed to address this issue, as with Sing#, the C# dialect used in Microsoft’s Singularity OS project.
- As operating systems must do many different things, domain-specific languages like R, SQL or MATLAB would probably not be a decent fit. Well, I *did* program an RPC message-driven client-server infrastructure in a scriptable data processing software, but I wouldn’t recommend that to any sane person.
- Functional languages like Haskell, though very interesting for tasks that are devoid of side-effects, easily become clunky and break down when used to develop low-level software. By its very nature, anything that interacts with hardware needs to have plenty of side-effects on the machine to be of any use. They also usually only offer limited ways to structure a program. Similar concerns apply to dataflow languages like LabVIEW and Verilog/VHDL.
- Once you’ve gotten these categories out, what remains is mostly older imperative programming languages with limited ability for programming in the large (the “other C’s”). Among those, Ada and Go stand out to me as particularly interesting for OSdeving, due to their higher-level abstract constructs for proper program structuring and relatively light runtime requirements.
- But Go officially stopped supporting access to arbitrary memory addresses in its latest revisions, in an attempt to make its GC design simpler and more accurate. So this effectively kills it for OS development. Ada remains, though, and is probably what I’d pick if I started this project again today.