To start this new series of posts, I will attempt to review a programming language which I’ve had the chance to try at work this month. What, you may ask, does this have to do with OS development ? Well, two things. First, I believe that this language could be the first in a long while to offer serious competition to C and C++ in the area of system programming, which decidedly lacks some programming language love. Second, as I shall discuss in more detail over the course of this post, its design presents some interesting ideas that could, in particular, be applied to the RPC concept which has been bothering me a lot lately, as it’s close on my implementation roadmap but still far from having a finished, fully satisfactory design.
Why is C++ in need of a successor ?
OSdeving is a good way to benchmark what a programming language is truly capable of. It presents developers with a wide variety of programming challenges, from low-level interaction with hardware and strong performance constraints to large project management and dynamic linking with a wide variety of application languages. All these newer languages that sacrificed low-level and high-performance capabilities for a bit of developer convenience, like Java and Python, miserably fail the OSdeving test, since they basically need the support of an underlying mini-OS to work. Older languages like C and Assembly can do the job, but only at the price of intense suffering, coming from their highly restricted feature set and the way they constantly trap developers into making mistakes that are nearly impossible to debug. There are few options to pick between these two extremes.
I find myself choosing C++ for the job, now that I have some experience with it, but I would not call myself satisfied yet. C++ is a nice option for OSdeving, as it couples lots of expressiveness and powerful capabilities with few runtime requirements, link-ability with C, and efficient binaries by today’s standards. But it is also quirky and bloated with features. It looks like when Stroustrup created the language, he just went around asking his friends about all the features that they would like to have in C, then included everything plus the kitchen sink in a (logically) half-designed package. No one in the world can claim to understand and master the whole of C++, with its huge unreadable spec and heaps of common mistakes (read : developer traps), and I think that it is a shame. So, I thought, why not have a look at this new baby of Rob Pike and Ken Thompson (of UNIX, Plan 9 and UTF-8 fame) that claims to finally offer a decent general-purpose alternative to the most powerful and despicable language of the last century ?
What is this Go thing about ?
Since Go is still a rather young programming language at the time of writing, the best place to find information about it is undoubtedly the golang website. Maintained by the creators of the language, it features a wealth of resources about the language, including…
- The current version of the language spec, which is surprisingly short and easy to read
- A very in-depth tutorial and lots of tips on how to make the most out of the language
- Ready-to-use binary distributions and a “cloud” compiler
- Reference documentation for most of the available go packages
- Various community resources, such as a mailing list and an IRC channel
If you want to get familiar with the whole thing, I strongly recommend the tutorial as a starting point, as it is quickly completed and gives a nice overview of the language without requiring you to install anything. But for this post, I shall just quickly explain what the language’s main advantages and drawbacks are, in my highly subjective opinion.
The sexy part :
- Overall, the language is easy to learn, being only slightly more complex than C yet much more powerful
- The syntax is very clean : there is no need for trailing semicolons, except to separate several statements written on the same line, and variable declarations and assignments are less ambiguous than they can get in C.
- Go does not have headers in a C sense, nor an “implementation section” like in Pascal. Code is instead organized in packages, and the export of variables and methods is controlled by identifier capitalization (more on that later).
- Interfaces offer an interesting take on the object concept. Basically, an interface is a set of methods, and any type which implements these methods satisfies the interface. Together with the ability to easily embed all the fields and methods of one or more types into another type, this object model solves, in a more elegant way, problems which are usually dealt with through complex inheritance trees, templates and multiple inheritance.
- Functions can return multiple results, and the caller can easily keep only the results it needs and discard the rest with the blank identifier “_”. This can be used to manage error codes, among other things. An exception-like mechanism (“panic” and “recover”) is also available for those who prefer that way of doing things.
- The “defer” keyword, which postpones the execution of a function call until the surrounding function returns, is a very elegant way to release system resources such as mutexes after they have been acquired.
- Go also features “goroutines” : if a function call is preceded by the “go” keyword, it will be executed asynchronously. That’s probably the cleanest platform-independent way to do concurrent programming which I have seen in a programming language to date.
- Switches accept any kind of boolean cases (such as “x < 15”) and do not fall through by default (no need for tons of “break” statements). There are also range-based for loops and Unicode strings, as can be expected from a modern language. Fixed-size integers are part of the spec to begin with, unlike in C/C++ where they belong to newer language revisions which not all compilers support.
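To make a few of these points more concrete, here is a minimal sketch touching on interfaces, embedding, multiple return values, “defer” and boolean switches. All the names in it (Shape, Rect, NamedRect, Divide, Counter, Classify) are made up for illustration and do not come from any Go package.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Shape is an interface: any type with an Area method satisfies it,
// without having to declare so explicitly.
type Shape interface {
	Area() float64
}

// Rect satisfies Shape simply by having the right method.
type Rect struct {
	W, H float64
}

func (r Rect) Area() float64 { return r.W * r.H }

// NamedRect embeds Rect, so it gets Rect's fields and methods
// (and therefore also satisfies Shape) without any inheritance tree.
type NamedRect struct {
	Rect
	Name string
}

// Divide returns both a result and an error value.
func Divide(a, b int) (int, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

// Counter illustrates defer: Unlock runs when Increment returns,
// whichever return path is taken.
type Counter struct {
	mu sync.Mutex
	n  int
}

func (c *Counter) Increment() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
	return c.n
}

// Classify uses a switch with boolean cases and no implicit fallthrough.
func Classify(x int) string {
	switch {
	case x < 0:
		return "negative"
	case x < 15:
		return "small"
	default:
		return "large"
	}
}

func main() {
	var s Shape = NamedRect{Rect{2, 3}, "door"}
	fmt.Println(s.Area())

	q, err := Divide(7, 2)
	fmt.Println(q, err)
	_, err = Divide(1, 0) // the blank identifier discards the first result
	fmt.Println(err)

	var c Counter
	c.Increment()
	fmt.Println(c.Increment())
	fmt.Println(Classify(10))
}
```

Note that NamedRect never mentions Shape : the compiler works out on its own that the method set matches.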
The unimpressive part :
- Some core language features make extensive use of dynamic memory allocation. Therefore, it is likely impossible to write a proper OS kernel in Go. Depending on other runtime requirements, one could, however, imagine low-level Go services running on top of a thin C/C++ microkernel…
- …but the runtime requirements of a Go program are far from clearly stated at the moment.
- Go links reasonably easily with C libraries, but linking with C++ through SWIG is much harder (as can be expected). Therefore, the language may suffer from its young age in such object-heavy areas as GUI programming, where most libraries use C++ at the moment.
- Go has been designed by one of those garbage collection advocates who won’t let you manually free memory. This can result in unpredictable behaviour in low-memory conditions or under stringent real-time constraints. The GC can be manually triggered though, which is something, and so far I have not been able to expose unacceptably strong GC overhead in my test programs.
- Go also sometimes goes a bit too far in the way of simplicity for my taste, such as when it rejects object constructors (“better build objects that initialize as a bunch of zeroes” being used as an excuse) or destructors (“the garbage collector will take care of everything”). Making identifier capitalization meaningful breaks some valid coding conventions, and does not necessarily scale well to writing systems that do not have capitalization (such as Chinese).
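As a rough illustration of the last two points, the sketch below shows what typically stands in for a constructor (an ordinary function returning an initialized value), how capitalization decides what is exported, and how a collection can be requested by hand through runtime.GC. Buffer and NewBuffer are hypothetical names picked for the example.

```go
package main

import (
	"fmt"
	"runtime"
)

// Buffer is exported (capital B); its field data is not (lowercase d),
// so other packages could not touch it directly.
type Buffer struct {
	data []byte
}

// NewBuffer plays the role of a constructor: it is just an ordinary
// function that returns an initialized value.
func NewBuffer(size int) *Buffer {
	return &Buffer{data: make([]byte, size)}
}

// Len reports the size of the buffer in bytes.
func (b *Buffer) Len() int { return len(b.data) }

func main() {
	b := NewBuffer(1 << 20)
	fmt.Println(b.Len())

	// There is no manual free: we can only drop the reference...
	b = nil
	// ...and explicitly ask the runtime for a collection.
	runtime.GC()

	// Without a constructor, a value starts out as "a bunch of zeroes".
	var zero Buffer
	fmt.Println(zero.Len())
}
```

runtime.GC only requests a collection at that point; it gives no control over which objects are released, so it is not a substitute for a destructor.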
Overall, I’ve got to say that I like it. Given a bit of love, it could well replace C++ in my heart as my programming language of choice above kernel level. But this won’t happen before the Go team has at least clarified the runtime requirements of Go programs. I’m telling them about the issue right now, and I’ll keep you posted if I get some satisfactory answers.
And these interesting ideas for RPC that you mentioned earlier ?
Go’s development was initially funded by Google for internal use, so one of its design goals was to ease development of massively parallel software. If this reminds you of Erlang, you are not alone, and similarities do not end there. Go uses a lightweight threading mechanism called “goroutines”, in which one can easily schedule asynchronous tasks by putting a “go” prefix before the associated function call. The Go runtime’s scheduler then determines whether it is relevant to run these tasks in parallel or not.
Goroutines communicate with each other through a message passing mechanism called channels, which is used to pass objects of a defined type between two or more code paths. By default, channels are synchronous : if a piece of data is sent into a channel, the “sender” goroutine blocks until the data has been taken out of the channel by a “receiver” goroutine. Conversely, listening to an empty channel blocks the receiver. Go’s designers advocate using this behaviour as a synchronization mechanism, but channels can also be allowed to buffer a number of objects before blocking senders, much like UNIX pipes work.
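Here is a small sketch of both flavours, under the assumption that a trivial summing task is representative enough. Sum and FillBuffered are hypothetical helper names, not standard library functions.

```go
package main

import "fmt"

// Sum sends the total of nums into out. With an unbuffered channel,
// the send blocks until a receiver takes the value.
func Sum(nums []int, out chan<- int) {
	total := 0
	for _, n := range nums {
		total += n
	}
	out <- total
}

// FillBuffered sends n values into a channel of capacity n.
// No receiver is needed: sends only block once the buffer is full.
func FillBuffered(n int) chan int {
	ch := make(chan int, n)
	for i := 0; i < n; i++ {
		ch <- i
	}
	return ch
}

func main() {
	out := make(chan int) // unbuffered, hence synchronous
	go Sum([]int{1, 2, 3}, out)
	fmt.Println(<-out) // blocks until Sum has sent its result

	ch := FillBuffered(3)
	fmt.Println(len(ch), cap(ch)) // pending messages, buffer capacity
}
```

In the unbuffered case, the receive in main doubles as a synchronization point : once it returns, the goroutine is known to have finished its computation.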
Finally, an interesting channel-related core language feature of Go is the “select” statement. Structured like a switch statement, it allows a goroutine to listen to several channels at the same time (one is picked at random if messages are pending in several channels), or, through the “default” case, to simply poll the channels and move on if none of them has an incoming message. Quite an interesting way to implement web servers, if you ask me.
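A minimal sketch of the polling variant could look like this, with Poll being a made-up helper name :

```go
package main

import "fmt"

// Poll checks two channels without blocking and reports which one,
// if any, had a message pending.
func Poll(a, b <-chan string) string {
	select {
	case m := <-a:
		return "a: " + m
	case m := <-b:
		return "b: " + m
	default:
		// Neither channel has a pending message: do not block.
		return "idle"
	}
}

func main() {
	a := make(chan string, 1)
	b := make(chan string, 1)

	fmt.Println(Poll(a, b)) // nothing pending yet

	b <- "hello"
	fmt.Println(Poll(a, b)) // the message on b is picked up
}
```

Removing the “default” case turns the same function into a blocking wait on whichever channel delivers first.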
But this is where I’m heading into IPC-related considerations. RPC with shared memory works much like a goroutine, or any other member of the thread family for that matter. You start a task asynchronously, and then you go do something else without worrying about how and when the task will complete if you do not need to. In fact, the “go” statement could make some fine syntactic sugar for RPC with a few adjustments, but that’s not what I’m thinking about right now. The interesting part, for me, is that the Go and Erlang designers both encountered some of the problems that I discussed recently (returning results and errors in asynchronous routines), and they ended up opting for a message passing scheme to solve them. This gives some weight to this option, so it is probably a path which I should consider more carefully.
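To illustrate how message passing can carry results and errors back from an asynchronous call, here is one possible sketch. It is not how any particular RPC system does it : Reply, CallAsync and ErrRemote are hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrRemote is a made-up error used to simulate a failed call.
var ErrRemote = errors.New("remote failure")

// Reply bundles the result and the error of an asynchronous call.
type Reply struct {
	Value int
	Err   error
}

// CallAsync starts work in a goroutine and returns immediately.
// The caller reads the Reply from the channel whenever it wants to.
func CallAsync(work func() (int, error)) <-chan Reply {
	// Capacity 1: the worker can deliver its reply and exit
	// even if the caller never reads it.
	ch := make(chan Reply, 1)
	go func() {
		v, err := work()
		ch <- Reply{Value: v, Err: err}
	}()
	return ch
}

func main() {
	ok := CallAsync(func() (int, error) { return 42, nil })
	bad := CallAsync(func() (int, error) { return 0, ErrRemote })

	// The caller is free to do something else here.

	fmt.Println((<-ok).Value)
	fmt.Println((<-bad).Err)
}
```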
It should be noted, though, that message passing in itself solves neither the issue of interrupting asynchronous tasks, nor that of easily probing task completion when no results are needed. Both issues could be addressed through the use of standardized message passing channels for these purposes though : as an example, one “signal channel” could be used by clients to send requests to servers about running tasks, while another “completion channel” could be used by servers to notify clients of task completion. This, however, does not solve the issue of what happens to these channels when clients are not interested in using them or caring about them. I have to admit that garbage-collected languages have an edge when it comes to solving these problems.
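The “signal channel” and “completion channel” idea could be sketched as follows. This is only one way to lay it out, and Task, Start and the “cancel” request string are assumptions made for the example. Both channels are given a buffer of one message so that neither side gets stuck when the other one does not care.

```go
package main

import "fmt"

// Task is a hypothetical handle on a running asynchronous job.
type Task struct {
	Signal chan<- string // client-to-server requests, such as "cancel"
	Done   <-chan bool   // true on completion, false if cancelled
}

// Start runs work for each step in a goroutine, checking the signal
// channel between steps.
func Start(steps int, work func(step int)) Task {
	signal := make(chan string, 1)
	done := make(chan bool, 1)
	go func() {
		for i := 0; i < steps; i++ {
			select {
			case req := <-signal:
				if req == "cancel" {
					done <- false
					return
				}
			default:
				// No request pending: carry on.
			}
			work(i)
		}
		done <- true
	}()
	return Task{Signal: signal, Done: done}
}

func main() {
	// A client that only cares about completion.
	finished := Start(3, func(int) {})
	fmt.Println(<-finished.Done)

	// A client that cancels: the work function waits on release,
	// so the request is certain to be seen before the second step.
	release := make(chan bool)
	cancelled := Start(3, func(int) { <-release })
	cancelled.Signal <- "cancel"
	close(release)
	fmt.Println(<-cancelled.Done)

	// A client that ignores both channels: nothing blocks, and the
	// channels are simply garbage-collected once unreferenced.
	Start(3, func(int) {})
}
```

The last call is where garbage collection earns its keep : in a language with manual memory management, someone would have to decide who frees the two channels.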
And that’s it for today ! As usual, feel free to share your thoughts !