In a way, I’m glad I have a few remaining topics for this summer update. That’s because publishing my preliminary RPC test results has sparked an unexpected but very interesting conversation with Brendan about the theoretical foundations of this IPC mechanism, which I thought were settled by now. If his criticism turns out to be legitimate, I’m ready to give up on some of my killer features and spend a few more months in the design stage. But heh… It’s not as if I weren’t prepared to spend a few more months designing stuff if it means a more polished final product. This is OS development we’re talking about, after all… If I wanted instant gratification with a minimal amount of effort, I would be developing fart apps for mobile devices.
Going further into the “long-term thoughts” part of this summer update, I’ll now bring up some long-term ideas about how I’d like to do graphics in the early days of this OS’ graphics stack, using the VESA BIOS Extensions, even though I’ll obviously keep the option to do things otherwise later. I’ll notably explain what motivates this choice, what the dirty implications are, and how I plan to live with them. I’ll also add some design thoughts that are not VESA-specific and could well survive a possible later switch to a GPU-based architecture.
Ode to the joy of modern graphics
Graphics are a mess, both on IBM compatibles and on ARM hardware. Every GPU manufacturer in this world has forgotten what the word “standards” means, and the only thing that allows user-space software to run consistently across multiple pieces of hardware is abstraction layers implemented in software. And when I say “consistently”… Well… You know why consoles and Apple iDevices are so popular among game developers as compared to desktop PCs and Android devices respectively, don’t you?
The situation is least bad in the x86 world, because this architecture comes from a time when hardware manufacturers actually cared about low-level developers and DRM crap didn’t exist, a time when hardware specs were opened up and it was common to write software on the bare metal. Those days are certainly long gone, but that doesn’t mean there aren’t some useful remains around. One of those remains is a culture of openness, which leads Intel and AMD to consistently release development manuals for the hardware they produce (leaving only NVidia as the odd duck out). Another is legacy standards, like the VESA BIOS Extensions (VBE).
When you begin in the world of alternative OSs, you are pretty much alone with your keyboard and your chair. As such, there is strictly no way you’ll manage to muster the manpower it takes to write a range of hardware-specific drivers for every GPU in existence, then rewrite them each time a new chipset family comes out because manufacturers enjoy breaking stuff for the fun of it. Even mature codebases like Linux, with thousands of developers, don’t manage to get it right. It’s simply impossible without having the manufacturer write the driver for you, and even that won’t save you, considering that according to Microsoft’s statistics, manufacturer-provided GPU drivers are the main source of crashes on Windows, the OS for which they are mainly developed.
As such, you want to avoid relying on GPU-specific drivers as long as possible, and wait until your OS is mature enough that it can attract the workforce it takes to (vainly attempt to) handle this hell. This is where legacy standards come into play.
What VBE does, what its limitations are, and why hobby OSdevers like me want it anyway
As hopelessly disorganized as the world of low-level computer graphics may be, there actually are some standards bodies within it. The most prominent and ancient of those is VESA, the Video Electronics Standards Association. They are the ones behind the DisplayPort monitor connector and the DDC/EDID protocols which modern graphics hardware and operating systems use to exchange information about computer displays. But they’ve also created a technological wonder for hobby OSdevers: the VESA BIOS Extensions (VBE). These are standard BIOS features, supported by virtually every PC GPU out there, which notably bring the following attractive features to the table:
- Detect pretty much everything about screens which are attached to the computer
- Switch the main screen to a high-resolution graphics display mode and draw stuff on it on a per-pixel basis, with access to fancy functionality such as triple buffering.
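For the curious, the BIOS entry points behind those two bullets are interrupt 10h with AX=4F00h (get controller information), AX=4F01h (get mode information), and AX=4F02h (set a video mode). The controller information arrives in a 512-byte block whose layout (as of VBE 2.0+) can be sketched as follows; Python is used here as a convenient way to describe the data layout, not as kernel code:

```python
import struct

# Layout of the VBE controller information block (VBE 2.0 and later).
# Real-mode code fills it by issuing INT 10h with AX=4F00h and ES:DI
# pointing at a 512-byte buffer (whose first 4 bytes are preset to
# "VBE2" to request the extended 2.0+ fields).
VBE_INFO_FORMAT = "<" + "".join([
    "4s",    # VbeSignature: "VESA" on return
    "H",     # VbeVersion, BCD (e.g. 0x0300 for VBE 3.0)
    "I",     # OemStringPtr (real-mode far pointer)
    "I",     # Capabilities bitfield
    "I",     # VideoModePtr: far pointer to a 0xFFFF-terminated mode list
    "H",     # TotalMemory, in 64 KB units
    "H",     # OemSoftwareRev
    "I",     # OemVendorNamePtr
    "I",     # OemProductNamePtr
    "I",     # OemProductRevPtr
    "222s",  # Reserved for future VBE revisions
    "256s",  # OEM scratchpad
])

def parse_vbe_info(raw):
    """Unpack the 512-byte block returned by the BIOS into a dict."""
    fields = struct.unpack(VBE_INFO_FORMAT, raw)
    return {
        "signature": fields[0],
        "version": fields[1],
        "video_mode_ptr": fields[4],
        "total_memory_kb": fields[5] * 64,
    }
```

The mode list pointed to by `VideoModePtr` is then fed, mode by mode, to function 4F01h to discover each mode’s resolution, pitch, and framebuffer address, and a mode is finally set with 4F02h (setting bit 14 of the mode number requests the linear framebuffer).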
That’s huge news, because that’s what most OSdevers want when they begin to design a GUI. After all, the “computational” parts of GPUs are only good for very specific applications (gaming, heavy 3D rendering…) that hobby OSs are unlikely to attract early in their life. For normal GUI operation, we only want a gigantic buffer to draw stuff in, and VBE provides just that.
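To illustrate how simple that “gigantic buffer” model really is: once a linear-framebuffer mode is set, per-pixel drawing boils down to an offset computation. The sketch below uses a bytearray standing in for the mapped framebuffer, and assumes a hypothetical 640×480 mode at 32 bits per pixel:

```python
# Hypothetical 640x480x32 XRGB mode. In a real driver, width, height and
# pitch come from the VBE mode information block, and the framebuffer is
# the mapped physical memory region the BIOS reports.
WIDTH, HEIGHT, BPP = 640, 480, 4
PITCH = WIDTH * BPP   # caution: the real pitch may be larger than width*bpp

framebuffer = bytearray(PITCH * HEIGHT)

def put_pixel(x, y, xrgb):
    """Write one pixel: offset = y * pitch + x * bytes_per_pixel."""
    offset = y * PITCH + x * BPP
    framebuffer[offset:offset + 4] = xrgb.to_bytes(4, "little")

def fill_rect(x0, y0, w, h, xrgb):
    """Everything else (fonts, window borders...) reduces to loops of this."""
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            put_pixel(x, y, xrgb)
```

Note the pitch caveat: drivers routinely pad scanlines, so real code must use the BytesPerScanLine value from the mode information block rather than assuming `width * bytes_per_pixel`.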
So, where’s the catch? Well, VBE is growing old, and with age come the following issues:
- 16-bit code: Like most BIOS functionality, VBE works best when the processor is running in the super-legacy 16-bit “real” mode. Newer versions of the standard offer a hackish 32-bit interface to this 16-bit code, but all in all the logic remains that of real mode, and it still won’t run in 64-bit mode, which is what all modern processors should be running in. What this means is that accessing VBE functionality requires either temporarily switching the CPU back to real mode (only acceptable in the early stages of kernel boot) or emulating VBE’s real-mode code on top of the running 64-bit kernel (possible at any time, but requiring some work). Without the ability to execute real-mode code, all you can do is draw on a display that was already initialized.
- Poor manufacturer support: VBE has adapted itself to the spread of wide screens by adopting a model where video chip manufacturers themselves specify which video modes (resolution, refresh rate, bits per pixel) are available in hardware. However, manufacturers have not fully kept up with this innovation, because by that point the “one driver per device” model had become the norm and they didn’t feel like doing their homework of properly supporting old standards. What this means is that frequently, only 4:3 video modes and very low-res wide modes are available, and software must make a tough choice between a painfully low-res display and a distorted one which requires software adjustments before circles stop looking elliptical.
- No multiple displays: Historically, VBE only supported computers with a single display. At some point, multiple displays became common enough that VESA couldn’t ignore them any more. But instead of adding “extended” video mode manipulation commands that would let a developer control the video mode of each screen, VESA chose the following “solution”: controlling one display at a time, while advertising the video modes of both. I can’t stress enough how horribly ugly this solution is, but for now suffice it to say that VBE doesn’t work well with multiple displays.
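The distortion problem from the second point can be quantified. When a 4:3 mode is stretched over a wide panel, each physical pixel appears wider than it is tall, and software must compensate by exactly that factor when drawing. A small sketch of the arithmetic (function names are mine):

```python
from fractions import Fraction

def pixel_aspect_ratio(mode_w, mode_h, panel_w_ratio, panel_h_ratio):
    """How much wider than tall each pixel appears when a mode_w x mode_h
    mode is stretched over a panel of shape panel_w:panel_h (e.g. 16:9).
    A value of 1 means square pixels, i.e. no distortion."""
    return Fraction(panel_w_ratio, panel_h_ratio) / Fraction(mode_w, mode_h)

def correct_ellipse(vertical_radius_px, par):
    """To draw a circle with the given vertical radius, use this many
    pixels horizontally so it doesn't come out elliptical on screen."""
    return round(vertical_radius_px / par)
```

So a 1024×768 mode on a 16:9 panel yields a pixel aspect ratio of 4/3: every horizontal distance must be drawn at 3/4 of its nominal size to look right, on top of the blurriness the stretching already causes.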
So, why do hobby OSdevers want VBE anyway, in spite of all of its flaws? Because it works all the time, and provides both a nice graphics driver to begin with and a nice fallback driver for when more sophisticated drivers go mad. And they will go mad, because a small alternative OS will never have the workforce or the influence on manufacturers it takes to keep drivers properly up to date with the evolution of hardware. Heck, NVidia don’t even disclose their specs, and all of the Linux world’s graphics teams are still not enough to produce a GPU-based graphics stack that doesn’t feel very rough around the edges.
Some graphics stack ideas
Now, all this talk about VESA is nice and all, but which video drivers are initially used is close to an implementation detail, because a good driver interface abstracts away the hardware or standard being used to draw things on screen. So what if we talked about what kind of graphics stack this could take place in?
Layer 1: Graphics driver
At the bottom of the stack is the graphics hardware. It essentially provides us with two things: a way to display video frames on the screen (the framebuffer), and a range of high-performance hardware tools for bitmap and 3D graphics manipulation. As we have seen previously, everyone can access a significant part of the former functionality in a standard way, while the latter requires dedicated drivers.
Operating systems which can afford the luxury of dedicated drivers for each piece of hardware may assume their presence and treat both sets of functionality as one unified device. Windows and Mac OS X can; some Linux distros think they can. We definitely can’t, so we have to ask that drivers provide two standard interfaces: a mandatory one for exchanging information about the screen, setting video modes, and displaying video frames in a standard format (kind of like the Linux framebuffer), and an optional one for all the accelerated stuff (OpenGL is an option, but it makes drivers heavy and causes lots of code duplication; Gallium3D could be a better candidate there).
All services that are required for the system to work properly must work when only the former interface is available. Non-critical stuff, like 3D games, may depend on the latter. In the future, the non-accelerated VESA driver might also emulate the accelerated interface, if there is demand for it, but it must be stressed that it will always be dog slow and that anything remotely complicated won’t work well on top of it. Try running some GPU-accelerated applications with direct rendering disabled on Linux to see what it might be like.
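A minimal sketch of that two-interface split, with hypothetical names (Python used here as executable pseudocode; the real interfaces would of course be defined at the kernel/IPC level):

```python
# Every driver must implement the framebuffer interface; the accelerated
# interface (here just an attribute that may be None) is optional.
class FramebufferInterface:
    def get_modes(self): ...      # screen information exchange
    def set_mode(self, mode): ... # video mode setting
    def present(self, frame): ... # display a frame in a standard format

class VesaDriver(FramebufferInterface):
    """Minimal driver in the VESA spirit: mandatory interface only."""
    accel = None                  # no accelerated interface available

    def __init__(self):
        self.mode = None
        self.frames_shown = 0

    def get_modes(self):
        return [(640, 480, 32), (800, 600, 32)]   # hypothetical mode list

    def set_mode(self, mode):
        assert mode in self.get_modes()
        self.mode = mode

    def present(self, frame):
        self.frames_shown += 1    # a real driver would blit to the screen

def supports_3d(driver):
    """System services must run when this is False; only non-critical
    software like 3D games may require it to be True."""
    return getattr(driver, "accel", None) is not None
```

The design point is that the mandatory interface is small enough that even the dumbest standard-based driver can implement it, while anything optional degrades gracefully when absent.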
Layer 2: Window manager
Processes, including system ones, share a number of screens (typically one) on which their graphical output is displayed. And as in human societies, each time something is shared in the computer world, arbitration is required so that no one abuses their rights to the shared object. This is why the window abstraction exists: a number of private bitmaps that a running program may use to display its visual output without worrying about what other programs are doing in their own corners.
The job of the window manager is to combine all these private windows into a single screen image, centrally arbitrating things like who is displayed, who is on top of whom, etc. It also performs the reverse conversion, determining which window an input event belongs to and handing it appropriately to the widget toolkit. It does not take care of the window controls (e.g. the close button), which are part of the desktop shell instead (see the relevant entry below for a longer explanation).
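Both directions of that job, compositing output and routing input, can be sketched as follows, assuming simple opaque rectangular windows kept in bottom-to-top order:

```python
class Window:
    """A process's private bitmap plus its position on the shared screen."""
    def __init__(self, x, y, w, h, pixels):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.pixels = pixels   # rows of pixel values, private to one process

def compose(windows, screen_w, screen_h, background=0):
    """Painter's algorithm: blit windows back-to-front into one image.
    `windows` is ordered bottom to top, so later windows overdraw earlier."""
    screen = [[background] * screen_w for _ in range(screen_h)]
    for win in windows:
        for dy in range(win.h):
            for dx in range(win.w):
                sx, sy = win.x + dx, win.y + dy
                if 0 <= sx < screen_w and 0 <= sy < screen_h:
                    screen[sy][sx] = win.pixels[dy][dx]
    return screen

def window_at(windows, x, y):
    """The reverse conversion: topmost window under an input event, if any."""
    for win in reversed(windows):
        if win.x <= x < win.x + win.w and win.y <= y < win.y + win.h:
            return win
    return None
```

A real window manager would obviously work on pixel buffers in (video) memory and deal with transparency, damage tracking, and so on; the point here is only the shape of the two mappings.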
Layer 3: Widget toolkit
The widget toolkit’s purpose is to manage standard UI controls, like buttons, canvases, or tabs. Give it a window, the incoming input events, and a resolution-independent spec sheet describing which controls are to be put on it, and it will manage all their “reflex reactions” (hover effects, popup menus appearing and disappearing…) and dispatch application-managed events to the relevant application (e.g. if a button is clicked, the widget toolkit will manage the button’s sunken appearance and notify the application that button X has been clicked through a standardized IPC message, which I currently envision as a remote call).
In circumstances which require it (e.g. games), applications may of course still manage their UI themselves. A typical way to do this would be to use only a canvas that fills the window, and redirect all UI events associated with it to the application.
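A toy version of this division of labour, with the IPC/remote-call notification replaced by a plain callback for illustration:

```python
class Button:
    """One toolkit-managed control: the toolkit handles the visual
    'reflex' (the sunken look) itself, and only forwards the semantic
    event to the application."""
    def __init__(self, x, y, w, h, on_click):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.on_click = on_click   # stand-in for the IPC/remote-call message
        self.pressed = False

    def contains(self, px, py):
        return (self.x <= px < self.x + self.w and
                self.y <= py < self.y + self.h)

def dispatch_click(widgets, x, y):
    """Route a click (already mapped to this window by the window
    manager) to the control under it, if any."""
    for widget in widgets:
        if widget.contains(x, y):
            widget.pressed = True     # reflex reaction: toolkit's job
            widget.on_click(widget)   # semantic event: application's job
            return widget
    return None   # e.g. the full-window-canvas case: app handles it itself
```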
Layer 4: Desktop shell and applications
The previous components were abstractions for developers, whereas the desktop shell is the abstraction for users. It is basically what gives the user control over graphical applications. Task switchers, application launchers, and global window controls belong to this component (which can actually be split into several processes in an implementation).
Applications also run on top of the widget toolkit and the window manager. The main difference between them and the desktop shell is one of security permissions: the desktop shell can do a few things which user software may only dream of, like keeping its windows on top of everything else, closing other user programs, or switching tasks…
An aside: Do we really need to change resolution at run time?
This is something I’ve been wondering for some time: could the graphics codebase be simplified by removing support for changing the screen resolution at run time, and replacing it with software-based resolution upscaling? The idea is that…
- Only CRT screens could, due to the way they work, cleanly display a lower resolution than their native one. Modern LCD- and OLED-based screens need to run an upscaling algorithm internally in order to display low-resolution graphics.
- The upscaling capabilities of most screens are, simply put, terrible compared to what can be done in software. Check the full-screen output of modern video players (VLC, Windows Media Player, Flash-based players…) when displaying a low-resolution video to see what modern upscaling looks like: blurry? Sure. A horrible-looking heap of pixelated visual data where nothing looks clean? Not so much. The same goes for modern video game emulators: they get pretty close to the output you’d get from a low-resolution CRT screen, and still run smoothly while doing so.
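For reference, the simplest software upscaler, nearest-neighbour, is only a few lines; the filtered upscaling that video players actually use (bilinear, bicubic…) costs more CPU but follows the same resampling pattern while trading blockiness for blur:

```python
def upscale_nearest(image, factor):
    """Nearest-neighbour upscaling: replicate every pixel `factor` times
    horizontally and every row `factor` times vertically. `image` is a
    list of rows of pixel values. This is the blocky baseline that any
    fancier (bilinear, bicubic...) filter improves upon."""
    out = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))  # independent row copies
    return out
```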
This prompted me to wonder why, exactly, people lower the resolution of their computer screens. As far as I can tell, the reasons are the following:
- Making stuff bigger when you have bad eyesight. This use case becomes obsolete with resolution-independent GUIs.
- Playing old games which can only run at a lower resolution. Well, TOSP won’t have “old games”, but if it did, they would certainly run fast enough on modern hardware to make the overhead of software upscaling bearable.
- Playing new games when your computer doesn’t have the hardware it takes to run them at full resolution.
The last one is a tough one. If the game already runs badly, you probably don’t want to slow it down further by adding an expensive software graphics operation to the mix. However, a heavy game in the modern sense is a game with heavy 3D graphics, which requires a GPU to run, so if the user can run the game at all, we can assume that a fully capable GPU driver is around. The question is: how expensive is good upscaling if we can hardware-accelerate it? Isn’t upscaling somewhat of a triviality for modern GPUs?
Well, I guess I won’t have my answer until I get to graphics implementation, which is a long way away, but I thought it was potentially an interesting track to follow.