I hate today’s mass storage devices, and you should hate them too. They are slow, short-lived, and ridiculously fragile. Sadly, we have to cope with this disappointing state of the art for long-term data storage. But is there a way we could design operating systems and applications to minimize the impact that these devices’ poor performance has on our lives, instead of constantly waiting for the blinking HDD light? Here are some ideas…
Reducing mass storage use
Green-minded people regularly say that the greatest, cleanest, and cheapest source of energy on Earth is energy savings, and they may have a point: when operating systems start to weigh tens of gigabytes for a relatively basic feature package, as in recent versions of Windows, micro-optimizing disk access is quite an inefficient way to go. Engineering effort would be better spent on building more durable infrastructure instead of replacing it every year, cleaning up the overall design, and eliminating redundancy. Also, the necessity of loading system component X or Y on every boot should always be questioned.
For more feature-packed operating systems such as Mac OS X, HDD use reduction may be achieved through modularization, making sure that only the relevant bits are loaded on each disk access. This typically involves cutting large frameworks into smaller parts, putting extra kernel functionality in independent modules, and applying the old UNIX philosophy of “Do one thing and do it well” to OS components. Care should be taken, however, to keep regularly used components (system menu, file explorer…) readily available, as will be discussed in the prefetching section.
Finally, caching should obviously be used to avoid loading the same stuff from disk over and over again, but modern operating systems do this pretty well already.
Using nonblocking reads and writes
Disk IO is extremely slow at the scale of program execution, so software should probably not be forced to either wait or spawn extra threads as data is read from or written to a mass storage medium. As gamers and workers alike can attest, there is nothing more boring in this world than the sluggish evolution of a progress bar as each file is loaded, which is why many modern video games opt to keep the player busy during loading times by using them for storytelling purposes. I admit, scaling this approach to work-oriented software would not necessarily be trivial, although it could perhaps put these annoying “tip of the day” dialogs to some good use in resource-intensive applications.
Blocking IO can also be counter-productive from a performance standpoint. As an example, when files are loaded from two different physical drives (typical use case: opening a document stored on an external drive), there is no reason whatsoever why software should have to access both drives sequentially instead of unleashing the power of modern computer buses. Even when only considering a single drive, the performance of modern mass storage media can be greatly improved by letting drivers read and write blocks of data in batches instead of one request at a time: there is really not much to lose by ordering pending disk access requests optimally while an earlier request is being processed.
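To make the request-reordering idea concrete, here is a minimal sketch of “elevator”-style scheduling: while one access is in flight, pending block requests are sorted so that the drive head sweeps in a single direction instead of seeking back and forth. The block numbers and head position are made up purely for illustration.

```python
def elevator_order(pending, head):
    """Order pending block requests elevator-style: first the blocks at or
    past the current head position (ascending), then the remaining blocks
    (ascending) on the next sweep."""
    ahead = sorted(b for b in pending if b >= head)
    behind = sorted(b for b in pending if b < head)
    return ahead + behind

# Serving requests in submission order would bounce the head around;
# the reordered batch visits them in two clean sweeps instead.
pending = [95, 10, 180, 42, 120]
print(elevator_order(pending, head=50))  # [95, 120, 180, 10, 42]
```

Real disk schedulers are of course far more sophisticated (deadlines, fairness between processes, rotational latency), but even this toy version shows why batching requests beats processing them strictly one at a time.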
Nonblocking IO is especially important in the case of writes, because unlike reads, which generally fetch data that is useful right now (more about prefetching in a moment), disk writes mostly save stuff for “later”, a time which is generally at least a few minutes away. I will not pretend that there are no problems with caching and delaying every disk write, though; in fact, the following issues would have to be handled:
- When writing data to an external drive, software should handle the possibility that the drive is unplugged. Not every user is fond of “safely remove” features, and there should be an easy way to know that data is being written to a drive without having to attempt to remove it (e.g. putting a “busy” overlay on the drive’s icon).
- Any data that is scheduled for writing but not yet written is vulnerable to crashes and power outages. For that reason, writes should always be committed to disk within a reasonable time frame (which remains to be determined).
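The second point can be sketched as a toy write-back buffer with a bounded commit delay: writes are cached and coalesced, but any entry older than a deadline is flushed at the next opportunity. The `flush` callback stands in for the actual driver, and all names and numbers here are illustrative, not a real OS API.

```python
import time

class WriteBackBuffer:
    """Toy write-back cache: delays writes, but bounds the data-loss window."""

    def __init__(self, flush, max_delay=5.0):
        self.flush = flush          # called with (path, data) on commit
        self.max_delay = max_delay  # upper bound on how long data may sit
        self.pending = {}           # path -> (data, time first queued)

    def write(self, path, data):
        # Overwriting a pending entry keeps the *oldest* timestamp, so a
        # frequently rewritten file still gets committed by the deadline.
        _, queued = self.pending.get(path, (None, time.monotonic()))
        self.pending[path] = (data, queued)

    def tick(self, now=None):
        """Periodic maintenance: flush every entry past its deadline."""
        now = time.monotonic() if now is None else now
        overdue = [p for p, (_, t) in self.pending.items()
                   if now - t >= self.max_delay]
        for path in overdue:
            data, _ = self.pending.pop(path)
            self.flush(path, data)

committed = []
buf = WriteBackBuffer(lambda p, d: committed.append((p, d)), max_delay=5.0)
buf.write("report.txt", b"v1")
buf.write("report.txt", b"v2")        # coalesced with the first write
buf.tick(now=time.monotonic() + 10)   # deadline passed: commit once
print(committed)                      # [('report.txt', b'v2')]
```

Note how the two writes to the same file collapse into a single disk access, which is precisely the payoff that justifies tolerating a small, bounded vulnerability window.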
Prefetching

As operating systems grow bigger and bigger, prefetching becomes an important issue. Basically, the idea is that in some circumstances, if causality is fully respected, programs cannot avoid waiting for IO. The typical situation where this happens is the point at the beginning of a process’ lifetime where all of its code and data are loaded into RAM. To spare users an irritating wait at this point, one strategy is to speculatively load frequently used disk files into RAM before they are actually needed, typically during OS boot or when the computer is idle.
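On POSIX systems, applications can already ask for this kind of speculative loading through the `posix_fadvise` hint: the kernel is told that a file will be needed soon, so it can start pulling its pages into the page cache in the background. Below is a minimal sketch; the hint is advisory only, and availability is Unix-specific.

```python
import os

def prefetch(path):
    """Advise the kernel to cache `path`; returns the file size in bytes."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if hasattr(os, "posix_fadvise"):  # not available on every platform
            # length 0 means "from offset to the end of the file"
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
        return size
    finally:
        os.close(fd)
```

An OS-level prefetcher would do essentially this, only driven by usage statistics rather than explicit application requests.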
While the principle is nice, a good prefetching implementation must respect the following criteria:
- Like any background processing task, prefetching should not disturb the operation of foreground mechanisms.
- In the case of boot-time prefetching, the total amount of fetched data should remain small, as the user is waiting in the meantime. To pick which files get the treatment, it may be better to use slowly-adapting learning algorithms than fixed, possibly incorrect assumptions.
- Even for run-time prefetching, one should aim at reducing the overall stress induced on mass storage media at run time, in order to increase their lifespan and reduce power consumption. Keeping disk drives permanently running on an idle system, successively loading all files from disk by decreasing order of usefulness, would not be a profitable endeavor.
In the end, what emerges would be a multilevel prefetching system in which a file can be in any of the following states:
- Loaded on boot because it is needed at that point (kernel, keyboard driver…)
- Prefetched before the user is given control of the computer (small cache, used to avoid the “Windows XP effect”, in which the user appears to be logged in before the computer is actually ready to run anything)
- Prefetched after that (larger cache, used to provide an optimal performance on a system that has been running for some time)
- Not prefetched (the majority of files)
The management of these various prefetching states should mostly be done programmatically, for example by gathering statistics on the average time after boot at which each file is needed, then filling caches whose maximal size is determined as a function of hardware parameters (battery power usage, disk speed, amount of RAM…) and may be changed by power users. The maths necessary to keep the prefetch caches up to date should be done as an unobtrusive scheduled background task.
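The cache-assignment step just described can be sketched as follows: given per-file statistics (average seconds after boot at which the file is first needed, and file size), assign each file to the boot-time cache, the post-login cache, or no cache at all, under per-cache size budgets. All names and numbers here are illustrative; a real system would derive the budgets from RAM size, disk speed, battery state, and so on.

```python
def assign_caches(stats, boot_budget, runtime_budget, login_time=30.0):
    """stats: {filename: (avg seconds after boot when needed, size)}.
    Returns (boot-cache files, runtime-cache files); everything else
    is simply not prefetched."""
    boot, runtime = [], []
    # Earliest-needed files first, so the most urgent ones win budget.
    for name, (avg_time, size) in sorted(stats.items(),
                                         key=lambda kv: kv[1][0]):
        if avg_time < login_time and size <= boot_budget:
            boot.append(name)
            boot_budget -= size
        elif size <= runtime_budget:
            runtime.append(name)
            runtime_budget -= size
    return boot, runtime

stats = {
    "shell.bin":    (5.0,  20),   # needed almost immediately after boot
    "explorer.bin": (12.0, 30),
    "editor.bin":   (90.0, 50),   # typically opened a while after login
    "game.dat":     (600.0, 500), # too big for the remaining budget
}
boot, runtime = assign_caches(stats, boot_budget=60, runtime_budget=100)
print(boot)     # ['shell.bin', 'explorer.bin']
print(runtime)  # ['editor.bin']
```

The greedy budget-filling here is deliberately simplistic; the point is that the whole policy reduces to cheap arithmetic over usage statistics, exactly the kind of maths a background task can refresh unobtrusively.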
I am well aware that we have reached a point where it would be unreasonable to ask users to run a static in-ROM operating system instead of struggling with the limitations of modern mass storage devices all the time. However, it is my hope that applying the tips presented in this article could significantly improve the performance of current operating systems when dealing with those devices. Thanks for reading!