Heap allocation is overrated

Hi everyone!

In case you are wondering, I’m not back working on TOSP just yet. My PhD has now entered its worst phase, with a dreadful last month of manuscript writing and an awfully ill-timed conference next week to make things worse. So I still cannot engage in a long-term personal programming project.

However, I have a little spare time left to code stuff I like at home. So I decided to practice my Ada a bit by translating some of Numerical Recipes’ code snippets into Ada 2012, using my own coding style and a different name to avoid trademark lawsuits. I like it: it’s fun and full of bite-sized programming tasks that I can easily start and stop whenever I like. And at the same time, it teaches a lot of enlightening stuff about how seemingly mundane decisions in programming language design, such as array bounds, can profoundly affect day-to-day coding in that language.

Which leads me to today’s thoughts on heap memory allocation and why I think we’re really overusing it these days.

Believing the heap

Like too many programmers, I was never properly introduced to the notion of the heap as part of my programming language studies. As I studied C, someone told me “to allocate an array of statically unknown bounds, use malloc”, and that was it. As I studied C++, someone told me “to make a linked list node, use new”, and that was it. As I studied Java, someone told me “to create pretty much any kind of object, use new”, and that was it. And so on.

It was only when I went into OS development and saw how things worked under the hood that I began to build a more solid understanding of what the heap was and which OS mechanisms it was built upon. I learned about heap fragmentation, and about the difficulty of building heap allocation algorithms that scale well with program demand. But even then, I was still pretty convinced that frequent heap allocation was the right thing to do, except for those corner cases where a “mere” stack variable would do.

I knew that the heap had its problems, from memory leaks to double frees. I knew that garbage collection was, at best, a very poor hack, one which sacrificed a lot of performance and system resources in order to make the heap’s shortcomings easier to cope with. But to me, the heap was like pointers: a necessary evil, present in every corner of the programming universe, although it might be hidden by programming languages in a way which ultimately made things worse (as in Java).

And then I learned Fortran and Ada, and I realized that I needed far fewer heap allocations than before. I built a complete numerical simulation in Fortran 95 before I needed a single heap allocation/deallocation pair, right at the end, to manage a minor feature. And then, as I improved my knowledge of Ada using John Barnes’ wondrous (if expensive) teachings, I figured out that in a more flexible language, I wouldn’t even have needed that one heap allocation.

This got me thinking.

Then, exploring more advanced C++ books, I ended up on discussions of the wonders of RAII, which, when applied to memory management, is essentially a way to make heap allocation look like stack allocation. And this got me thinking some more.
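As a minimal C++ sketch of what I mean (the function below is mine, purely for illustration): a std::vector owns a heap-allocated buffer, yet from the caller’s point of view it behaves just like a stack variable, because its destructor releases that buffer at scope exit.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

double average_of_ones(std::size_t n) {
    // The vector's buffer is allocated on the heap here...
    std::vector<double> samples(n, 1.0);
    double sum = 0.0;
    for (double s : samples) {
        sum += s;
    }
    return sum / static_cast<double>(n);
}   // ...and freed automatically here, with no manual delete in sight

int main() {
    std::printf("%f\n", average_of_ones(1000));
    return 0;
}
```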

At this point, I have reached the conclusion that heap-based memory allocation is in general harmful to software quality, aside from those few edge cases where it is strictly necessary and cannot be helped. I will now elaborate on why I think so, what those edge cases are, and the reasons I think we overuse the heap anyway.

The dark side of the heap

So, why would we want to avoid using the heap when it’s so convenient? Here we go:

  • Good heap allocators are very hard to write, and are usually only good for specific tasks. Inadequate allocators will badly hurt program performance (through algorithmic overhead and poor cache locality) and waste system resources (through memory fragmentation) when used frequently. In contrast, stacks can easily be made to have good cache locality and cannot exhibit fragmentation.
  • Memory that has been allocated on the heap has no well-defined scope and must be freed once and only once. Freeing many heap objects at the same time may come with significant and unexpected overhead when the heap allocator is hosted in user processes, which is why garbage collection is not a very good idea in programs operating under real-time constraints (such as, say, any user interface, which should respond to user input in less than 10 ms).
  • After heap-allocated memory has been freed, it must never be accessed again. Yet most heap implementations (including Ada’s) strongly encourage the use of pointers and aliased data, which makes this rule very difficult to follow perfectly in any mildly complex program. Some languages, such as C and C++, do not even guarantee that programs will crash or exhibit any other kind of well-defined behavior if they try to access deallocated memory, with all that entails in terms of security exploits (a deliberately broken sketch follows this list).
  • Heap-allocated data has global scope, and as a close cousin of global variables it must thus be used with caution to avoid falling into many bad coding practices. And as with global variables, every heap-allocated block of memory is one more potential way to stab yourself in the back when you need to make your program parallel later on.
  • Long-lived heap data, as encouraged by the global scope of said data, relied upon by garbage collection as part of its basic design, and accidentally generated by memory leaks, is basically lost RAM for the rest of the system. For those who think that wasting RAM doesn’t matter anymore, consider how, these days, even computers with 4 GB of RAM will run out of memory from time to time due to memory leaks and overuse of garbage collection, and ponder that there used to be a time when a few kB of RAM were considered huge.
  • Basically, the only relatively safe and efficient way to use heap-allocated memory is to delegate the task to classes or function libraries that are specifically designed for this purpose, are the only ones touching the heap-allocated data throughout its lifecycle, and make it look like a stack variable.
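To make the use-after-free bullet above concrete, here is a deliberately broken C++ sketch. Nothing in the language guarantees that it will crash rather than quietly print garbage or corrupt unrelated data:

```cpp
#include <cstdio>

int main() {
    int* p = new int(42);
    int* alias = p;                // a second pointer to the same heap object
    delete p;                      // the memory is freed exactly once, as required...
    std::printf("%d\n", *alias);   // ...yet this read is undefined behavior
    return 0;
}
```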

So why, then, do we bother dealing with heap-allocated memory? I believe the reason is twofold. The first part is that even in a perfect world, there are a couple of genuine use cases for heap allocation. The second is that our development environments can sometimes make the task of avoiding heap allocation excruciatingly hard.

Heap allocation done right

A first valid use case for heap allocation is to support unbounded, dynamically growable data structures. That can still be done on a stack using caller-allocated memory blocks and appropriate discipline, but it is profoundly clunky, ultimately more so than a well-designed heap-based class.
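For instance, here is a trivial C++ sketch of such a well-designed heap-based class in action: std::vector hides all of the reallocation discipline that a stack-based equivalent would force onto every caller:

```cpp
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> primes;   // unbounded growth, final size unknown upfront
    for (int n = 2; primes.size() < 10; ++n) {
        bool is_prime = true;
        for (int p : primes) {
            if (n % p == 0) { is_prime = false; break; }
        }
        if (is_prime) {
            primes.push_back(n);   // the vector reallocates on the heap as needed
        }
    }
    std::printf("The 10th prime is %d\n", primes.back());
    return 0;
}
```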

It is worth noting that this kind of structure is needed less frequently than it might seem at first sight:

  • The ability to allocate arbitrarily-sized blocks of data on the stack at runtime, as provided by Ada and C99 (through variable-length arrays), for example, covers many use cases of dynamically allocated structures when dynamic growing and shrinking isn’t actually needed.
  • Bounded containers that can grow *up to* a certain size, as implemented by Ada’s standard library, can be allocated on the stack. They fit the bill for many real-world use cases where a maximal size can be specified, even if they are somewhat wasteful when that size is chosen too conservatively (a sketch of the idea follows this list).
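Ada’s bounded containers provide this out of the box; as a rough C++ sketch of the same idea (this BoundedVec class is purely illustrative, not a standard component), one can wrap a fixed-capacity std::array and allow growth only up to the chosen bound, with no heap allocation anywhere:

```cpp
#include <array>
#include <cstddef>
#include <cstdio>
#include <stdexcept>

// A bounded container: capacity fixed at compile time, storage inline,
// so a local BoundedVec lives entirely on the stack.
template <typename T, std::size_t Capacity>
class BoundedVec {
    std::array<T, Capacity> storage_{};
    std::size_t size_ = 0;
public:
    void push_back(const T& value) {
        if (size_ == Capacity) {
            throw std::length_error("BoundedVec capacity exceeded");
        }
        storage_[size_++] = value;
    }
    T& operator[](std::size_t i) { return storage_[i]; }
    std::size_t size() const { return size_; }
};

int main() {
    BoundedVec<int, 128> v;    // no heap allocation anywhere
    for (int i = 0; i < 10; ++i) {
        v.push_back(i * i);
    }
    std::printf("%d\n", v[9]);   // prints 81
    return 0;
}
```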

A second valid use case for heap allocation is sharing data between unrelated software processes. Sharing data in a controlled way entails allocating it in specific pages of RAM that aren’t mixed with other, process-private data, which is a perfect job for the heap and a poor job for the stack. Besides, the limited scope of stack variables would prove problematic if applied to shared data, whereas it is trivial to reference-count shared heap pages and wait until all processes have freed them before actually reclaiming their memory.
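As a hedged sketch of what this looks like on a POSIX system (the segment name “/demo_shm” and the one-page size are arbitrary choices for illustration): the shared buffer lives in dedicated pages that any unrelated process can map by opening the same name, and the kernel keeps those pages alive until the last process has unmapped them.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>

int main() {
    // Create (or open) a named shared-memory object and size it to one page.
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { std::perror("shm_open"); return 1; }
    if (ftruncate(fd, 4096) != 0) { std::perror("ftruncate"); return 1; }

    // Map the shared pages; other processes mapping "/demo_shm" see the
    // very same memory, not a copy.
    void* page = mmap(nullptr, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (page == MAP_FAILED) { std::perror("mmap"); return 1; }

    static_cast<char*>(page)[0] = 42;   // immediately visible to other processes

    munmap(page, 4096);
    close(fd);
    shm_unlink("/demo_shm");   // pages vanish once every process has unmapped them
    return 0;
}
```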

A third valid use case for heap allocation is implementing large caches, where some memory is allocated by the application but marked as non-vital and reclaimable by the operating system. Although such objects are somewhat hard to manage because code must always remember to check for their availability, they can be vital to the correct implementation of things like filesystems and databases, where it is critical to use the available RAM to improve system performance, but equally critical not to hog all of the system’s RAM with stuff that can be readily reloaded from mass storage or the network as needed.
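On Linux, one hedged sketch of this technique (assuming a kernel recent enough to support MADV_FREE) is to mmap an anonymous region and advise the kernel that it may discard those pages under memory pressure; a real cache would also need the bookkeeping, elided here, that checks whether the data survived before reusing it.

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <cstdio>

int main() {
    const std::size_t len = 1 << 20;  // a 1 MiB cache region
    void* cache = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (cache == MAP_FAILED) { std::perror("mmap"); return 1; }

    // ... fill the region with data that can be recomputed or reloaded if lost ...

    // Under memory pressure the kernel may now drop these pages; later
    // reads would then see zero-filled pages instead of the cached data.
    if (madvise(cache, len, MADV_FREE) != 0) {
        std::perror("madvise");  // e.g. on kernels that predate MADV_FREE
    }

    munmap(cache, len);
    return 0;
}
```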

Finally, one last, common, but highly dubious use case for heap allocation is to cope with a programming language or operating system that makes allocating data on the stack needlessly difficult. This will be the topic of our last part.

Stack implementation headaches

First, in some languages such as legacy C (read: the latest version of C that many commercial development environments support properly), it is impossible to allocate data on the stack if the amount to be allocated isn’t known at compile time. This is a totally artificial restriction (allocating a variable on the stack is as simple as moving the stack pointer) which boils down to an oversight from programming language designers.
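C99 later lifted this restriction with variable-length arrays; from C++, which never adopted them, the closest widely available escape hatch is the non-standard alloca, sketched here with the usual caveats (no failure reporting, and the storage vanishes when the function returns):

```cpp
#include <alloca.h>   // non-standard, but provided by glibc and most Unix systems
#include <cstddef>
#include <cstdio>

double sum_of_ones(std::size_t n) {
    // Runtime-sized stack allocation: literally just a stack-pointer bump.
    double* buf = static_cast<double*>(alloca(n * sizeof(double)));
    for (std::size_t i = 0; i < n; ++i) buf[i] = 1.0;

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += buf[i];
    return sum;
}   // buf's storage disappears here, together with the stack frame

int main() {
    std::printf("%f\n", sum_of_ones(1000));
    return 0;
}
```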

In other programming languages, such as Java or Python, developers have little to no control over memory management, as that’s considered too arcane a topic for mere mortals to witness. So they must trust their implementation to do the right thing. And often, when it comes to stack vs. heap allocation, the implementation will do the wrong thing.

Finally, many OSs do not support dynamically growing program stacks, and will just let said stacks grow up to a certain size before silently crashing the process. This prevents using the stack to store large objects, which is sad because the OS can always easily reclaim the space of a grown stack later, once a program doesn’t need it anymore, whereas a leaked large heap object is lost RAM forever.

But frankly, on any modern computer platform supporting paged virtual memory, there is absolutely no excuse for this, because contiguous virtual memory pages do not need to correspond to contiguous physical memory pages. And with 64-bit addressing and a sensible process virtual memory layout, there is no way the growing stack would encounter anything in its way. Only OSs running on less gifted computing platforms might not want to offer such a feature, and we modern PC users shouldn’t need to cope with inferior stack management practices because some legacy or embedded hardware doesn’t support the best way to do things.

Conclusion

In summary, I would say that after spending some quality time with programming languages that idiomatically allocate data on the stack rather than on the heap, I’ve reached the conclusion that we programmers are really overusing the heap, and that we do so mostly because we’ve been taught that way, without a good explanation of what the heap is and what tradeoffs its use involves.

The heap has many problems. It is fiendish to implement and, if badly implemented, harmful to performance and to the efficient use of system resources. It is horrible for developers to manage, and offloading its management to garbage collection and “manager” classes just opens the door to a whole new world of problems. Memory leaks, double frees and use-after-free bugs are serious issues, and programming languages with ill-defined memory deallocation semantics like C and C++ make them infinitely worse. And finally, perhaps most importantly, having to implement tons of abstractions that do nothing more than manage heap objects is no fun, in addition to being error-prone.

Granted, the heap has its use cases, such as indefinitely growable data structures and “special” forms of memory allocation like shared memory and caching. But my impression is that most of the time, the main reason why we programmers use it is either that we don’t know better, or that our programming languages, implementations and OSs make using the stack needlessly cumbersome. I think a good OS should make it easy for programmers to allocate as much data on the stack as they need to, instead of being a barrier in their way.
