Well, I just had fun gathering some lines-of-code statistics about the project. These are very raw stats: the number of lines in each file, blank lines and comments included, although some people prefer to strip those out.
Anyway, the total number of lines of code in the tree is 14557, spread across 86 files. That gives a mean of 169 LOC/file, with a maximum of 2679 LOC and a minimum of 19 LOC. The root mean square is 342 LOC.
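For the curious, here is a minimal Python sketch of how such summary figures can be computed from a list of per-file line counts. It assumes "root mean square" means the square root of the mean of the squared counts (not the standard deviation). The function name `loc_summary` and the sample values are made up for illustration; they are not the project's real per-file data.

```python
import math

def loc_summary(line_counts):
    """Return basic statistics for a list of per-file line counts."""
    n = len(line_counts)
    total = sum(line_counts)
    return {
        "files": n,
        "total": total,
        "mean": total / n,
        "max": max(line_counts),
        "min": min(line_counts),
        # Assumption: RMS = sqrt(mean of squares), not the standard deviation.
        "rms": math.sqrt(sum(c * c for c in line_counts) / n),
    }

# Made-up sample, NOT the project's real per-file counts.
sample = [19, 120, 340, 2679]
for name, value in loc_summary(sample).items():
    print(f"{name}: {value}")
```

A few very large files pull the RMS well above the mean, which is consistent with the figures above, where a single 2679-line file sits among 86 files averaging 169 lines.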
It’s interesting to see how the code spreads across the various components. Let’s first set aside the code written by third parties, then split the rest among the components it belongs to.
- Code originating from other projects: 2826 LOC
- Bootstrap code: 3582 LOC
- Kernel debug code: 1553 LOC
- Testing code: 1468 LOC
- Rest of kernel code: 5128 LOC
When sorting by file type, we get…
- C++ files: 6026 LOC
- Headers: 5598 LOC
- C files: 2699 LOC
- Assembly: 234 LOC
And if we want the 10 heaviest source files:
- include/elf.h: 2679 LOC – Full description of the ELF standard (from the Linux kernel source)
- memory/kmem_allocator.cpp: 1322 LOC – Memory allocator
- arch/x86_64/debug/dbgstream.cpp: 912 LOC – Debug stream and other debug output facilities
- arch/x86_64/memory/physmem.cpp: 776 LOC – Physical memory management
- arch/x86_64/bootstrap/lib/kinfo_handling.c: 760 LOC – Retrieves various information for the kernel
- arch/x86_64/memory/virtmem.cpp: 578 LOC – Virtual memory management
- arch/x86_64/bootstrap/lib/txt_videomem.c: 488 LOC – The C equivalent of dbgstream.cpp
- arch/x86_64/bootstrap/lib/paging.c: 340 LOC – Sets up early paging before the kernel starts
- arch/x86_64/tests/memory/phymem_test_arch.cpp: 280 LOC – Physical memory management testing
- tests/memory/phymem_test.cpp: 273 LOC – Physical MM testing (the part which is guaranteed to be arch-independent)
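Breakdowns like the ones above (raw lines per file, totals per file type, heaviest files) can be produced with a few lines of scripting. The following Python sketch is only one way to do it and is not the tool that produced these numbers. The helper names `count_tree`, `group_by_extension` and `heaviest` are hypothetical, and the set of extensions to scan is left as a parameter because the assembly file extension used in the tree is not stated here.

```python
from collections import defaultdict
from pathlib import Path

def count_tree(root, extensions):
    """Map each file under root with a matching extension to its raw line count."""
    counts = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in extensions:
            with path.open("rb") as f:
                # Raw count: blank lines and comments are included.
                counts[path.relative_to(root).as_posix()] = sum(1 for _ in f)
    return counts

def group_by_extension(counts):
    """Sum line counts per file extension."""
    totals = defaultdict(int)
    for name, lines in counts.items():
        totals[Path(name).suffix] += lines
    return dict(totals)

def heaviest(counts, n=10):
    """Return the n (file, lines) pairs with the most lines, largest first."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]

# Three entries taken from the list above, standing in for a real scan.
known = {
    "include/elf.h": 2679,
    "memory/kmem_allocator.cpp": 1322,
    "arch/x86_64/bootstrap/lib/paging.c": 340,
}
print(group_by_extension(known))
print(heaviest(known, n=2))
```

On a real checkout, something like `count_tree(".", {".cpp", ".h", ".c"})` would replace the hand-written `known` dictionary.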
I think it’s safe to say that I’ll never match Tanenbaum’s achievement of a modern kernel that weighs in at around 4000 lines of executable code, even taking into account the fact that this leaves out all C and H files… I still wonder how he did that, though.
EDIT: New statistics, about object and binary size this time ;)
Final binary size is 82KB for the kernel and 37KB for the bootstrap part. Considering that the combined size of all intermediate object files is 306KB (spread across 40 files; mean 7.7KB, max 38.8KB, min 0.7KB, RMS 8.7KB), we can safely say that the linker did its job very well: the two binaries together come to 119KB, about 39% of that total.
Separating object files into the same categories as before…
- Bootstrap code: 54KB
- Kernel debug code: 60KB
- Testing code: 75KB
- Rest of kernel code: 117KB
When sorting by source file type…
- C++: 250KB
- C: 51KB
- Assembly: 4.8KB
And if we want the 10 heaviest object files:
- arch/x86_64/debug/dbgstream.knlcpp.o: 39KB
- memory/kmem_allocator.knlcpp.o: 34KB
- arch/x86_64/memory/physmem.knlcpp.o: 26KB
- arch/x86_64/memory/virtmem.knlcpp.o: 19KB
- arch/x86_64/tests/memory/phymem_test_arch.knlcpp.o: 16KB
- tests/memory/phymem_test.knlcpp.o: 15KB
- arch/x86_64/debug/display_paging.knlcpp.o: 14KB
- tests/memory/malloc_test.knlcpp.o: 13KB
- arch/x86_64/bootstrap/lib/kinfo_handling.bsc.o: 13KB
- tests/memory/virmem_test.knlcpp.o: 11KB
Interestingly enough, object files generated from C++ code seem much heavier than their C counterparts, so this time most of the weight comes from the microkernel rather than from the bootstrap code. Another interesting result is the size of the object code produced from the assembly files: 4.8KB for only 234 lines of source.
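To put rough numbers on that comparison, the per-type figures from this post can be turned into bytes of object code per source line. This is only a back-of-the-envelope sketch: it assumes 1KB = 1024 bytes, and it ignores the 5598 lines of headers, which get included into both C and C++ sources and cannot be attributed to either from the totals alone.

```python
KB = 1024  # assumption: the sizes above use 1KB = 1024 bytes

# Figures taken from the lists in this post.
source_loc = {"C++": 6026, "C": 2699, "Assembly": 234}
object_kb = {"C++": 250, "C": 51, "Assembly": 4.8}

def bytes_per_line(kind):
    """Object code size in bytes divided by raw source line count."""
    return object_kb[kind] * KB / source_loc[kind]

for kind in source_loc:
    print(f"{kind}: {bytes_per_line(kind):.1f} bytes of object code per source line")

# Share of the intermediate object files that ends up in the linked binaries.
linked_ratio = (82 + 37) / 306
print(f"linked binaries / object files: {linked_ratio:.0%}")
```

With these inputs, C++ comes out at a bit over 42 bytes per line, against roughly 19 for C and 21 for assembly, so the C++ objects are about twice as dense per source line.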