I'll try a detailed "for dummies" explanation - not because I think you are a dummy, but because I faced the same confusion initially.
A binary basically consists of three parts:
* text - executable code.
* data - preinitialized variables.
* bss - reserved space for variables that start out as zero. The loader sets it aside in RAM, but it takes up no space in the binary since it's always initialized to 0.
Consider this pseudo-code:
int a = 42;
// doing something here
The code for doing something would go to the text section; the variable a, together with its initial value 42, would go to the data section.
Explaining bss is a bit tricky, so it's best to use an example: In my kernel image (which is a Multiboot-compliant ELF), I have a 4k-sized bss section. Those 4k don't show up in the binary, but the GRUB loader reserves those 4k when loading the image into RAM. I use them for an early kernel stack, until I have my own memory map / management set up.
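To make the three sections a bit more concrete, here is a minimal C sketch. The names initialized_counter, early_stack and count_nonzero_stack_bytes are made up for illustration, and the section placement noted in the comments is what a typical GCC / ELF toolchain does by default, not something the C language guarantees.

```c
#include <stddef.h>

/* Non-zero initializer: the value 42 is stored in the binary (.data). */
int initialized_counter = 42;

/* No initializer: typically placed in .bss. It takes no space in the
 * binary; the loader reserves 4 KiB of RAM that starts out as zero. */
static unsigned char early_stack[4096];

/* The machine code of this function lands in .text.
 * It counts how many bytes of the reserved area are not zero. */
size_t count_nonzero_stack_bytes(void)
{
    size_t nonzero = 0;
    for (size_t i = 0; i < sizeof early_stack; i++) {
        if (early_stack[i] != 0)
            nonzero++;
    }
    return nonzero;
}
```

Running objdump -h on the resulting .o file should show the 4096 bytes counted in the size of .bss, while the file itself barely grows.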
There are usually more sections in a compiled binary, but those three will suffice for the early stages. Use objdump or comparable tools to pry apart your binaries to check what else is in there.
Now, let's say you have a couple of .o files, from assembling ASM code and compiling C code. Each of those .o files has sections. The linker's job is to merge those sections together so they form an executable.
Usually, the linker has a default script that tells it how to do this, but as always, kernel space is different, so we write our own script.
"The following will define what pieces from the input files will constitute the sections of the linked binary."
"First, we'll set up a text (code) section, assuming that it will be loaded to address 0x100000."
(The address is important so the linker can relocate any references in the .o files.)
"Define these three symbols in the symbol table, each of them holding the current address (0x100000)."
We might use them later. In my code, I define a symbol at the beginning and the end of my bss section, so that my assembly code can refer to those symbols when setting up the stack. Another use would be to define symbols for the beginning of the .ctors and .dtors sections so you can call global constructors / destructors from your code - but that's C++ specific.
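Here is a rough sketch of how C code might consume such boundary symbols. The symbol names bss_start and bss_end in the comment are hypothetical (use whatever your own script defines), and the helpers take plain pointers so they can be tried out without a custom linker script.

```c
#include <stddef.h>

/*
 * In a kernel, the boundaries would come from the linker script, e.g.
 *     extern char bss_start[], bss_end[];   (hypothetical names)
 * Such symbols have no storage of their own; only their addresses
 * are meaningful.
 */

/* Number of bytes between two boundary addresses. */
size_t region_size(const char *start, const char *end)
{
    return (size_t)(end - start);
}

/* Zero every byte in the half-open range [start, end). */
void clear_region(char *start, char *end)
{
    for (char *p = start; p < end; p++)
        *p = 0;
}
```

With the hypothetical symbols above, clear_region(bss_start, bss_end) would wipe the whole bss area, and bss_end would be the initial stack pointer for a stack that grows downwards.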
"Insert, starting at the current address (0x100000), any .text sections from any input files."
"Pad the text / code section to the next 4k (page) boundary."
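The padding step is just rounding the current address up to the next multiple of the page size. In a GNU ld script this is normally done with the ALIGN builtin; the C sketch below only illustrates the arithmetic. PAGE_SIZE and align_up are names I made up for the example, and the trick assumes the alignment is a power of two.

```c
#include <stdint.h>

#define PAGE_SIZE 0x1000u  /* 4 KiB */

/* Round addr up to the next multiple of alignment.
 * An address that is already aligned is returned unchanged.
 * Assumption: alignment is a power of two. */
uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    return (addr + alignment - 1) & ~(alignment - 1);
}
```

So if the text section ends at 0x100001, the next section would start at align_up(0x100001, PAGE_SIZE), which is 0x101000.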
From here on, the rest of the script should be self-explanatory.
For more options, check "info ld".