linker script info

Post by one »

SECTIONS
{
    .text 0x100000 : {
        code = .; _code = .; __code = .;
        *(.text)
        . = ALIGN(4096);
    }
    .data : {
        data = .; _data = .; __data = .;
        *(.data)
        . = ALIGN(4096);
    }
    .bss : {
        bss = .; _bss = .; __bss = .;
        *(.bss)
        . = ALIGN(4096);
    }
    end = .; _end = .; __end = .;
}

Just wanted to know if any of you have the time to explain what all this actually does and means. Any help appreciated.

RE:

Post by nullify »

The linker script allows you to organize how your output is generated. Within the SECTIONS { } block, you see the three primary sections of your kernel binary - text, data, and bss. In the .text { } block, we have:
The One wrote: code = .; _code = .; __code = .;
*(.text)
. = ALIGN(4096);
The first line defines the symbols code, _code, and __code, each holding the address of the beginning of the text section. The next line tells the linker to place the .text sections of all input files at that position. The third line advances the current position to the next page boundary, so that the following section (the data section, in this case) starts page aligned.

The data and bss blocks are similar. At the end of this linker script, some symbols are exported to indicate the end of the kernel memory.

You can access these exported symbols in your kernel, which may come in handy during memory management.
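As a rough sketch of how such symbols might be used from C: a linker-defined symbol has an address but no storage of its own, so it is usually declared as an array and only its address is used. The helper names and the KERNEL_BUILD macro below are made up for illustration; only code and end come from the script above.

```c
#include <stddef.h>

/* Bytes between two addresses; stop must not lie below start. */
static size_t span_bytes(const char *start, const char *stop)
{
    return (size_t)(stop - start);
}

/* Number of 4 KiB pages needed to cover that span, rounding up. */
static size_t span_pages(const char *start, const char *stop)
{
    return (span_bytes(start, stop) + 4095u) / 4096u;
}

#ifdef KERNEL_BUILD /* hypothetical macro, set only when linking with the script */
/* Declared as arrays so that the bare name evaluates to the address. */
extern char code[], end[];

size_t kernel_image_bytes(void) { return span_bytes(code, end); }
size_t kernel_image_pages(void) { return span_pages(code, end); }
#endif
```

A physical memory manager could, for example, mark kernel_image_pages() pages starting at code as used before handing out anything else.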

The address 0x100000 prior to the text block informs the linker that the text section should start at that address.

Re:linker script info

Post by Solar »

I'll try a detailed "for dummies" explanation - not because I think you are a dummy, but because I faced the same confusion initially. ;-)

A binary basically consists of three parts:

* text - executable code.
* data - preinitialized variables.
* bss - a reserved space that will be assigned in RAM by the loader, but is not present in the binary since it's always initialized to 0.

Consider this pseudo-code:

Code: Select all

int a = 42;
// doing something here
Assuming a is a global variable, the code for doing something would go to the text section, and the 42 would go to the data section.

Explaining bss is a bit tricky, so it's best to use an example: In my kernel image (which is a multiboot compliant ELF), I have a 4k sized bss section. Those 4k don't show up in the binary, but the GRUB loader reserves those 4k when loading the image into RAM. I use them for an early kernel stack, until I have my own memory map / management set up.

There are usually more sections in a compiled binary, but those three will suffice for the early stages. Use objdump or comparable tools to pry apart your binaries to check what else is in there.

Now, let's say you have a couple of .o files, from assembling ASM code and compiling C code. Each of those .o files has sections. The linker's job is to merge those sections together so they form an executable.

Usually, the linker has a default script on how to do it, but as always, kernel space is different, so we write our own script.

Code: Select all

SECTIONS
{
"The following will define what pieces from the input files will constitute the sections of the linked binary."

Code: Select all

  .text  0x100000 : {
"First, we'll set up a text (code) section, assuming that it will be loaded to address 0x100000."

(The address is important so the linker can relocate any references in the .o files.)

Code: Select all

    code = .; _code = .; __code = .;
"Define these three symbols in the symbol table, each of them holding the current address (0x100000)."

We might use them later. In my code, I define a symbol at the beginning and the end of my bss section, so that my assembly code can refer to those symbols when setting up the stack. Another use would be to define symbols for the beginning of the .ctors and .dtors section so you can call global constructors / destructors from your code - but that's C++ specific.
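To make the constructor idea a bit more concrete, here is a minimal sketch of walking a table of function pointers that lies between two addresses. The symbol names in the comment are hypothetical, and both the forward call order and the skipping of empty slots are assumptions: the real layout and order of such a table depend on your toolchain.

```c
#include <stddef.h>

typedef void (*ctor_fn)(void);

/* Call every non-null function pointer in [first, last), front to back.
 * Returns how many functions were actually called. */
static size_t run_table(ctor_fn *first, ctor_fn *last)
{
    size_t called = 0;
    for (ctor_fn *fn = first; fn != last; ++fn) {
        if (*fn != NULL) {
            (*fn)();
            ++called;
        }
    }
    return called;
}

/* Stand-in constructor so the walker can be tried outside a kernel. */
static int demo_calls;
static void demo_ctor(void) { ++demo_calls; }

/* In a kernel, the bounds would come from symbols defined in the linker
 * script around the constructor section, for example (hypothetical names):
 *     extern ctor_fn ctors_start[], ctors_end[];
 *     run_table(ctors_start, ctors_end);
 */
```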

Code: Select all

    *(.text)
"Insert, starting at the current address (0x100000), any .text sections from any input files."

Code: Select all

    . = ALIGN(4096);
  }
"Pad the text / code section to the next 4k (page) boundary."

From here on, the rest of the script should be self-explanatory.
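The ALIGN(4096) lines that recur throughout the script round the current address up to the next multiple of 4096. The same arithmetic tends to come up again in C once you start handing out pages, so here is a small sketch of it. The function name is my own, and it assumes the alignment is a power of two.

```c
#include <stdint.h>

/* Round addr up to the next multiple of alignment.
 * alignment must be a power of two (4096 for x86 pages).
 * An address that is already aligned is returned unchanged. */
static uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    return (addr + alignment - 1) & ~(alignment - 1);
}
```

So a text section that ends one byte past 0x100000 pushes the data section to 0x101000, while one that ends exactly on a page boundary adds no padding.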

For more options, check "info ld".
Every good solution is obvious once you've found it.

Re:linker script info

Post by Pype.Clicker »

In a nutshell,

Code: Select all

    int x=4;
puts x in the .data section, while

Code: Select all

    int x;
puts it in the .bss section.

Of course, this only applies to global (or static local) variables, not to so-called automatic variables, which are allocated in the stack frame of the running function.

Finally,

Code: Select all

char *str="Hello";
puts str in the .data section, while the string itself (i.e. the bytes 'H' 'e' 'l' 'l' 'o' '\0') goes to .rodata, .text, .rodata.str1.1 (or something similar), depending on your compiler and binary format.
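Putting those cases side by side in one file may help. The section names in the comments are what a typical GCC / ELF toolchain produces and are not guaranteed; the variable and function names are just for illustration. You can check your own build with objdump.

```c
int initialized = 4;      /* .data: the value 4 is stored in the binary */
int uninitialized;        /* .bss (or COMMON when built with -fcommon) */
const int constant = 7;   /* typically .rodata */
char *str = "Hello";      /* the pointer is in .data, the bytes are read-only */

int next_id(void)
{
    static int last_id;   /* static local: lives in .bss, keeps its value */
    int step = 1;         /* automatic: stack or register, in no section */
    last_id += step;
    return last_id;
}
```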

Re:linker script info

Post by Solar »

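Erm... a caveat on what I said about .bss being initialized to 0. Inside a function,

Code: Select all

int x;
leaves the value of x undefined, but such automatic variables never end up in .bss. At file scope the same line does land in .bss, and C promises that x starts out as 0. In a kernel, that promise only holds because something clears the memory before your code runs: a Multiboot loader such as GRUB is expected to do so when it loads an ELF image, otherwise your own startup code has to.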
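If the kernel should not depend on whoever loaded it to clear .bss, early startup code can do so itself using the bss and end symbols from the script. This is only a sketch: the function name is my own, and it must not be called while the current stack lives inside the range being cleared (which would be the case if you put your early stack into .bss).

```c
/* Set every byte in [start, stop) to zero.
 * The volatile pointer keeps the compiler from replacing the loop with a
 * call to memset, which a freestanding kernel may not provide. */
static void clear_range(char *start, char *stop)
{
    for (volatile char *p = start; p != stop; ++p) {
        *p = 0;
    }
}

/* In the kernel, with the symbols from the linker script:
 *     extern char bss[], end[];
 *     clear_range(bss, end);
 */
```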