OS Specific Toolchain noob question

Posted: Sat May 12, 2012 8:36 pm
by justin
I am following the http://wiki.osdev.org/OS_Specific_Toolchain tutorial. I have built binutils and gcc, and now I'm trying to compile some user-mode programs to run on my OS. Before I built binutils, I set

TEXT_START_ADDR=0x401000
as was described in the tutorial. However, readelf shows that 0x401094 is the start of the .text section in an executable produced by my new toolchain. What accounts for this discrepancy?

BTW, I haven't built newlib yet and right now I'm manually compiling and linking in some startup code (which is being placed at 0x401094).

Re: OS Specific Toolchain noob question

Posted: Sun May 13, 2012 12:54 am
by xenos
Have you checked what is in the 0x94 bytes right before the start of your .text section?

Re: OS Specific Toolchain noob question

Posted: Sun May 13, 2012 3:34 am
by jnc100
From ld/scripttempl/elf.sc:
TEXT_START_ADDR - the first byte of the text segment, after any headers.

The first bytes of the output file are the ELF headers, followed by the first section (which the ld ELF script defines to be .text). In your case, that puts the text section at offset 0x94 in the file. If the whole file is loaded aligned to a physical page (let's say 0x1000), you can map the text section to the right place by mapping physical 0x1000 -> virtual 0x401000 (and so on until the end of the text section). Offset 0x94 in the file then ends up at 0x401094, as expected, and all absolute addresses in the code work as intended.
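As a rough check of where 0x94 could come from: an ELF32 file header is 52 bytes and each ELF32 program header is 32 bytes, so three program headers would give exactly 0x94. The count of three is an assumption (the thread never states it), and the helper names are made up for illustration.

```python
ELF32_EHDR_SIZE = 52  # size of the ELF32 file header
ELF32_PHDR_SIZE = 32  # size of one ELF32 program header entry


def headers_size(num_phdrs):
    """Bytes taken by the ELF32 header plus `num_phdrs` program headers."""
    return ELF32_EHDR_SIZE + num_phdrs * ELF32_PHDR_SIZE


def vaddr_of(file_offset, segment_base=0x401000):
    """Virtual address of a file offset when file offset 0 is mapped at segment_base."""
    return segment_base + file_offset


# Assumption: three program headers, which happens to match the 0x94 in the post.
text_offset = headers_size(3)
print(hex(text_offset))            # 0x94
print(hex(vaddr_of(text_offset)))  # 0x401094
```

Running readelf -l on the executable should show how many program headers it really has.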

The other way to do it would be to align the text section on a 0x1000 boundary in the file. Then we would have the ELF headers at offset 0 in the file and the .text section at offset 0x1000, mapped to virtual address 0x401000. Whilst this appears to make more sense initially, it actually wastes a large amount of space in the file (between the end of the ELF headers and the start of the text section).
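To put a number on that wasted space, here is a small sketch assuming the headers end at offset 0x94 as in the original post. The function name is hypothetical.

```python
def padding_to_boundary(offset, align=0x1000):
    """Filler bytes needed to move `offset` up to the next multiple of `align`."""
    return (-offset) % align


# Headers end at 0x94, so the rest of the first 0x1000 bytes would be padding.
print(padding_to_boundary(0x94))  # 3948
```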

If you look at the output of readelf -S (or objdump -h) on an ELF binary for Linux (e.g. /bin/ls) you will see that the text section is also not aligned on a page boundary.

If you wish to find out where to start running code from in the binary, simply check the entry address in the ELF header.
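For illustration, a minimal sketch of pulling the entry address out of the ELF header by hand. The e_entry field sits at byte offset 24 in both ELF32 and ELF64; it is 4 bytes wide in the former and 8 in the latter. The function name and the synthetic header below are made up for the example; readelf -h reports the same value as "Entry point address".

```python
import struct


def read_entry(data):
    """Return e_entry from the first bytes of an ELF file."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little, 2 = big endian
    fmt = endian + ("Q" if is_64 else "I")
    (entry,) = struct.unpack_from(fmt, data, 24)
    return entry


# Synthetic little-endian ELF32 header prefix, just enough to reach e_entry.
fake_header = (
    b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)  # e_ident (16 bytes)
    + struct.pack("<HHI", 2, 3, 1)            # e_type, e_machine, e_version
    + struct.pack("<I", 0x401094)             # e_entry
)
print(hex(read_entry(fake_header)))  # 0x401094
```

On a real file you would pass in the first 64 bytes read from the executable instead of the synthetic header.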

Regards,
John.

Re: OS Specific Toolchain noob question

Posted: Sun May 13, 2012 4:16 am
by bluemoon
In the end you'll need a program interpreter (program loader) to deal with the program segments and relocate them accordingly; then the payload's offset in the file does not need to be aligned.
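A rough sketch of the arithmetic such a loader might do for one loadable segment. ELF requires a loadable segment's file offset and virtual address to be congruent modulo the page size, so the loader can round both down to a page boundary and map whole pages. The function name and the example values are hypothetical, not taken from any particular loader.

```python
PAGE_SIZE = 0x1000


def plan_mapping(p_offset, p_vaddr, p_filesz, page_size=PAGE_SIZE):
    """Return (map_vaddr, map_offset, map_len) for a page-granular mapping."""
    if (p_offset - p_vaddr) % page_size != 0:
        raise ValueError("p_offset and p_vaddr are not congruent modulo page size")
    skew = p_vaddr % page_size  # how far into its page the segment starts
    map_vaddr = p_vaddr - skew  # round the virtual address down to a page
    map_offset = p_offset - skew  # round the file offset down by the same amount
    pages = (skew + p_filesz + page_size - 1) // page_size
    return map_vaddr, map_offset, pages * page_size


# Hypothetical segment: 0x200 bytes at file offset 0x94, wanted at 0x401094.
vaddr, offset, length = plan_mapping(0x94, 0x401094, 0x200)
print(hex(vaddr), hex(offset), hex(length))  # 0x401000 0x0 0x1000
```

The unaligned 0x94 is absorbed by the rounding: the page containing the headers is mapped too, and the code still lands at the address it was linked for.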