I was hoping that someone could clear up a couple of questions for me.
I understand why a GOT is needed on 32bit, but why is it needed on 64bit? When I compile a simple library with a global variable, I see a relocation entry for that variable. Since x86_64 can do relative addressing, why doesn't the compiler just directly address the variable?
Also, when doing the relocation I see conflicting information for R_X86_64_GLOB_DAT and R_X86_64_JUMP_SLOT. The spec says that the relocation value is "S", which "Represents the value of the symbol whose index resides in the relocation entry". I read that as sym->st_value, which doesn't make much sense to me because the compiler could just use that address itself since it clearly has it. In some of the code I read, I see mem + sym->st_value, where mem is the starting virtual address that the executable is loaded at. That makes more sense to me, as it would mean you're likely filling in an absolute address. I'm still not sure why that's needed on x86_64, but if that is the case, why does the documentation say "S"?
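One reading that reconciles the two: "S" is the symbol's final run-time address, and for a symbol defined in a loaded shared object that address is the defining object's load base plus sym->st_value. The sketch below shows how a loader might fill a GOT slot under that reading. The struct names, the function name and the parameters are made up for illustration (a real loader would use Elf64_Sym and Elf64_Rela), and symbol lookup is assumed to have happened already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the Elf64_Sym / Elf64_Rela fields used here. */
struct sketch_sym  { uint64_t st_value; };
struct sketch_rela { uint64_t r_offset; };

/* Apply one R_X86_64_GLOB_DAT or R_X86_64_JUMP_SLOT relocation.
 * image      : memory that corresponds to virtual address 0 of the object
 *              that owns the GOT slot (its load base)
 * image_size : number of bytes mapped at image
 * def_base   : load base of the object that defines the symbol
 * def        : the defining symbol table entry (already looked up) */
static void apply_got_reloc(uint8_t *image, size_t image_size,
                            const struct sketch_rela *rel,
                            uint64_t def_base, const struct sketch_sym *def)
{
    /* S = run-time address of the symbol = load base + st_value. */
    uint64_t S = def_base + def->st_value;

    /* The slot must lie inside the mapped image. */
    assert(image_size >= sizeof S && rel->r_offset <= image_size - sizeof S);

    /* Store the absolute address into the GOT slot. */
    memcpy(image + rel->r_offset, &S, sizeof S);
}
```

The compiler cannot do this itself because def_base is only known once the defining object has been mapped.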
Thanks in advance,
- Rian
Why is a GOT needed for x86_64?
- BASICFreak
- Member
- Posts: 284
- Joined: Fri Jan 16, 2009 8:34 pm
- Location: Louisiana, USA
Re: Why is a GOT needed for x86_64?
rianquinn wrote: I was hoping that someone could clear up a couple of questions for me.
I will attempt one of them:
rianquinn wrote: I understand why a GOT is needed on 32bit, but why is it needed on 64bit? When I compile a simple library with a global variable, I see a relocation entry for that variable. Since x86_64 can do relative addressing, why doesn't the compiler just directly address the variable?
The GOT is not needed by x86 or x86_64; it is needed by the ELF format (and maybe other formats). The reason it is used on x86_64, even though relative addressing is available, is simple: say we want to load three shared objects. Unless these shared objects land at the exact same location every time they are loaded, which is usually not the case, you still need to keep track of where each global (function or variable) is located in the address space.
If I want to use fprintf, where do I find the function? If you give me a hard-coded offset, how can I trust that the position-independent code gets loaded at a fixed position? Remember, shared objects are loaded at run time, not at compile time.
The GOT can be removed from all ELFs entirely, though it is a mess to do so: you need relocation tables in every EXEC, and you still need a table, but unlike with the GOT you can relocate the actual references in the code instead of fetching the GOT value on every access to a global. (I'm actually about to reverse this so that I have shared object support, the right way.)
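As a toy illustration of the indirection described above (not real ELF loading), the sketch below models a GOT as an array of pointers that a loader fills once, with every call going through the slot. All names here (toy_got, toy_loader_bind, twice and so on) are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_ptr)(int);

/* Stand-in for a function that lives in some other shared object. */
static int twice(int x) { return 2 * x; }

/* A toy "GOT": one slot per imported symbol, empty until load time. */
enum { SLOT_TWICE, SLOT_COUNT };
static func_ptr toy_got[SLOT_COUNT];

/* What the loader does: resolve each import once and store its address. */
static void toy_loader_bind(void)
{
    toy_got[SLOT_TWICE] = twice;
}

/* What the compiled code does: every call loads the address from the slot,
 * so the code itself never needs to be patched. */
static int call_twice_via_got(int x)
{
    assert(toy_got[SLOT_TWICE] != NULL);  /* slot must be bound before use */
    return toy_got[SLOT_TWICE](x);
}
```

The alternative mentioned above, relocating the references in the code, would instead patch the call instruction itself at load time, which costs more on load and saves the extra memory read on each call.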
I have not dealt with x86_64 ELF relocations yet, so I cannot answer your question there (plus I'm too lazy to read the spec sheet to reply to it).
Hope this helped some, and hopefully someone here has more (better) information on the topic. (The odds are high)
BOS Source Thanks to GitHub
BOS Expanded Commentary
Both under active development!
Sortie wrote:
- Don't play the role of an operating systems developer, be one.
- Be truly afraid of undefined [behavior].
- Your operating system should be itself, not fight what it is.
Re: Why is a GOT needed for x86_64?
My understanding of ELF is that for PIC, the start location for the library can change, but the relative offsets between the different sections remain the same, which is why the compiler should be able to figure out where each global symbol is (excluding external symbols of course). That is, the offset between sections like .text and .data is the same, even though the overall position of the library is different.
If you were to move the different sections within the library, then that would make sense to me, but then I don't see how the code would find its GOT, since I'm not sure how you would do that without modifying the code itself.
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
Re: Why is a GOT needed for x86_64?
There are four classes of ELF binary:
- Statically linked, position dependent (ET_EXEC with no DYNAMIC segment). These have no GOT and no shared library dependencies but are also not relocatable.
- Statically linked, position independent (ET_DYN, with a DYNAMIC segment, no DT_NEEDED entries; some tools may misidentify these as shared objects). These have no GOT and no shared library dependencies, but are relocatable.
- Dynamically linked, position dependent (ET_EXEC with a DYNAMIC segment, with DT_NEEDED entries). These have a GOT and shared library dependencies.
- Dynamically linked, position independent (ET_DYN, with a DYNAMIC segment, with DT_NEEDED entries). These have a GOT and shared library dependencies and can be relocated.
The GOT is also often used for intra-library calls and data references (e.g. calls of a function in a library from another function in the same library) because of traditional Unix symbol pre-emption rules (If your application defines a function "write", then everywhere a call to "write" is made that call should go to your application's version, even if another loaded library defines that symbol).
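The four classes in the list above can be told apart from three facts about the file. A minimal sketch, assuming the caller has already parsed the ELF header and program headers; the function name and the SK_ constants are invented here, with the numeric values taken from the ELF specification's e_type field:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* e_type values from the ELF specification (renamed to avoid clashing
 * with <elf.h> if that header is also included). */
enum { SK_ET_EXEC = 2, SK_ET_DYN = 3 };

/* Map (e_type, has a DYNAMIC segment, has DT_NEEDED entries) to one of the
 * four classes. Combinations outside the list yield "unclassified". */
static const char *elf_link_class(uint16_t e_type, bool has_dynamic,
                                  bool has_needed)
{
    /* DT_NEEDED entries live inside the DYNAMIC segment. */
    assert(has_dynamic || !has_needed);

    if (e_type == SK_ET_EXEC && !has_dynamic)
        return "static, position dependent";
    if (e_type == SK_ET_DYN && has_dynamic && !has_needed)
        return "static, position independent";
    if (e_type == SK_ET_EXEC && has_dynamic && has_needed)
        return "dynamic, position dependent";
    if (e_type == SK_ET_DYN && has_dynamic && has_needed)
        return "dynamic, position independent";
    return "unclassified";
}
```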
- BASICFreak
- Member
- Posts: 284
- Joined: Fri Jan 16, 2009 8:34 pm
- Location: Louisiana, USA
Re: Why is a GOT needed for x86_64?
rianquinn wrote: My understanding of ELF is that for PIC, the start location for the library can change, but the relative offsets between the different sections remain the same, which is why the compiler should be able to figure out where each global symbol is (excluding external symbols of course). That is, the offset between sections like .text and .data is the same, even though the overall position of the library is different.
If you were to move the different sections within the library then that would make sense to me, but how the code finds its GOT would not as I'm not sure how you would go about doing that without modifying the code itself.
OK, then what happens when the creator of the shared object adds functions and/or bug patches, changing the offset of every function after the change?
You could go through the shared object's symbol table and link all calls to (SO_OFFSET + SYM_OFFSET), or continue using the GOT, which is (partly) set up at compile time of the SO.
Without the GOT you have relocation, which is slower at load time but quicker at run time.
Now, do not get me wrong - I understand your point. But the only ways to avoid the GOT are to avoid ELF or to resort to hacks.
I do dislike GOT myself, which is why I do not support it (yet) - but again this is a hackish way and the output of linking files is something along the lines of:
Code:
LINKING BIN/TEST.ELF...
SRC/test.o: In function `init':
test.c:(.text+0x8): undefined reference to `initHeap'
test.c:(.text+0x74): undefined reference to `Bochs_puts'
SRC/test.o: In function `_JustATest':
test.c:(.text+0x88): undefined reference to `calloc'
test.c:(.text+0xa6): undefined reference to `Bochs_printf'
test.c:(.text+0xd3): undefined reference to `PCI_findByClass'
test.c:(.text+0xee): undefined reference to `PCI_getConfig'
test.c:(.text+0x10a): undefined reference to `Bochs_printf'
test.c:(.text+0x134): undefined reference to `Bochs_printf'
test.c:(.text+0x149): undefined reference to `Bochs_putch'
SRC/test.o: In function `_AnotherThread':
test.c:(.text+0x1a7): undefined reference to `calloc'
SRC/test.o: In function `_PIT_Test':
test.c:(.text+0x27e): undefined reference to `calloc'
Re: Why is a GOT needed for x86_64?
rianquinn wrote: My understanding of ELF is that for PIC, the start location for the library can change, but the relative offsets between the different sections remain the same, which is why the compiler should be able to figure out where each global symbol is (excluding external symbols of course). That is, the offset between sections like .text and .data is the same, even though the overall position of the library is different.
If you were to move the different sections within the library then that would make sense to me, but how the code finds its GOT would not as I'm not sure how you would go about doing that without modifying the code itself.
The issue is that gcc doesn't know whether a function or object external to the current compilation unit (.c file) will eventually be defined in a different compilation unit in the same object or in a separate library. Thus, if you ask it to produce position-independent code, it assumes you are writing a shared library, which will have a GOT. The instructions it chooses are therefore already fixed to be ones that reference the symbol via the GOT. When you come to link, ld may spot that what you are actually accessing is within the same output file, but by this point it is too late: gcc has already emitted code that accesses the variable through the GOT, so the GOT has to be used.
BASICFreak wrote: I do dislike GOT myself, which is why I do not support it (yet) - but again this is a hackish way and the output of linking files is something along the lines of:
You may want to investigate the '-r' option to ld. It essentially lets you combine lots of object files into one large relocatable object file. Because the result is an object file rather than an executable, to load it as a kernel module (for example) you pretend to be ld: parse the section table (rather than the segment one), allocate memory for each section and patch up the relocations.
Regards,
John.
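The "patch up the relocations" step can be sketched for two common relocation types. This assumes an x86_64 relocatable object with RELA entries (explicit addends); the function name, the SK_ constant names and the parameter layout are invented for the example, while the type numbers and formulas (S + A, and S + A - P) come from the x86_64 psABI. Symbol lookup and section allocation are assumed to have been done by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Relocation type numbers from the x86_64 psABI. */
enum { SK_R_X86_64_64 = 1, SK_R_X86_64_PC32 = 2 };

/* Patch one relocation.
 * where : pointer to the bytes being patched (section memory + r_offset)
 * S     : address of the referenced symbol
 * A     : addend from the RELA entry
 * P     : address at which the patched bytes will execute
 * Returns 0 on success, -1 for an unsupported type or an overflow. */
static int apply_rela_x86_64(unsigned type, uint8_t *where,
                             uint64_t S, int64_t A, uint64_t P)
{
    assert(where != NULL);

    switch (type) {
    case SK_R_X86_64_64: {            /* 64-bit absolute: S + A */
        uint64_t v = S + (uint64_t)A;
        memcpy(where, &v, sizeof v);
        return 0;
    }
    case SK_R_X86_64_PC32: {          /* 32-bit PC-relative: S + A - P */
        int64_t v = (int64_t)(S + (uint64_t)A - P);
        if (v < INT32_MIN || v > INT32_MAX)
            return -1;                /* target out of +/-2 GiB range */
        int32_t v32 = (int32_t)v;
        memcpy(where, &v32, sizeof v32);
        return 0;
    }
    default:
        return -1;                    /* not handled in this sketch */
    }
}
```

A loader would call this once per entry in each relocation section, after every section has been given its final address.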
- BASICFreak
- Member
- Posts: 284
- Joined: Fri Jan 16, 2009 8:34 pm
- Location: Louisiana, USA
Re: Why is a GOT needed for x86_64?
BASICFreak wrote: I do dislike GOT myself, which is why I do not support it (yet) - but again this is a hackish way and the output of linking files is something along the lines of:
jnc100 wrote: You may want to investigate the '-r' option to ld - it essentially lets you combine lots of object files together into one large relocatable object file - i.e. it is an object file rather than an executable, so to load it as a kernel module (for example), you pretend to be ld, i.e. parse the section table (rather than the segment one), allocate memory for each section and patch up the relocations.
Well, that's almost what I was (still am) doing:
LD flags = "-melf_i386 -r -T../../linkLib.ld" for library (not shared object)
LD flags = "-melf_i386 -q --noinhibit-exec -T../../linkExec.ld" for EXEC (which creates the reloc table and allows undefined symbols)
Then, when reading the ELF header:
Code:
if (Head->e_type == ET_EXEC) {
    _VMM_newDIR(ELFPDir);                    // Create new PDIR
    if (ELFPDir)
        _LoadExecElf(ELFPDir, ELFLocation);  // Starts a new process
} else if (Head->e_type == ET_REL) {
    // Relocatable
    _LoadRelocElf(ELFLocation);              // Loads into shared memory
}
Re: Why is a GOT needed for x86_64?
jnc100 wrote: The issue is that gcc doesn't know whether a function/object external to the current compilation unit (.c file) will eventually be defined in a different compilation unit in the same object or whether it will be in a separate library. Thus, if you ask it to produce position-independent code it assumes you are writing a shared library, and will thus have a GOT.
That was the answer I was looking for. That makes sense to me. To test that theory, I compiled the code with "static" in front of the variables / functions and watched them get removed from the GOT, so that goes well with your explanation.
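A small file along these lines can reproduce that experiment. The variable and function names are invented, and the hidden-visibility variable is an addition beyond what was tested above: it relies on the GCC/Clang visibility attribute, which tells the compiler the symbol cannot be pre-empted from outside the library.

```c
/* Default visibility: another module may pre-empt this symbol, so
 * position-independent code is expected to reach it through the GOT. */
int shared_counter = 1;

/* Internal linkage: the definition is known to be in this file, so a
 * RIP-relative access is enough and no GOT entry is needed. */
static int private_counter = 2;

/* Hidden visibility (GCC/Clang extension): visible to other files in the
 * same library, but not exported, so it can also be addressed directly. */
__attribute__((visibility("hidden"))) int hidden_counter = 3;

int read_shared(void)  { return shared_counter; }
int read_private(void) { return private_counter; }
int read_hidden(void)  { return hidden_counter; }
```

Building this with something like gcc -fPIC -shared and looking at the output of objdump -d and readelf -r should show a GOT-relative load and a dynamic relocation only for shared_counter, though the exact output depends on the compiler version and flags.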