eekee wrote: ↑Sat Dec 21, 2024 6:21 am
As it maps both of these, does at least 1 need to be position-independent? Oh but the 'interpreter' could relocate the ELF file, couldn't it? I didn't understand relocateability back then, so it didn't get stored in my memory.
In theory, both the interpreter and the main executable could be position dependent, just linked to different addresses. We are setting up a new address space, after all, so the entire user half is fair game. Indeed, all libraries could be position dependent as well, if we're getting down to it. Position dependent only means that the loader tries to get the requested addresses and treats it as an error if it can't get them. Typically, the loader passes the requested address as the first argument of mmap() without setting MAP_FIXED, and errors out if the ELF type is ET_EXEC and mmap() returns a different address.
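To make the mmap() behaviour concrete, here is a minimal sketch of that check. The helper name map_at_hint and its is_et_exec flag are made up for illustration, and it maps anonymous memory instead of a real file segment, so treat it as a sketch of the idea rather than what any particular loader does:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: try to map `len` bytes at the link-time address
 * `want`, passing it only as a hint (no MAP_FIXED, so nothing that is
 * already mapped there gets clobbered). Returns the mapping, or
 * MAP_FAILED if a position dependent (ET_EXEC) module could not get
 * exactly the address it asked for. */
static void *map_at_hint(void *want, size_t len, int is_et_exec)
{
    assert(len > 0);

    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED)
        return MAP_FAILED;

    if (is_et_exec && got != want) {
        /* The kernel picked another address; ET_EXEC cannot live with that. */
        munmap(got, len);
        return MAP_FAILED;
    }
    return got;
}
```

For a position independent module (is_et_exec == 0) whatever address comes back is fine, and the loader just records it as the load base.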
In practice, of course, all modules are position independent these days. x86_64 brought PC-relative addressing to the masses, which massively reduces the code overhead of PIC. And ASLR, one of the security mitigations that make successful exploits of the broken programs we all use on a daily basis less likely, works better if all modules are position independent.
This does mean that the interpreter starts running before any of its relocations have been processed, so it first has to process its own relocations. A position dependent interpreter would ease that pain, but it would obviously block off a section of address space just for ease of implementation, and I don't think that is a good tradeoff.
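For a feel of what "processing its own relocations" involves, here is a rough sketch of the core loop for x86_64, handling only R_X86_64_RELATIVE entries. The function name apply_relative_relocs is hypothetical, and the sketch assumes the caller has already found the load base and the RELA table (a real interpreter typically digs those out of the aux vector and its own dynamic section, and has to do so without touching anything that itself needs relocating):

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: apply the RELATIVE relocations of a module loaded
 * at `base`. Each such entry means "store base + addend at base + offset".
 * Other relocation types are skipped in this sketch. */
static void apply_relative_relocs(unsigned char *base,
                                  const Elf64_Rela *rela, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (ELF64_R_TYPE(rela[i].r_info) != R_X86_64_RELATIVE)
            continue;

        uint64_t value = (uint64_t)(uintptr_t)base
                       + (uint64_t)rela[i].r_addend;
        /* memcpy avoids alignment and aliasing assumptions about the slot */
        memcpy(base + rela[i].r_offset, &value, sizeof value);
    }
}
```

With a position dependent interpreter the table would simply be empty, which is the "ease of implementation" being traded away above.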
To illustrate the point about the code size of PIC, let's take a simple C function:
extern int glob_var;
void set(int x) { glob_var = x; }
In i386 position-independent mode, this is compiled into something like
set:
call 1f # push run-time address to stack
1:
popl %ecx # get run-time address into ECX
addl $_GLOBAL_OFFSET_TABLE_+[.-1b], %ecx # get ECX to run-time address of global offset table
movl glob_var@GOT(%ecx), %ecx # load pointer to glob_var into ECX
movl 4(%esp), %eax # load value to set into EAX
movl %eax, (%ecx) # actually set the glob_var
retl
Whereas on x86_64, it is just
set:
movq glob_var@GOTPCREL(%rip), %rax # load pointer to glob_var into RAX
movl %edi, (%rax) # set the glob_var (a 32-bit int, so a 32-bit store)
retq
Of course, in both cases it is even shorter in position dependent mode. But in that case, you pay with more data: if glob_var is defined in a shared library, it gets a COPY relocation in the main executable. So the main executable increases the size of its own .bss section and interposes the global variable for all other modules.