ELF weirdness: Defined undefined symbols
Posted: Thu Sep 10, 2020 10:33 am
Hi all,
so I recently looked into function pointer equality on systems using ELF dynamic libraries. Specifically, whether taking the address of a dynamic library's function yields the same value in a non-PIC application and in a PIC library. The good answer is: yes, even under interposition.
But now to the hack making this work. My test case is this:
lib.c:
    #include <stdio.h>

    int _os_fork(void) { return 0; }

    /* Compare the pointer passed in by the application with the address
       of _os_fork as seen from inside the library. */
    void _os_exec(int (*ffunc)(), ...)
    {
        int eq = (ffunc == _os_fork);
        printf("Pointers equal: %d\n", eq);
        printf("Pointer given:\t%p\n", (void *)ffunc);
        printf("Pointer sought:\t%p\n", (void *)_os_fork);
    }
appl1.c:
    extern int _os_fork(void);
    extern void _os_exec(int (*)(), ...);

    int main(void)
    {
        /* Takes the address of the library function in the application. */
        _os_exec(_os_fork, 1, 2, 3);
        return 0;
    }
appl2.c:
    extern int _os_fork(void);
    extern void _os_exec(int (*)(), ...);

    int main(void)
    {
        /* Never takes the address of _os_fork. */
        _os_exec(main, 1, 2, 3);
        return 0;
    }
If you compile appl1 and appl2 into non-PIC applications (these days you have to work at this), you will see that the pointer value printed for _os_fork differs between appl1 and appl2. So I wanted to find out how that works.
The library code must load the address of _os_fork through the GOT. That explains how the value can differ between the two runs: the GOT gets filled differently for appl1 and appl2. But how? It turns out that appl1 contains, in its .dynsym section, a symbol for _os_fork that is both undefined (its section index is SHN_UNDEF) and has a nonzero value. That symbol is absent from appl2.
I took a look at musl's source code and found that such symbols are used to satisfy calls to dlsym(), as well as all relocations except jump slots. And now I find myself wondering:
- Where is that rule actually written down? I did not find it in the ABI supplements for i386 or AMD64, or in any of the others I looked at, nor in the description of ELF available on Oracle's website. Is it in another ABI document I don't have?
- Should this information be added to the Wiki, and if so, where?
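For anyone who wants to check their own binaries for such symbols, here is a minimal sketch of a scanner. It assumes a Linux-style <elf.h>, a 64-bit ELF file in the host's byte order with an intact section header table, and it does no bounds checking, so only feed it files you trust. The names is_defined_undefined, scan_dynsym and the DYNSYM_SCAN_MAIN macro are made up for this example and come from no library.

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Nonzero if the symbol has no defining section but still carries an address. */
int is_defined_undefined(const Elf64_Sym *sym)
{
    return sym->st_shndx == SHN_UNDEF && sym->st_value != 0;
}

/* Print every such symbol found in .dynsym of the file at path.
   Returns the number of matches, or -1 if the file cannot be read
   or is not a 64-bit ELF file. Offsets in the file are trusted. */
int scan_dynsym(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    /* Read the whole file into memory. */
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    rewind(f);
    unsigned char *buf = size > 0 ? malloc((size_t)size) : NULL;
    if (!buf || fread(buf, 1, (size_t)size, f) != (size_t)size) {
        fclose(f);
        free(buf);
        return -1;
    }
    fclose(f);

    int found = -1;
    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)buf;
    if ((size_t)size >= sizeof *eh &&
        memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
        eh->e_ident[EI_CLASS] == ELFCLASS64) {
        const Elf64_Shdr *sh = (const Elf64_Shdr *)(buf + eh->e_shoff);
        found = 0;
        for (int i = 0; i < eh->e_shnum; i++) {
            if (sh[i].sh_type != SHT_DYNSYM)
                continue;
            const Elf64_Sym *syms = (const Elf64_Sym *)(buf + sh[i].sh_offset);
            /* sh_link of a symbol table names its string table section. */
            const char *strtab = (const char *)(buf + sh[sh[i].sh_link].sh_offset);
            size_t n = sh[i].sh_size / sizeof(Elf64_Sym);
            for (size_t j = 1; j < n; j++) { /* index 0 is the null symbol */
                if (is_defined_undefined(&syms[j])) {
                    printf("%s: UND symbol with value %#llx\n",
                           strtab + syms[j].st_name,
                           (unsigned long long)syms[j].st_value);
                    found++;
                }
            }
        }
    }
    free(buf);
    return found;
}

#ifdef DYNSYM_SCAN_MAIN
int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s elf-file\n", argv[0]);
        return 2;
    }
    int n = scan_dynsym(argv[1]);
    if (n < 0) {
        fprintf(stderr, "cannot read %s as a 64-bit ELF file\n", argv[1]);
        return 1;
    }
    printf("%d symbol(s) that are undefined but have a value\n", n);
    return 0;
}
#endif
```

Built with -DDYNSYM_SCAN_MAIN and pointed at the two test programs, I would expect it to list _os_fork for appl1 and not for appl2, going by the observation above. readelf --dyn-syms shows the same information if you look for UND entries with a nonzero Value column.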