
ELF weirdness: Defined undefined symbols

Posted: Thu Sep 10, 2020 10:33 am
by nullplan
Hi all,

so I recently looked into function pointer equality on systems using ELF dynamic libraries. Specifically, whether taking the address of a dynamic library's function yields the same value in a non-PIC application as in a PIC library. The good news is: yes, it does, even under interposition.

But now to the hack that makes this work. My test case is this:

lib.c:

Code:

#include <stdio.h>
int _os_fork(void){ return 0; }
void _os_exec(int (*ffunc)(), ...)
{
  int eq = (ffunc == _os_fork);
  printf("Pointers equal: %d\n", eq);
  printf("Pointer given:\t%p\n", (void*)ffunc);
  printf("Pointer sought:\t%p\n", (void*)_os_fork);
}
appl1.c:

Code:

extern int _os_fork(void);
extern void _os_exec(int (*)(),...);
int main(void)
{
  _os_exec(_os_fork, 1, 2, 3);
  return 0;
}
appl2.c:

Code:

extern int _os_fork(void);
extern void _os_exec(int (*)(),...);
int main(void)
{
  _os_exec(main, 1, 2, 3);
  return 0;
}
If you compile appl1 and appl2 into non-PIC applications (these days you have to work at this), you will see that the pointer value for _os_fork is different between appl1 and appl2. So I wanted to find out how that works.

The library code must load the address of _os_fork through the GOT. That explains how the value can differ between the two programs: the GOT gets filled in differently for appl1 and appl2. But how? It turns out that appl1 contains, in its .dynsym section, a symbol for _os_fork that is undefined (section index SHN_UNDEF) and yet has a nonzero value. That symbol is absent from appl2.
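
To make the "undefined but has a value" condition concrete: in symbol table terms it means st_shndx == SHN_UNDEF together with a nonzero st_value. Below is a minimal sketch of that check over an in-memory array of symbol entries. The helper names are made up for illustration, 64-bit ELF and the Linux <elf.h> header are assumed, and actually reading .dynsym out of the file is left out.

```c
#include <assert.h>
#include <elf.h>     /* Elf64_Sym, SHN_UNDEF (Linux header, assumed available) */
#include <stddef.h>

/* Hypothetical helper: nonzero if the symbol is undefined
 * (no defining section) and yet carries an address. */
int is_defined_undefined(const Elf64_Sym *sym)
{
    return sym->st_shndx == SHN_UNDEF && sym->st_value != 0;
}

/* Hypothetical helper: count such entries in a symbol array,
 * for example the contents of .dynsym already loaded into memory. */
size_t count_defined_undefined(const Elf64_Sym *syms, size_t n)
{
    size_t count = 0;

    assert(syms != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (is_defined_undefined(&syms[i]))
            count++;
    return count;
}
```

The mandatory null symbol at index 0 has both fields zero, so it is not counted.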

I took a look at musl's source code and found that such symbols will be used to satisfy calls to dlsym(), as well as all relocations except jump slots. And now I find myself wondering again:
  1. Where is that rule actually written down? I did not find it in the ABI supplements for i386 or AMD64 or any others I looked at, nor in the ELF description on Oracle's website. Is that another ABI document I don't have?
  2. Should this information be added to the Wiki, and if so, where?
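
The rule as described above could be sketched roughly like this. It is a simplified illustration and not musl's actual code: the function name and the is_jump_slot flag are made up, 64-bit ELF and the Linux <elf.h> header are assumed, and special cases such as TLS symbols are ignored.

```c
#include <assert.h>
#include <elf.h>     /* Elf64_Sym, SHN_UNDEF (Linux header, assumed available) */
#include <stddef.h>

/* Hypothetical predicate: may this symbol table entry satisfy a lookup?
 * is_jump_slot is nonzero when the lookup is for a jump slot relocation,
 * and zero for dlsym() and all other relocation types. */
int sym_satisfies_lookup(const Elf64_Sym *sym, int is_jump_slot)
{
    assert(sym != NULL);
    if (sym->st_shndx != SHN_UNDEF)
        return 1;               /* ordinary definition */
    if (is_jump_slot)
        return 0;               /* jump slots need a real definition */
    return sym->st_value != 0;  /* undefined, but carries an address */
}
```

Under this sketch, the _os_fork entry in appl1 would be accepted when the library's GOT entry is relocated, but skipped when a jump slot is resolved.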

Re: ELF weirdness: Defined undefined symbols

Posted: Mon Sep 14, 2020 11:47 am
by PeterX
nullplan wrote: Is that another ABI document I don't have?
Maybe this one:
https://uclibc.org/docs/elf-64-gen.pdf

Re: ELF weirdness: Defined undefined symbols

Posted: Mon Sep 14, 2020 12:23 pm
by nullplan
No, that just describes what an ELF file looks like. But it doesn't say what it means if a symbol is undefined but has a nonzero value. Neither in the section about the symbol table, nor in the one about relocations.

Re: ELF weirdness: Defined undefined symbols

Posted: Mon Sep 14, 2020 2:27 pm
by bzt
nullplan wrote: If you compile appl1 and appl2 into non-PIC applications (these days you have to work at this), you will see that the pointer value for _os_fork is different between appl1 and appl2. So I wanted to find out how that works.
You've just run into the deep hole of PLTs :-)
nullplan wrote: The library code must load the address of _os_fork through the GOT.
Nope, the GOT does not contain the function's pointer. Instead it contains a pointer to a local function, which in turn has a jump to another GOT entry, the dynamic linker, which will be replaced with the actual function's address after the first call. This is called lazy linking.
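
The lazy linking described here can be modelled in plain C as a function-pointer slot that initially points at a resolver. This is only a toy model: all names are invented, and a real PLT/GOT is laid out by the static linker and patched by the dynamic linker, not by application code like this.

```c
#include <assert.h>
#include <stddef.h>

int resolve_count = 0;          /* how often the "dynamic linker" ran */

/* Stand-in for the real function living in a shared library. */
int real_target(int x) { return x + 1; }

int resolver_stub(int x);

/* Stand-in for the GOT entry: starts out pointing at the resolver. */
int (*got_slot)(int) = resolver_stub;

/* Stand-in for the dynamic linker: on the first call it patches
 * the slot with the real address, then forwards the call. */
int resolver_stub(int x)
{
    resolve_count++;
    got_slot = real_target;
    return got_slot(x);
}

/* Stand-in for the PLT entry: always jumps through the slot. */
int call_via_plt(int x)
{
    assert(got_slot != NULL);
    return got_slot(x);
}
```

Only the first call through call_via_plt() goes via the resolver; later calls reach real_target directly through the patched slot.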
nullplan wrote: Is that another ABI document I don't have?
This has nothing to do with the ABI, this is a dynamic linkage hack implemented by gcc. This is totally unnecessary on x86_64 (as it can encode RIP-relative GOT pointers), only required on i386. But gcc developers were lazy, so all shared library function calls work on x86_64 the same way as on i386.
nullplan wrote: Should this information be added to the Wiki, and if so, where?
It is already on the wiki, see Dynamic Linker. But that's not the best page I've written, I admit. Feel free to expand it!

Cheers,
bzt