
Symbol lookup and dynamic linking

Posted: Sat Oct 04, 2014 8:37 am
by Synon
I'm just curious, what mechanism do you use in your kernel for mapping symbol names to addresses and vice versa, e.g. for tracing the stack or dynamically linking modules with the kernel?

At the moment, I have a rather awkward method. The ELF .symtab section doesn't get loaded at runtime (at least, it's not supposed to be); .dynsym is supposed to be loaded instead, but my kernel doesn't have a .dynsym section. So I compile and link the kernel, then run a shell script on the executable. The script lists all global symbols with nm and generates source code containing a map of symbol names to addresses. The generated file is then compiled and the entire kernel is linked again, making the symbol mapping available at runtime. I haven't got around to testing it yet because I have some bugs to work out, but I suspect it won't work, because the linker might put some symbols in a different place when the kernel is linked the second time.
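
The generation step described above could be sketched roughly as follows. This is a minimal sketch, not the poster's actual script: it is written in Python rather than shell, it assumes the default `nm` output format of `<hex address> <type letter> <name>`, and the names `ksym`, `ksym_table` and `ksym_count` are made up for illustration.

```python
import re

# One line of default (BSD-style) nm output: "<hex address> <type letter> <name>".
NM_LINE = re.compile(r"^([0-9a-fA-F]+)\s+([A-Za-z])\s+(\S+)$")


def parse_nm(nm_output):
    """Return (address, name) pairs for global symbols, sorted by address."""
    symbols = []
    for line in nm_output.splitlines():
        match = NM_LINE.match(line.strip())
        if match is None:
            continue  # undefined symbols carry no address, so they do not match
        address, kind, name = match.groups()
        if kind.isupper():  # nm uses an uppercase type letter for global symbols
            symbols.append((int(address, 16), name))
    return sorted(symbols)


def generate_c_table(symbols):
    """Emit C source for a table of records (struct layout is hypothetical)."""
    lines = [
        "/* Generated file, do not edit. */",
        "struct ksym { unsigned long addr; const char *name; };",
        "const struct ksym ksym_table[] = {",
    ]
    for address, name in symbols:
        lines.append('    { 0x%xUL, "%s" },' % (address, name))
    lines.append("};")
    lines.append("const unsigned long ksym_count = %d;" % len(symbols))
    return "\n".join(lines) + "\n"
```

In a build, the text fed to `parse_nm` would be the captured output of `nm` on the first-pass kernel, and the returned string would be written to the file that gets compiled for the second link.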

What do you do?

Re: Symbol lookup and dynamic linking

Posted: Sat Oct 04, 2014 10:32 am
by dansmahajan
You could pass -Xlinker -Map=<kernel.map> to gcc (or -Map=<kernel.map> directly to ld) to generate a map file containing symbol information.

Re: Symbol lookup and dynamic linking

Posted: Sat Oct 04, 2014 10:40 am
by Synon
dansmahajan wrote:You could pass -Xlinker -Map=<kernel.map> to gcc (or -Map=<kernel.map> directly to ld) to generate a map file containing symbol information.
Does that get linked into the kernel, or would the kernel have to load and parse it like the maps that nm generates?

Re: Symbol lookup and dynamic linking

Posted: Sat Oct 04, 2014 12:58 pm
by dansmahajan
Synon wrote:
dansmahajan wrote:You could pass -Xlinker -Map=<kernel.map> to gcc (or -Map=<kernel.map> directly to ld) to generate a map file containing symbol information.
Does that get linked into the kernel, or would the kernel have to load and parse it like the maps that nm generates?
No, it won't be linked in; the kernel still has to load and parse it, as with the nm output.
Synon wrote:I'm just curious, what mechanism do you use in your kernel for mapping symbol names to addresses and vice versa, e.g. for tracing the stack or dynamically linking modules with the kernel?

At the moment, I have a rather awkward method. The ELF .symtab section doesn't get loaded at runtime (at least, it's not supposed to be); .dynsym is supposed to be loaded instead, but my kernel doesn't have a .dynsym section. So I compile and link the kernel, then run a shell script on the executable. The script lists all global symbols with nm and generates source code containing a map of symbol names to addresses. The generated file is then compiled and the entire kernel is linked again, making the symbol mapping available at runtime. I haven't got around to testing it yet because I have some bugs to work out, but I suspect it won't work, because the linker might put some symbols in a different place when the kernel is linked the second time.

What do you do?
Yes, you are right: after the second link the symbol addresses will be different, because new .text and .data sections are being linked in. In the second link you are still parsing the file, right? You could instead load an external map file and parse that; this substantially reduces your kernel size, and the symbols are only brought into memory when they are really required. AFAIK Linux also creates a map file and loads it at runtime.
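
For the stack-tracing use case, the lookup over a loaded map boils down to finding the nearest symbol at or below a return address. The following is a rough model of that logic in Python, assuming a pre-processed map with one `<hex address> <name>` pair per line (that file format and the function names are assumptions for illustration; a kernel would do the same over an in-memory table in its own language).

```python
import bisect


def load_symbols(text):
    """Parse '<hex address> <name>' lines into two parallel, address-sorted lists."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank lines and lines with the wrong field count
        pairs.append((int(fields[0], 16), fields[1]))
    pairs.sort()
    addresses = [address for address, _ in pairs]
    names = [name for _, name in pairs]
    return addresses, names


def resolve(addresses, names, address):
    """Return (name, offset) for the nearest symbol at or below address, else None."""
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    return names[index], address - addresses[index]
```

Without symbol sizes in the map, an address past the end of the last function still resolves to that function with a large offset, so a real tracer may want an upper bound as well.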

Re: Symbol lookup and dynamic linking

Posted: Sat Oct 04, 2014 2:27 pm
by Synon
dansmahajan wrote:Yes, you are right: after the second link the symbol addresses will be different, because new .text and .data sections are being linked in.
Ok. I had hoped to get around that using a "dummy" symbol table.
dansmahajan wrote:In the second link you are still parsing the file, right?
I parse it with a script after the first link. The script generates and compiles a C source file containing an array of structures representing the symbols. The object file then gets linked into the kernel the second time around. To avoid an undefined reference in the first link, where code already uses the symbol table, I wrote a "dummy" symbol table which has the same structure but doesn't contain any records.
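
Whether the real table moves anything depends on where the linker places it relative to the other sections, so one way to check rather than guess is to run nm on both link outputs and compare. A minimal sketch of that comparison, assuming nm-style `<hex address> <type> <name>` lines; the function names and the `ignore` parameter are made up for illustration:

```python
def symbol_map(nm_output):
    """Map symbol name -> address from '<hex address> <type> <name>' lines."""
    table = {}
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3:  # lines without an address (undefined symbols) are skipped
            table[fields[2]] = int(fields[0], 16)
    return table


def moved_symbols(first_pass, second_pass, ignore=()):
    """Return the sorted names whose address differs between the two link passes."""
    before = symbol_map(first_pass)
    after = symbol_map(second_pass)
    return sorted(
        name
        for name in before
        if name not in ignore and name in after and before[name] != after[name]
    )
```

If the list comes back non-empty, the table generated from the first pass is stale; regenerating it from the second pass and linking again until the list is empty is one way out, and as far as I know it is roughly what the Linux kallsyms build step does.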
dansmahajan wrote:You could instead load an external map file and parse that; this substantially reduces your kernel size, and the symbols are only brought into memory when they are really required. AFAIK Linux also creates a map file and loads it at runtime.
I'll pre-process it and get GRUB to load it as a module so I can use it for debugging. Can't load it myself as my memory manager is a WIP and I have no drivers or VFS.

[edit] Isn't this method a bit, well, trusting? What happens if the user accidentally (or an attacker intentionally) creates an incorrect map? The kernel has no way to verify it without having another source. It could be created by an attacker in a way that allows execution of arbitrary code. Especially in an open source kernel like Linux - how do they guard against it?

Re: Symbol lookup and dynamic linking

Posted: Sun Oct 05, 2014 2:32 am
by Combuster
Synon wrote:[edit] Isn't this method a bit, well, trusting? What happens if the user accidentally (or an attacker intentionally) creates an incorrect map? The kernel has no way to verify it without having another source. It could be created by an attacker in a way that allows execution of arbitrary code. Especially in an open source kernel like Linux - how do they guard against it?
Is it any more trusting than downloading a precompiled kernel and hoping it hasn't been tampered with? SSL exists for a reason.

Re: Symbol lookup and dynamic linking

Posted: Fri Mar 27, 2015 1:09 pm
by Synon
U wrote:Isn't this method a bit, well, trusting? What happens if the user accidentally (or an attacker intentionally) creates an incorrect map? The kernel has no way to verify it without having another source. It could be created by an attacker in a way that allows execution of arbitrary code. Especially in an open source kernel like Linux - how do they guard against it?
I had an idea (possibly already in use) to solve this with asymmetric key cryptography. Here's how I imagine it working.
  • At compile-time:
    1. Kernel is compiled and linked as usual
    2. A symbol table that the kernel can parse is generated
    3. An initial ramdisk containing the symbol table as well as any modules is built
    4. The ramdisk and kernel are compressed and encrypted. You have some options here (non-exhaustive):
      a. Kernel and ramdisk are compressed and encrypted as separate files
      b. Kernel and ramdisk are compressed separately and concatenated, and the whole file is encrypted
      c. Kernel and ramdisk are concatenated, then compressed, and then encrypted (we'll call this a blob)
      Personally I want to go with option (c), so let's continue with that assumption.
    5. Compress & encrypt the kernel and ramdisk using the private key, which is then discarded
    6. Build a bootstrap stub with the public key compiled into its executable code (or passed as a multiboot module)
    7. Configure bootloader, etc.
  • At runtime:
    1. Bootloader loads your bootstrap as a kernel and the blob as a module
    2. Bootstrap performs minimal setup (GDT, IDT, reading the memory map, etc.) and then extracts the blob into its component parts (if you opt to keep the kernel and ramdisk separate, you could have the kernel decrypt the ramdisk with yet another key)
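
The pack and extract halves of the blob option can be modelled as below. This is only a sketch under several assumptions: the length header layout is made up, zlib stands in for whatever compressor is chosen, and because the Python standard library has no asymmetric cryptography, a plain SHA-256 digest check stands in for the public-key step. A digest alone only helps if the bootstrap that holds it is itself trusted.

```python
import hashlib
import hmac
import struct
import zlib

# Hypothetical header: kernel length and ramdisk length, little-endian 32-bit.
HEADER = struct.Struct("<II")


def pack_blob(kernel, ramdisk):
    """Build time: concatenate behind a length header, compress, and hash."""
    payload = HEADER.pack(len(kernel), len(ramdisk)) + kernel + ramdisk
    blob = zlib.compress(payload)
    return blob, hashlib.sha256(blob).digest()


def unpack_blob(blob, expected_digest):
    """Boot time: check the digest, decompress, and split into (kernel, ramdisk)."""
    actual = hashlib.sha256(blob).digest()
    if not hmac.compare_digest(actual, expected_digest):
        raise ValueError("blob digest mismatch")
    payload = zlib.decompress(blob)
    kernel_len, ramdisk_len = HEADER.unpack_from(payload)
    body = payload[HEADER.size:]
    if len(body) != kernel_len + ramdisk_len:
        raise ValueError("length header does not match blob contents")
    return body[:kernel_len], body[kernel_len:]
```

In the scheme as described, the digest comparison would be replaced by a check against the public key compiled into the bootstrap stub.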
Advantages:
  • Only those who possess the private key could modify the contents of the initrd and kernel after compiling, reducing possible attack vectors.
Neutral/debatable:
  • It seems like it would take a long time, but Linux also has to be relocated and decompressed, and it isn't noticeably slow to boot on an average PC. The addition of cryptography shouldn't make too much difference IMO.
Disadvantages:
  • Attackers can still read or modify initrd/kernel contents at runtime, after they are decoded/decompressed.
This concept would permeate the system, like OpenBSD or Tails: everything is encrypted, Tor is built into the system, only SSL connections are allowed, VPN use is strongly encouraged (if I ever get far enough to implement a network stack...)

Are there any known/existing hobbyist or professional operating systems with a boot procedure like this?