Getting symbols to be allocated & loadable with LD

ThisMayWork · Post by **ThisMayWork** » Sun Sep 08, 2019 5:29 am

Hello everyone,

I am having some LD trouble, and I think I could use some help here, because I feel like I am fighting against the toolchain, instead of working with it.

My goal is to have the .symtab and .strtab sections of my kernel binary loaded into memory, so that I can consult them while the kernel is running. The obvious first attempt was to add the following entries to the SECTIONS command of my linker script:

Code: Select all

.symtab :
{
  sym_start = .; _sym_start = .; __sym_start = .;
  *(.symtab)
  sym_end   = .; _sym_end   = .; __sym_end   = .;
}

.strtab :
{
  str_start = .; _str_start = .; __str_start = .;
  *(.strtab)
  str_end   = .; _str_end   = .; __str_end   = .;
}

This hopefully gathers the .strtab and .symtab sections of my individual object files into their respective sections in the final binary. Examining the final .strtab seems to confirm my assumption.

However, when examining the sections of my final binary with

Code: Select all

readelf -S

the aforementioned sections do not have any flags set, which means that they are not allocated or loadable and won't end up being loaded by GRUB (if I understand this correctly).

Also, the start and end symbols I have generated do not appear correct (start seems to equal end in all cases), as can be seen in the following screenshot.

https://imgur.com/bpS8FQr

The whole linker script is the following:

Code: Select all

/* Kernel linker script */

ENTRY(start)
SECTIONS
{
  .text 0x100000 :
  {
    code = .; _code = .; __code = .;
    *(.text)
    . = ALIGN(4096);
  }

  .data :
  {
     data = .; _data = .; __data = .;
     *(.data)
     *(.rodata)
     . = ALIGN(4096);
  }

  .bss :
  {
    bss = .; _bss = .; __bss = .;
    *(.bss)
    . = ALIGN(4096);
  }

  .symtab :
  {    
    sym_start = .; _sym_start = .; __sym_start = .;
    *(.symtab)
    sym_end   = .; _sym_end   = .; __sym_end   = .; 
  }

  .strtab :
  {
    str_start = .; _str_start = .; __str_start = .;
    *(.strtab)
    str_end   = .; _str_end   = .; __str_end   = .;
  }

  end = .; _end = .; __end = .;

  /DISCARD/ :
  {
    *(.eh_frame)
    *(.note.gnu.build-id)
    *(.comment)
  }
}

CFLAGS are:

Code: Select all

-nostdlib -nostdinc -fno-builtin -fno-stack-protector -std=c11 -I ./inc -D DEBUG -g

I also attempted to change the section flags with objcopy, using

Code: Select all

objcopy --set-section-flags .symtab=contents,alloc,load,readonly,data ./bin/kernel

which appears to have no effect.

Thank you in advance for your help!

Best regards,
Nick

bzt · Post by **bzt** » Sun Sep 08, 2019 12:59 pm

Yep, this should work. I had the same problem too, so I'm also interested if anyone has a proper solution.
Just for the record, redirecting .symtab into a loadable segment (like ".symtab { ... } :text") didn't work either.

Cheers,
bzt

songziming · Post by **songziming** » Sun Sep 08, 2019 8:54 pm

You can just put them into .data section, which is always loaded. Generating a map file also helps.

If your kernel is an ELF file and you boot it with GRUB, then GRUB loads .symtab and .strtab automatically and passes their location to the kernel via the multiboot info. This is my approach.

Korona · Post by **Korona** » Sun Sep 08, 2019 11:27 pm

.symtab (the section) is used for compile-time and/or debugging-only symbols (and tools like strip will therefore remove it from binaries/shared libraries). If you want the symbols for linking purposes, you could use the dynamic symbol table (for ELF binaries and shared libraries, this is available at runtime through the PT_DYNAMIC program header). You can pass `-E` (or `--export-dynamic`) to ld to force it to export all symbols to the dynamic symbol table. This table will not be removed by strip.

ThisMayWork · Post by **ThisMayWork** » Mon Sep 09, 2019 1:56 am

songziming wrote:You can just put them into .data section, which is always loaded.

I tried doing this in my linker script

Code: Select all

.data :
{
  data = .; _data = .; __data = .;
  *(.data)
  *(.rodata)
  *(.symtab)
  *(.strtab)
  . = ALIGN(4096);
}

which I believe should place both symbols and string table in the data section, but examining it with readelf showed otherwise.

songziming wrote:Generating a map file also helps.

Regarding the above, the generated map file would need to be passed to my kernel either as a multiboot module or as a file, which is not ideal. I have ramdisk support, but it can be disabled on demand, so I would prefer to have symbols loaded in memory by GRUB as part of my ELF kernel file.

ThisMayWork · Post by **ThisMayWork** » Mon Sep 09, 2019 2:00 am

Korona wrote:.symtab (the section) is used for compile-time and/or debugging-only symbols (and tools like strip will therefore remove it from binaries/shared libraries). If you want the symbols for linking purposes, you could use the dynamic symbol table (for ELF binaries and shared libraries, this is available at runtime through the PT_DYNAMIC program header). You can pass `-E` (or `--export-dynamic`) to ld to force it to export all symbols to the dynamic symbol table. This table will not be removed by strip.

My original post did not really specify my goals correctly here, but I want to have the symtab and strtab loaded in memory so that the kernel can consult them when displaying stack traces. I have no support for dynamic linking, so I thought the primitive static tables would suffice. Also, I am not explicitly stripping anything from the ELF, other than the discarded sections in my linker script.

It appears that ld and objcopy from my toolchain refuse to mark the above sections as loadable, and also won't place them in any loadable section (such as .data), but these are merely assumptions backed by my (incomplete) testing. I have not managed to find any relevant documentation. I could spend some time looking through the source code to see if anything related is hardcoded, but I don't know if that would be of any value.

ThisMayWork · Post by **ThisMayWork** » Mon Sep 09, 2019 2:04 am

bzt wrote: Just for the record, redirecting symtab into a loadable segment (like ".symtab { ... } :text") didn't work either.

I can confirm this as well, it seems that the symtab section (or anything containing it) is prevented from being included in a loadable section or segment at all, and I am thinking this might be due to its type (readelf reports it as SYMTAB). The same appears to be true for the strtab which is marked as type STRTAB.

I can see why these sections should not be marked loadable by default, but explicitly changing their respective flags with objcopy seems to silently fail as well.

iansjack · Post by **iansjack** » Mon Sep 09, 2019 2:54 am

Why can't you just read these sections from the file?

ThisMayWork · Post by **ThisMayWork** » Mon Sep 09, 2019 3:30 am

iansjack wrote:Why can't you just read these sections from the file?

My goal is to have them loaded in memory by GRUB, along with my .text, .data and .bss sections, so that the runtime can parse the symbols during stack trace generation. For example, when unwinding the stack, I end up with certain addresses in .text. By consulting a symbol table I can look up which function each address belongs to and print it by (symbol) name, which makes debugging easier. The issue is that GRUB implements the ELF specification correctly, and only loads into memory the sections marked as loaded (or loadable, I am not sure of the correct name). My current toolchain, however, does not allow me to mark the .symtab and .strtab sections as loadable, no matter what I try, as described in my previous posts.

In order to read these from the file, I would have to have the kernel file loaded and handed to me as a multiboot module, or as part of the ramdisk, or through a filesystem, but the ramdisk can be disabled at will (and I don't feel like making the ramdisk required for nice stack traces) and I have no working filesystem implementation. It feels a lot more natural to kindly ask GRUB to simply copy the symbol table in memory, and I believe it would happily do so, as long as the toolchain allowed me to mark it loadable.
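The address-to-name lookup described in this post could be sketched roughly as follows. This is only an illustration, not code from the kernel in question: it assumes a 32-bit ELF kernel, that the symbol and string tables are already reachable in memory by some means, and the names `elf32_sym_t` and `symbol_for_address` are hypothetical. The struct layout mirrors `Elf32_Sym` from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit ELF symbol table entry (same layout as Elf32_Sym in <elf.h>). */
typedef struct {
    uint32_t st_name;   /* offset of the symbol name in the string table */
    uint32_t st_value;  /* symbol address */
    uint32_t st_size;   /* symbol size in bytes, 0 if unknown */
    uint8_t  st_info;   /* low 4 bits hold the symbol type */
    uint8_t  st_other;
    uint16_t st_shndx;
} elf32_sym_t;

#define ELF32_ST_TYPE(info) ((info) & 0x0f)
#define STT_FUNC 2

/* Hypothetical helper: return the name of the function that contains addr,
 * or NULL if no function symbol matches. Among the candidates it picks the
 * one with the highest start address that is still <= addr. If offset_out
 * is not NULL, it receives the distance from the start of that function. */
const char *symbol_for_address(const elf32_sym_t *syms, size_t count,
                               const char *strtab, uint32_t addr,
                               uint32_t *offset_out)
{
    const elf32_sym_t *best = NULL;

    for (size_t i = 0; i < count; i++) {
        const elf32_sym_t *s = &syms[i];

        if (ELF32_ST_TYPE(s->st_info) != STT_FUNC)
            continue;               /* only functions are of interest */
        if (s->st_value > addr)
            continue;               /* starts after the address */
        if (s->st_size != 0 && addr - s->st_value >= s->st_size)
            continue;               /* size is known and addr is past the end */
        if (best == NULL || s->st_value > best->st_value)
            best = s;
    }

    if (best == NULL)
        return NULL;
    if (offset_out != NULL)
        *offset_out = addr - best->st_value;
    return strtab + best->st_name;
}
```

A stack trace printer would call this once per return address and print something like `name+0xoffset`. The linear scan keeps the sketch short; sorting the symbols by address once and using a binary search would be a reasonable refinement.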

iansjack · Post by **iansjack** » Mon Sep 09, 2019 3:56 am

I would have thought that it would be far more productive to use an external debugger - such as gdb - rather than trying to build the debugging into the kernel in this way. That way you have access to everything you need. You will even be able to see the source code and single-step through it; also you can examine memory and set watches on memory locations to check when they are being altered. This is far more informative than unravelling stack traces.

At some stage you are going to want to be able to read files from disk. It might be more productive to work on that rather than providing detailed stack traces at this time.

ThisMayWork · Post by **ThisMayWork** » Mon Sep 09, 2019 4:07 am

iansjack wrote:I would have thought that it would be far more productive to use an external debugger - such as gdb - rather than trying to build the debugging into the kernel in this way. That way you have access to everything you need. You will even be able to see the source code and single-step through it; also you can examine memory and set watches on memory locations to check when they are being altered. This is far more informative than unravelling stack traces.

I already have working support for GDB through QEMU, and I also have limited support for serial, so I could write a GDB stub to get it to work outside QEMU, but this does not help when debugging crashes that are not easy to reproduce. By unwinding stack traces correctly and logging that info, I can perform (very basic, but somewhat helpful) post-mortem debugging.
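For context, the stack unwinding mentioned here is often done by following the chain of saved frame pointers. The following is a minimal sketch under stated assumptions: it assumes the kernel is compiled so that every function keeps a frame pointer (for example with `-fno-omit-frame-pointer`, which is not among the CFLAGS listed earlier), that the outermost frame has a null saved frame pointer, and the names `stack_frame` and `collect_backtrace` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Frame layout when a frame pointer is maintained: the frame pointer points
 * at the caller's saved frame pointer, and the return address sits in the
 * next word above it. */
struct stack_frame {
    const struct stack_frame *caller;  /* saved frame pointer of the caller */
    uintptr_t return_address;          /* address execution resumes at */
};

/* Hypothetical helper: follow the frame chain starting at fp and store up to
 * max_frames return addresses in out. Stops at a null frame pointer or a
 * zero return address. Returns the number of addresses stored. */
size_t collect_backtrace(const struct stack_frame *fp,
                         uintptr_t *out, size_t max_frames)
{
    size_t n = 0;

    while (fp != NULL && n < max_frames) {
        if (fp->return_address == 0)
            break;                  /* treat as the end of the chain */
        out[n++] = fp->return_address;
        fp = fp->caller;
    }
    return n;
}
```

With GCC, the starting point could be obtained with `__builtin_frame_address(0)`. The collected addresses are what a symbol table lookup would then translate into function names. A real kernel would also want to validate each frame pointer before dereferencing it, since a corrupted stack is exactly the situation in which a trace tends to be requested.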

iansjack wrote:At some stage you are going to want to be able to read files from disk. It might be more productive to work on that rather than providing detailed stack traces at this time.

Properly designing the VFS, then implementing a driver for a filesystem, then carefully abstracting away the underlying medium, then implementing a driver for it seems like too much work to get fancy traces but I am definitely going to start working on it ASAP. I mainly started this thread believing that my issue was easy to work around, but since posting this I have taken a look at more mature kernels to see how they get around it and I have come to believe that most people do some custom post-processing on their symbols to pass them to the kernel, either by file or by a multiboot module.

It would be nice however, if we manage to identify the cause, to report this issue to the Binutils upstream repo.

Korona · Post by **Korona** » Mon Sep 09, 2019 10:31 am

If you just want it for debugging, an option is to load the section in your bootloader (GRUB already supports it IIRC) and pass a pointer to it to your kernel.

songziming · Post by **songziming** » Mon Sep 09, 2019 8:57 pm

The map file is for you to see what goes into each section; it doesn't need to be loaded.

You said GRUB only loads segments marked as loadable, which is not true. The symbol table and string table are also loaded by GRUB, and their addresses are saved in the multiboot info (see the multiboot spec for more detail). But you have to back up symtab and strtab before setting up the page allocator, because otherwise you might corrupt that data.
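As a rough sketch of this approach: in the Multiboot 1 info structure, when bit 5 of `flags` is set, the fields `num`, `size`, `addr` and `shndx` describe the kernel's ELF section header table, and the `sh_addr` field of each header refers to where that section was placed in memory. The code below only shows the search through such a table. It assumes a 32-bit ELF kernel, and the names `elf32_shdr_t`, `kernel_symbols_t` and `find_kernel_symbols` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit ELF section header (same layout as Elf32_Shdr in <elf.h>). */
typedef struct {
    uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset,
             sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} elf32_shdr_t;

#define SHT_SYMTAB 2
#define SHT_STRTAB 3

/* Hypothetical result record: where the tables live and how big they are. */
typedef struct {
    uint32_t symtab_addr, symtab_size, sym_entsize;
    uint32_t strtab_addr, strtab_size;
} kernel_symbols_t;

/* Hypothetical helper: scan a section header table of num entries, each
 * entsize bytes long, for the first SHT_SYMTAB section and the string table
 * it links to through sh_link. Returns 1 and fills *out on success, 0 if no
 * usable pair is found. */
int find_kernel_symbols(const void *shdr_table, uint32_t num,
                        uint32_t entsize, kernel_symbols_t *out)
{
    const uint8_t *base = shdr_table;

    if (entsize < sizeof(elf32_shdr_t))
        return 0;

    for (uint32_t i = 0; i < num; i++) {
        const elf32_shdr_t *sym =
            (const elf32_shdr_t *)(base + (size_t)i * entsize);

        if (sym->sh_type != SHT_SYMTAB)
            continue;
        if (sym->sh_link >= num)
            return 0;               /* link points outside the table */

        const elf32_shdr_t *str =
            (const elf32_shdr_t *)(base + (size_t)sym->sh_link * entsize);
        if (str->sh_type != SHT_STRTAB)
            return 0;               /* linked section is not a string table */

        out->symtab_addr = sym->sh_addr;
        out->symtab_size = sym->sh_size;
        out->sym_entsize = sym->sh_entsize;
        out->strtab_addr = str->sh_addr;
        out->strtab_size = str->sh_size;
        return 1;
    }
    return 0;
}
```

In a kernel, `shdr_table`, `num` and `entsize` would come from the `addr`, `num` and `size` fields of the multiboot info. As noted above, the memory at `symtab_addr` and `strtab_addr` should be copied or reserved before the page allocator starts handing out frames.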

OSDev.org
