Getting symbols to be allocated & loadable with LD

Post by ThisMayWork »

Hello everyone,

I am having some LD trouble, and I think I could use some help here, because I feel like I am fighting against the toolchain, instead of working with it.

My goal is to have the .symtab and .strtab sections of my kernel binary loaded into memory, so that I can consult them while the kernel is running. The obvious first attempt was to modify my linker script to include the following SECTIONS entries:

Code:

.symtab :
{
  sym_start = .; _sym_start = .; __sym_start = .;
  *(.symtab)
  sym_end   = .; _sym_end   = .; __sym_end   = .;
}

.strtab :
{
  str_start = .; _str_start = .; __str_start = .;
  *(.strtab)
  str_end   = .; _str_end   = .; __str_end   = .;
}
This hopefully gathers the .strtab and .symtab sections of my individual object files into their respective sections in the final binary. Examining the final .strtab seems to confirm my assumption.

However, when examining the sections of my final binary with

Code:

readelf -S
the aforementioned sections do not have any flags set, which means that they are not allocated or loadable and won't end up being loaded by GRUB (if I understand this correctly).

Also, the start and end symbols I have generated do not appear correct (start seems to equal end in all cases), as can be seen in the following screenshot.

https://imgur.com/bpS8FQr

The whole linker script is the following:

Code:

/* Kernel linker script */

ENTRY(start)
SECTIONS
{
  .text 0x100000 :
  {
    code = .; _code = .; __code = .;
    *(.text)
    . = ALIGN(4096);
  }

  .data :
  {
     data = .; _data = .; __data = .;
     *(.data)
     *(.rodata)
     . = ALIGN(4096);
  }

  .bss :
  {
    bss = .; _bss = .; __bss = .;
    *(.bss)
    . = ALIGN(4096);
  }

  .symtab :
  {    
    sym_start = .; _sym_start = .; __sym_start = .;
    *(.symtab)
    sym_end   = .; _sym_end   = .; __sym_end   = .; 
  }

  .strtab :
  {
    str_start = .; _str_start = .; __str_start = .;
    *(.strtab)
    str_end   = .; _str_end   = .; __str_end   = .;
  }

  end = .; _end = .; __end = .;

  /DISCARD/ :
  {
    *(.eh_frame)
    *(.note.gnu.build-id)
    *(.comment)
  }
} 
CFLAGS are:

Code:

-nostdlib -nostdinc -fno-builtin -fno-stack-protector -std=c11 -I ./inc -D DEBUG -g
I also attempted to change the section flags with objcopy, using

Code:

objcopy --set-section-flags .symtab=contents,alloc,load,readonly,data ./bin/kernel
which appears to have no effect.

Thank you in advance for your help!

Best regards,
Nick
"Programming is an art form that fights back."
-Kudzu

Post by bzt »

Yep, this should work. I had the same problem too, so I'm also interested if anyone has a proper solution.
Just for the record, redirecting symtab into a loadable segment (like ".symtab { ... } :text") didn't work either.

Cheers,
bzt

Post by songziming »

You can just put them into the .data section, which is always loaded. Generating a map file also helps.

If your kernel is an ELF file and you boot it with GRUB, then GRUB loads .symtab and .strtab automatically and passes them via the multiboot info. This is my approach.
Reinventing the Wheel, code: https://github.com/songziming/wheel

Post by Korona »

.symtab (the section) is used for compile-time and/or debugging-only symbols (and tools like strip will therefore remove it from binaries/shared libraries). If you want the symbols for linking purposes, you could use the dynamic symbol table (for ELF binaries and shared libraries, this is available at runtime through the PT_DYNAMIC program header). You can pass `-E` (or `--export-dynamic`) to ld to force it to export all symbols to the dynamic symbol table. This table will not be removed by strip.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].

Post by ThisMayWork »

songziming wrote:You can just put them into .data section, which is always loaded.
I tried doing this in my linker script

Code:

.data :
{
  data = .; _data = .; __data = .;
  *(.data)
  *(.rodata)
  *(.symtab)
  *(.strtab)
  . = ALIGN(4096);
}
which I believe should place both symbols and string table in the data section, but examining it with readelf showed otherwise.
songziming wrote:Generating a map file also helps.
Regarding the above, the generated map file would need to be passed to my kernel either as a multiboot module or as a file, which is not ideal. I have ramdisk support, but it can be disabled on demand, so I would prefer to have symbols loaded in memory by GRUB as part of my ELF kernel file.

Post by ThisMayWork »

Korona wrote:.symtab (the section) is used for compile-time and/or debugging-only symbols (and tools like strip will therefore remove it from binaries/shared libraries). If you want the symbols for linking purposes, you could use the dynamic symbol table (for ELF binaries and shared libraries, this is available at runtime through the PT_DYNAMIC program header). You can pass `-E` (or `--export-dynamic`) to ld to force it to export all symbols to the dynamic symbol table. This table will not be removed by strip.
My original post did not really specify my goals correctly here, but I want to have the symtab and strtab loaded in memory so that the kernel can consult them when displaying stack traces. I have no support for dynamic linking, so I thought the primitive static tables would suffice. Also, I am not explicitly stripping anything from the ELF, other than the discarded sections in my linker script.

It appears that ld and objcopy from my toolchain explicitly refuse to mark the above sections as loadable, and also won't place them in any loadable section (such as .data), but these are merely assumptions backed by my (incomplete) testing. I have not managed to find any relevant documentation. I could spend some time looking through the source code to see if anything related is hardcoded, but I don't know if that would be of any value.

Post by ThisMayWork »

bzt wrote: Just for the record, redirecting symtab into a loadable segment (like ".symtab { ... } :text") didn't work either.
I can confirm this as well, it seems that the symtab section (or anything containing it) is prevented from being included in a loadable section or segment at all, and I am thinking this might be due to its type (readelf reports it as SYMTAB). The same appears to be true for the strtab which is marked as type STRTAB.

I can see why these sections should not be marked loadable by default, but explicitly changing their respective flags with objcopy seems to silently fail as well.

Post by iansjack »

Why can't you just read these sections from the file?

Post by ThisMayWork »

iansjack wrote:Why can't you just read these sections from the file?
My goal is to have them loaded in memory by GRUB, along with my .text, .data and .bss sections, so that the runtime can parse the symbols during stack trace generation. For example, when unwinding the stack, I end up with certain addresses in .text. By consulting a symbol table I can look up which function each address belongs to and print its symbol name, which makes debugging easier. The issue with this is that GRUB implements the ELF specification correctly, and only loads into memory sections marked as loaded (or loadable, I am not sure of the correct name). My current toolchain, however, does not allow me to mark the .symtab and .strtab sections as loadable, no matter what I tried, as described in my previous posts.
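The lookup described above can be sketched roughly as follows. This is only a minimal illustration, assuming a 32-bit ELF kernel and that pointers to the symbol and string tables are already available in memory; the helper name symbol_for_address is hypothetical, and the Elf32_Sym layout is reproduced here so that the snippet does not depend on <elf.h>.

```c
#include <stddef.h>
#include <stdint.h>

/* ELF32 symbol table entry, laid out as in the ELF specification. */
typedef struct {
    uint32_t st_name;   /* offset of the name in the string table */
    uint32_t st_value;  /* symbol address */
    uint32_t st_size;   /* symbol size in bytes (may be 0) */
    uint8_t  st_info;   /* type in the low 4 bits, binding in the high 4 */
    uint8_t  st_other;
    uint16_t st_shndx;
} Elf32_Sym;

#define ELF32_ST_TYPE(info) ((info) & 0x0f)
#define STT_FUNC 2

/*
 * Hypothetical helper: return the name of the function containing `addr`,
 * or NULL if none matches. If `offset` is not NULL, it receives the
 * distance from the start of that function.
 */
const char *symbol_for_address(const Elf32_Sym *syms, size_t count,
                               const char *strtab, uint32_t addr,
                               uint32_t *offset)
{
    const Elf32_Sym *best = NULL;

    for (size_t i = 0; i < count; i++) {
        const Elf32_Sym *s = &syms[i];

        /* Only function symbols that start at or below addr qualify. */
        if (ELF32_ST_TYPE(s->st_info) != STT_FUNC || s->st_value > addr)
            continue;
        /* When a size is recorded, require addr to fall inside it. */
        if (s->st_size != 0 && addr - s->st_value >= s->st_size)
            continue;
        /* Keep the candidate that starts closest below addr. */
        if (best == NULL || s->st_value > best->st_value)
            best = s;
    }

    if (best == NULL)
        return NULL;
    if (offset != NULL)
        *offset = addr - best->st_value;
    return strtab + best->st_name;
}
```

A linear scan is enough for occasional stack traces; sorting the table by address once at boot would allow a binary search instead.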

In order to read these from the file, I would need the kernel file loaded and handed to me as a multiboot module, or as part of the ramdisk, or through a filesystem, but the ramdisk can be disabled at will (and I don't feel like making the ramdisk required for nice stack traces) and I have no working filesystem implementation. It feels a lot more natural to kindly ask GRUB to simply copy the symbol table into memory, and I believe it would happily do so, as long as the toolchain allowed me to mark it loadable.

Post by iansjack »

I would have thought that it would be far more productive to use an external debugger - such as gdb - rather than trying to build the debugging into the kernel in this way. That way you have access to everything you need. You will even be able to see the source code and single-step through it; also you can examine memory and set watches on memory locations to check when they are being altered. This is far more informative than unravelling stack traces.

At some stage you are going to want to be able to read files from disk. It might be more productive to work on that rather than providing detailed stack traces at this time.

Post by ThisMayWork »

iansjack wrote:I would have thought that it would be far more productive to use an external debugger - such as gdb - rather than trying to build the debugging into the kernel in this way. That way you have access to everything you need. You will even be able to see the source code and single-step through it; also you can examine memory and set watches on memory locations to check when they are being altered. This is far more informative than unravelling stack traces.
I already have working support for GDB through QEMU, and I also have limited support for serial, so I could write a GDB stub to get it to work outside QEMU, but this does not help when debugging crashes that are not easy to reproduce. By unwinding stack traces correctly and logging that info, I can perform (very basic, but somewhat helpful) post-mortem debugging.
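For reference, the kind of unwinding meant here can be sketched as a frame-pointer walk. This is a minimal sketch under stated assumptions: the kernel is built with frame pointers (for example with -fno-omit-frame-pointer), the boot code zeroes the frame pointer before calling into C so that the chain ends in NULL, and the names stack_frame and walk_stack are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Layout of a stack frame when code is built with frame pointers on x86:
 * the saved caller frame pointer is followed by the return address.
 */
struct stack_frame {
    const struct stack_frame *next; /* caller's frame, NULL at the bottom */
    uintptr_t return_address;
};

/*
 * Hypothetical helper: collect up to `max` return addresses starting from
 * `frame`. Returns the number of addresses stored in `out`.
 */
size_t walk_stack(const struct stack_frame *frame, uintptr_t *out, size_t max)
{
    size_t n = 0;

    while (frame != NULL && n < max) {
        out[n++] = frame->return_address;
        frame = frame->next;
    }
    return n;
}
```

In a kernel built with GCC, the starting frame could be obtained with __builtin_frame_address(0). The depth limit guards against a corrupted chain, and each collected address can then be resolved to a name through the symbol table.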
iansjack wrote:At some stage you are going to want to be able to read files from disk. It might be more productive to work on that rather than providing detailed stack traces at this time.
Properly designing the VFS, then implementing a driver for a filesystem, then carefully abstracting away the underlying medium, then implementing a driver for it seems like too much work just to get fancy traces, but I am definitely going to start working on it ASAP. I mainly started this thread believing that my issue was easy to work around, but since posting I have looked at more mature kernels to see how they get around it, and I have come to believe that most people do some custom post-processing on their symbols and pass them to the kernel either as a file or as a multiboot module.

It would be nice, however, if we managed to identify the cause, to report this issue to the Binutils upstream repo.
Post by Korona »

If you just want it for debugging, an option is to load the section in your bootloader (GRUB already supports it IIRC) and pass a pointer to it to your kernel.

Post by songziming »

The map file is for you to see what goes into each section; it doesn't need to be loaded.

You said GRUB only loads segments marked as loadable, which is not true. The symbol table and string table are also loaded by GRUB, and their addresses are saved in the multiboot info (see the Multiboot spec for more detail). But you have to back up symtab and strtab before setting up the page allocator, because otherwise you might overwrite that data.
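As a rough sketch of this approach: under Multiboot 1, when bit 5 of the info flags is set, the boot information carries the ELF section header table as the fields num, size, addr and shndx, and the sh_addr of each header refers to where that section was placed in memory. The helper below (find_symbol_sections is a hypothetical name, and the Elf32_Shdr layout is reproduced so the snippet stands alone) locates the symbol table and the string table it links to.

```c
#include <stddef.h>
#include <stdint.h>

/* ELF32 section header, laid out as in the ELF specification. */
typedef struct {
    uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset;
    uint32_t sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} Elf32_Shdr;

#define SHT_SYMTAB 2
#define SHT_STRTAB 3

/*
 * Hypothetical helper: scan `num` section headers of `entsize` bytes each,
 * starting at `table`, for the symbol table and the string table it links
 * to. Returns 0 on success, -1 if either one is missing.
 */
int find_symbol_sections(const void *table, uint32_t num, uint32_t entsize,
                         const Elf32_Shdr **symtab, const Elf32_Shdr **strtab)
{
    const uint8_t *base = table;

    if (entsize < sizeof(Elf32_Shdr))
        return -1;

    for (uint32_t i = 0; i < num; i++) {
        const Elf32_Shdr *sh =
            (const Elf32_Shdr *)(base + (size_t)i * entsize);

        if (sh->sh_type != SHT_SYMTAB)
            continue;
        /* sh_link of a SHT_SYMTAB section is the index of its string table. */
        if (sh->sh_link >= num)
            return -1;
        const Elf32_Shdr *str =
            (const Elf32_Shdr *)(base + (size_t)sh->sh_link * entsize);
        if (str->sh_type != SHT_STRTAB)
            return -1;

        *symtab = sh;
        *strtab = str;
        return 0;
    }
    return -1;
}
```

In the kernel, table, num and entsize would come from the addr, num and size fields of the multiboot info. The symbol count is symtab->sh_size divided by the size of one symbol entry (16 bytes for ELF32), and both sections should be copied or reserved before the physical memory allocator starts handing out pages.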