Linker-script writers beware: COMMON Symbols

Post by Darwish »

I was writing some memory management test code when I found that my kernel panic()s at different places once the size of any of the text, data, or bss sections exceeds something close to 20 kilobytes.

After a day of debugging, I found that the compiler does not put my uninitialized, globally-exported symbols in BSS; it marks them as "common symbols" instead. This led to disastrous effects because I use the following innocent-looking segment in the linker script:

SECTIONS {
          ...
          .bss : {
                  __bss_start = .;
                  *(EXCLUDE_FILE (*head.o *e820.o) .bss)
                  __bss_end = .;
          }
          __kernel_end = .;
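}

This meant that some of the kernel's uninitialized variables were not cleared by my clear_bss() method: they were placed outside the __bss_start to __bss_end range, so they kept whatever happened to be in memory, leading to unpredictable behavior.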
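To illustrate the failure mode, here is a minimal hosted sketch, not the actual kernel code. The names zero_range and count_uncleared are hypothetical, and the 16-byte array only stands in for kernel memory, under the assumption that the common symbols ended up just past __bss_end:

```c
#include <stddef.h>

/*
 * Zero every byte in [start, end). In a kernel, the arguments would be the
 * linker-script symbols, e.g.:
 *
 *     extern char __bss_start[], __bss_end[];
 *     zero_range(__bss_start, __bss_end);
 *
 * zero_range is a hypothetical helper name, not the original clear_bss().
 */
void zero_range(char *start, char *end)
{
    /* volatile stops the compiler from replacing the loop with a call to
       memset(), which a freestanding kernel may not provide. */
    for (volatile char *p = start; p < end; p++)
        *p = 0;
}

/*
 * Hosted simulation: image[0..7] plays the role of .bss, image[8..15] the
 * role of common symbols that landed after __bss_end (an assumption made
 * for illustration only).
 */
int count_uncleared(void)
{
    char image[16];
    for (size_t i = 0; i < sizeof image; i++)
        image[i] = 0x5A;              /* stale garbage left in RAM */

    char *bss_start = image;          /* stands in for __bss_start */
    char *bss_end   = image + 8;      /* stands in for __bss_end   */
    zero_range(bss_start, bss_end);

    int dirty = 0;
    for (size_t i = 0; i < sizeof image; i++)
        if (image[i] != 0)
            dirty++;
    return dirty;                     /* bytes the clear never touched */
}
```

Everything outside the two boundary symbols keeps its garbage, which is what happens to any variable the linker places outside them.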

The best explanation of "common symbols" I found was from Ulrich Drepper's paper (archive) on shared libraries, mainly the following part:
Common variables are widely used in fortran, but they got used in C and C++ as well to work around mistakes of programmers. Since in the early days people used to drop the 'extern' keyword from variable definitions, in the same way it is possible to drop it from function declaration, the compiler often had multiple definitions of the same variable in different files.

To help the poor and clueless programmer, the C/C++ compiler normally generates common variables for uninitialized definitions such as `int foo;' For common variables, there can be more than one definition, and they all get unified in one location in the output file .. their values does not need to be stored in the ELF file.
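As a rough illustration of the passage above, the following sketch shows the three kinds of definition side by side. The variable names are hypothetical; the letters in the comments are the symbol types that nm prints for an object file built with GCC, and the note about defaults reflects GCC switching to -fno-common in GCC 10:

```c
/* Tentative definition: external linkage, no initializer. With -fcommon
   (GCC's default before GCC 10) it becomes a common symbol, nm type 'C'.
   With -fno-common it is placed in .bss instead, nm type 'B'. */
int tentative_counter;

/* Internal linkage: never a common symbol; placed in .bss, nm type 'b'. */
static int private_counter;

/* Non-zero initializer: an ordinary definition in .data, nm type 'D'. */
int initialized_counter = 1;

/* Uses all three so that none of them is discarded as unused. */
int sum_counters(void)
{
    return tentative_counter + private_counter + initialized_counter;
}
```

Compiling this with `gcc -c -fcommon` and then with `gcc -c -fno-common`, and running nm on each object file, should show tentative_counter changing from 'C' to 'B'.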
There is also further information about those symbols in Ian Lance Taylor's article series on linkers here (archive), in the nm(1) manpage, and in ld(1) under LD's very useful `--warn-common' flag. You can also find some nice historical details in this binutils bug report (archive).

To make your BSS boundary symbols really cover your entire BSS space, you either need to use GCC's `-fno-common' option or modify the above script segment to:

SECTIONS {
          ...
          .bss : {
                  __bss_start = .;
                  *(EXCLUDE_FILE (*head.o *e820.o) .bss)
                  *(EXCLUDE_FILE (*head.o *e820.o) COMMON)
                  __bss_end = .;
          }
          __kernel_end = .;
}

I noticed that this isn't mentioned on the wiki. I'll wait for any corrections or comments, then add it there.
Thanks all :)
Last edited by Darwish on Sun Feb 07, 2010 4:49 pm, edited 2 times in total.
For latest news, please check my homepage and my blog.
—Darwish

Post by Owen »

Please don't use dark red for emphasis; that's what italic or bold are for, and some of us use white on dark themes.

Post by Darwish »

Owen wrote: Please don't use dark red for emphasis; that's what italic or bold are for, and some of us use white on dark themes.
No problem; fixed.