OSDev.org

The Place to Start for Operating System Developers
All times are UTC - 6 hours




 Post subject: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 10:16 am

Joined: Sat Dec 16, 2017 9:58 am
Posts: 4
While working on my build system I ran into a problem with ld. I used the following simple linker script, which only specifies a virtual address, but when I inspect the result with readelf, the virtual and load addresses for .rodata are different.

Code:
OUTPUT_FORMAT("elf32-i386")

ENTRY(_start)

SECTIONS
{
   . = 0x100000;

   .text : {
      *(multiboot)
      *(.text)
   }
   .data ALIGN(4096) : {
      *(.data)
   }
   .rodata ALIGN(4096) : {
      *(.rodata)
   }
   .bss ALIGN(4096) : {
      *(.bss)
   }
}


Code:
Elf file type is EXEC (Executable file)
Entry point 0x100090
There are 4 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x001000 0x00100000 0x00100000 0x0009e 0x0009e R E 0x1000
  LOAD           0x00109e 0x0010009e 0x00102ebe 0x0000d 0x0000d R   0x1000
  LOAD           0x002000 0x00101000 0x00103e20 0x00000 0x02000 RW  0x1000
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x10


The code I am using is just a minimal "Hello World!" kernel. When linking with gold or lld, the virtual and load addresses of .rodata are the same. I am not sure whether my linker script is at fault, the ld invocation is wrong, or ld itself is misbehaving, but I would assume it is my script.

I invoke ld like this:
Code:
ld -T linker.ld *.o -o kernel -melf_i386

(I do not know why ld is the only linker to ignore OUTPUT_FORMAT)

Thanks for any help.


 Post subject: Re: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 12:00 pm
Member

Joined: Fri Aug 26, 2016 1:41 pm
Posts: 671
It could be because you are not using a cross compiler. Maybe it is related to position-independent code. I'd recommend building an i686 cross compiler and using that to see if the result changes. It can depend on the distro you are using (different default compiler options, etc.). What command-line parameters do you use to compile the source code?

Does that layout cause problems for your code? If you post your project to GitHub (or similar) and tell us what distro/OS you are building on, we might be able to say more.


 Post subject: Re: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 2:20 pm

Joined: Sat Dec 16, 2017 9:58 am
Posts: 4
I uploaded the relevant parts of my project to https://github.com/Henje/LD-issue-minimal-example. I use clang as a cross-compiler, but I do not see how that is relevant to linking. Moreover, when using gold and lld, LMA and VMA are equal and my code works.

The ld I am using is the standard ld on my Ubuntu 17.10. It says about itself:

Code:
GNU ld (GNU Binutils for Ubuntu) 2.29.1
  Supported emulations:
   elf_x86_64
   elf32_x86_64
   elf_i386
   elf_iamcu
   i386linux
   elf_l1om
   elf_k1om
   i386pep
   i386pe


 Post subject: Re: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 2:24 pm
Member

Joined: Fri Aug 26, 2016 1:41 pm
Posts: 671
The relevance of the compiler to linking is that the compiler can emit information into the object files that alters how the linker places things in memory (alignment, for example). You'll also get differing results if your compiler happens to default to position-independent code, whereas other distros and cross compilers default to non-position-independent code (Ubuntu made this type of change around 16.04). Using a host compiler can make a difference in the output you see. Using a cross compiler can give you more consistent results for your builds in general.

Edit: It appears you are using clang (and not gcc). clang will cross compile. Problem is your original question didn't say what tools you were using and I assumed gcc incorrectly.


 Post subject: Re: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 6:09 pm

Joined: Sat Dec 16, 2017 9:58 am
Posts: 4
It is not that I could not use the other linkers; I am just curious why there is a difference in the first place. I see your point about position-independent code, but the ld manual does not even mention the term. From the linker's perspective, only sections and symbols are of interest. From the compiler's view, PIC just disallows absolute jumps and the like, and by link time those have all been generated. Then again, I am no expert on PIC, so I might well be wrong.

I tried linking with --nmagic, but the output did not change much. In case it helps, here is the output of objdump -x.

Code:
kernel4:     file format elf32-i386
kernel4
architecture: i386, flags 0x00000012:
EXEC_P, HAS_SYMS
start address 0x00100090

Program Header:
    LOAD off    0x000000a0 vaddr 0x00100000 paddr 0x00100000 align 2**4
         filesz 0x0000009e memsz 0x0000009e flags r-x
    LOAD off    0x0000013e vaddr 0x0010009e paddr 0x00102ebe align 2**0
         filesz 0x0000000d memsz 0x00002f62 flags rw-
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
         filesz 0x00000000 memsz 0x00000000 flags rwx

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000009e  00100000  00100000  000000a0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata.str1.1 0000000d  0010009e  00102ebe  0000013e  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .bss          00002000  00101000  00103e20  0000014b  2**0
                  ALLOC
  3 .comment      0000002d  00000000  00000000  0000014b  2**0
                  CONTENTS, READONLY
SYMBOL TABLE:
00100000 l    d  .text   00000000 .text
0010009e l    d  .rodata.str1.1   00000000 .rodata.str1.1
00101000 l    d  .bss   00000000 .bss
00000000 l    d  .comment   00000000 .comment
00000000 l    df *ABS*   00000000 start.o
0010009a l       .text   00000000 _stop
00103000 l       .bss   00000000 kernel_stack
00000000 l    df *ABS*   00000000 main.cpp
00000000 l    df *ABS*   00000000
001000a0 l     O .rodata.str1.1   00000000 _GLOBAL_OFFSET_TABLE_
00100090 g       .text   00000000 _start
00100070 g     F .text   0000001f init
00100010 g     F .text   0000005a _Z5printPKc


What boggles my mind is the load address which the linker calculates. I can see no relation to any code.


 Post subject: Re: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 10:47 pm
Member

Joined: Fri Aug 26, 2016 1:41 pm
Posts: 671
It appears that clang emits .rodata sections with suffixes on the name (for example .rodata.str1.1 in your objdump output), so the linker script should use *(.rodata*) instead. With the LD linker you should also consider aligning the Load Memory Address (the value to the right of the colon in a section definition) to 4K if you want the LMA and VMA to match up. If you set only the VMA (the value to the left of the colon in the section definition), the LMA remains untouched. If you set both the LMA and the VMA in a section definition, they are set separately. In your case you want to modify your linker.ld to look like this:
Code:
OUTPUT_FORMAT("elf32-i386")

ENTRY(_start)
SECTIONS
{
        . = 0x100000;
        .text : ALIGN(4096) {
                *(multiboot)
                *(.text)
        }
        .data : ALIGN(4096) {
                *(.data)
        }
        .rodata : ALIGN (4096) {
                *(.rodata*)
        }
        .bss : ALIGN (4096) {
                *(.bss)
        }
}
The linkers may create the program headers differently, so to see the individual sections with LD you may want to use objdump -x kernel to view the full headers. IMHO the output is more readable than readelf's.

Also modify your linker command line to add -nostartfiles and -nostdlib, since we have neither C runtime initialization nor standard library support. The command could look like this:
Code:
ld -Tlinker.ld -nostartfiles -nostdlib *.o -o kernel -melf_i386
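
To check whether a change to the linker script had the intended effect, the section rows can be filtered for entries whose VMA and LMA differ. This is only a rough sketch, assuming a POSIX shell with awk and the column layout of `objdump -h` as shown earlier in this thread; `lma_mismatches` is a made-up helper name, and the demo input is the three section rows from that earlier output.

```shell
#!/bin/sh
# Hypothetical helper: reads `objdump -h`-style section rows on stdin and
# prints the name, VMA and LMA of every section whose two addresses differ.
# Columns assumed: Idx Name Size VMA LMA File-off Algn.
lma_mismatches() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 7 && $4 != $5 { print $2, $4, $5 }'
}

# Demo input: three rows copied from the objdump output posted in this thread.
# For a real check, pipe in the tool output: objdump -h kernel | lma_mismatches
lma_mismatches <<'EOF'
  0 .text         0000009e  00100000  00100000  000000a0  2**4
  1 .rodata.str1.1 0000000d  0010009e  00102ebe  0000013e  2**0
  2 .bss          00002000  00101000  00103e20  0000014b  2**0
EOF
```

With the demo rows this lists .rodata.str1.1 and .bss and skips .text. An empty result for the relinked kernel would mean every section has matching addresses.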


Be aware that if you are going to use C++, you will need to extend the linker script to handle static constructors and destructors. Your assembly code would have to loop through that data and call all the static constructors. If you ever put class objects at global scope, for example, these constructors have to be called for the objects to be initialized. Normally the startup files do that, but since we are in a freestanding environment, it is up to us to do it ourselves. I believe there is a forum post or OSDev wiki page discussing this.


 Post subject: Re: LD writes different LMA/VMA
Posted: Tue Dec 19, 2017 9:46 am

Joined: Sat Dec 16, 2017 9:58 am
Posts: 4
Thanks for your help. Setting the LMA instead of the VMA produced the correct output with all linkers I tested. As for the explanation, I am not so sure, because the LD manual states "LMA is set so the difference between the VMA and LMA is the same as the difference between the VMA and LMA of the last section". There are other options, but they boil down to "LMA is set to its VMA". This behaviour is especially weird because I tested a different LD (2.23.1), which had no problem.

Also, good catch with the .rodata wildcard; my original pattern kind of worked, but was not what I intended.

As for the C++ part, thanks for the heads up, but in my real project I got that handled. This was just a test for a different build system and had therefore no ctor, dtor stuff.
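
The rule quoted from the LD manual can be checked against the objdump -x section table posted earlier in the thread. The following is only a sketch, assuming a POSIX shell; `lma_deltas` is a made-up helper name, and the three rows (name, VMA, LMA) are copied from that output.

```shell
#!/bin/sh
# Section rows (name, VMA, LMA) copied from the objdump -x output in this thread.
sections='
.text 00100000 00100000
.rodata.str1.1 0010009e 00102ebe
.bss 00101000 00103e20
'

# Hypothetical helper: print "name delta" per row, where delta = LMA - VMA in hex.
lma_deltas() {
    printf '%s\n' "$sections" | while read -r name vma lma; do
        [ -n "$name" ] || continue          # skip the blank padding lines
        printf '%s 0x%x\n' "$name" "$((0x$lma - 0x$vma))"
    done
}

lma_deltas
```

This prints a delta of 0x0 for .text and 0x2e20 for both .rodata.str1.1 and .bss. So .bss keeps the same VMA-to-LMA difference as the section before it, which matches the quoted rule. It does not by itself explain where the initial 0x2e20 offset for .rodata.str1.1 comes from.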

