LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 10:16 am
by henje
While working on my build system I encountered a problem when using ld. I used the following simple linker script, which only specifies a virtual address, but when I inspect the result with readelf, the virtual and load addresses for .rodata are different.
Code: Select all
OUTPUT_FORMAT("elf32-i386")
ENTRY(_start)
SECTIONS
{
. = 0x100000;
.text : {
*(multiboot)
*(.text)
}
.data ALIGN(4096) : {
*(.data)
}
.rodata ALIGN(4096) : {
*(.rodata)
}
.bss ALIGN(4096) : {
*(.bss)
}
}
Code: Select all
Elf file type is EXEC (Executable file)
Entry point 0x100090
There are 4 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x001000 0x00100000 0x00100000 0x0009e 0x0009e R E 0x1000
LOAD 0x00109e 0x0010009e 0x00102ebe 0x0000d 0x0000d R 0x1000
LOAD 0x002000 0x00101000 0x00103e20 0x00000 0x02000 RW 0x1000
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x10
The code I am using is just a minimal "Hello World!" kernel. When linking with gold or lld, the virtual and load addresses of .rodata are the same. I am not sure whether my linker script is at fault, the ld invocation, or whether ld is just misbehaving. But I would assume it is my script.
I invoke ld like this:
Code: Select all
ld -T linker.ld *.o -o kernel -melf_i386
(I do not know why ld is the only linker to ignore OUTPUT_FORMAT)
Thanks for any help.
Re: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 12:00 pm
by MichaelPetch
It could be because you are not using a cross compiler. Maybe it is related to position-independent code. I'd recommend building an i686 cross compiler and using that to see if the result changes. It can depend on the distro you are using (different default compiler options etc.). What command-line parameters do you use to compile the source code?
Does that layout cause problems for your code? If you posted your project to github (or similar) and told us what distro/OS you are using to build on we might be able to say.
Re: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 2:20 pm
by henje
I uploaded the relevant parts of my project to https://github.com/Henje/LD-issue-minimal-example. I use clang as a cross-compiler, but I do not see how that is relevant to linking. Moreover, when using gold and lld, LMA and VMA are equal and my code works.
The ld I am using is the standard ld on my Ubuntu 17.10. It says about itself:
Code: Select all
GNU ld (GNU Binutils for Ubuntu) 2.29.1
Supported emulations:
elf_x86_64
elf32_x86_64
elf_i386
elf_iamcu
i386linux
elf_l1om
elf_k1om
i386pep
i386pe
Re: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 2:24 pm
by MichaelPetch
The relevance of the compiler to linking is that the compiler can emit information into the object files that alters how the linker places things in memory (alignment, for example). You'll also get differing results if your compiler happens to default to position-independent code, whereas other distros and cross compilers default to non-position-independent code. (Ubuntu made this type of change around 16.04.) Using a host compiler can make a difference in the output you see. Using a cross compiler can give you more consistent results for your builds in general.
Edit: It appears you are using clang (and not gcc). clang will cross compile. Problem is your original question didn't say what tools you were using and I assumed gcc incorrectly.
Re: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 6:09 pm
by henje
It is not like I could not use the other linkers, I am just curious as to why there is a difference in the first place. I see your point about position-independent code, but the ld manual does not even feature the term. From the linker's perspective only sections and symbols are of interest. From a compiler's view, PIC just disallows absolute jumps and the like. At link time those have all been generated already. Then again, I am no expert at PIC, so I might well be wrong.
I tried linking with --nmagic, but the output did not change much. If it helps I attached the output of objdump -x.
Code: Select all
kernel4: file format elf32-i386
kernel4
architecture: i386, flags 0x00000012:
EXEC_P, HAS_SYMS
start address 0x00100090
Program Header:
LOAD off 0x000000a0 vaddr 0x00100000 paddr 0x00100000 align 2**4
filesz 0x0000009e memsz 0x0000009e flags r-x
LOAD off 0x0000013e vaddr 0x0010009e paddr 0x00102ebe align 2**0
filesz 0x0000000d memsz 0x00002f62 flags rw-
STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
filesz 0x00000000 memsz 0x00000000 flags rwx
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000009e 00100000 00100000 000000a0 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .rodata.str1.1 0000000d 0010009e 00102ebe 0000013e 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .bss 00002000 00101000 00103e20 0000014b 2**0
ALLOC
3 .comment 0000002d 00000000 00000000 0000014b 2**0
CONTENTS, READONLY
SYMBOL TABLE:
00100000 l d .text 00000000 .text
0010009e l d .rodata.str1.1 00000000 .rodata.str1.1
00101000 l d .bss 00000000 .bss
00000000 l d .comment 00000000 .comment
00000000 l df *ABS* 00000000 start.o
0010009a l .text 00000000 _stop
00103000 l .bss 00000000 kernel_stack
00000000 l df *ABS* 00000000 main.cpp
00000000 l df *ABS* 00000000
001000a0 l O .rodata.str1.1 00000000 _GLOBAL_OFFSET_TABLE_
00100090 g .text 00000000 _start
00100070 g F .text 0000001f init
00100010 g F .text 0000005a _Z5printPKc
What boggles my mind is the load address which the linker calculates. I can see no relation to any code.
Re: LD writes different LMA/VMA
Posted: Sat Dec 16, 2017 10:47 pm
by MichaelPetch
It appears that clang emits .rodata sections whose names may have trailing suffixes (such as the .rodata.str1.1 in your objdump output). In the linker script you should use *(.rodata*) instead. With the LD linker you should consider aligning the Load Memory Address (to the right of the colon in a section definition) to 4K if you want the LMA and VMA to match up. If you set the VMA (the value to the left of the colon in the section definition), the LMA remains untouched. If you set both LMA and VMA in a section definition, they are set separately. In your case you want to modify your linker.ld to look like:
Code: Select all
OUTPUT_FORMAT("elf32-i386")
ENTRY(_start)
SECTIONS
{
. = 0x100000;
.text : ALIGN(4096) {
*(multiboot)
*(.text)
}
.data : ALIGN(4096) {
*(.data)
}
.rodata : ALIGN (4096) {
*(.rodata*)
}
.bss : ALIGN (4096) {
*(.bss)
}
}
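As a side note on what ALIGN(4096) computes: the ld manual describes ALIGN as rounding an address up to the next multiple of the alignment. The following is a minimal C++ sketch of that arithmetic, not ld's implementation; the function name align_up is made up for illustration, and the input values are taken from the objdump listing earlier in the thread.

```cpp
#include <cstdint>

// Round addr up to the next multiple of align.
// Assumption: align is a power of two (4096 in the scripts above).
constexpr std::uint32_t align_up(std::uint32_t addr, std::uint32_t align) {
    return (addr + align - 1) & ~(align - 1);
}

// .rodata.str1.1 ends at VMA 0x0010009e + size 0xd = 0x001000ab in the
// listing; rounding that up to 4 KiB gives the .bss VMA shown there.
static_assert(align_up(0x0010009e + 0xd, 4096) == 0x00101000,
              "next 4 KiB boundary after .rodata.str1.1");

// An address that is already aligned is left unchanged.
static_assert(align_up(0x00100000, 4096) == 0x00100000,
              "aligned input is unchanged");
```

The checks run at compile time, so building the file is enough to confirm the arithmetic.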
The linkers may create the program headers (PHDRS) differently, so to see the individual sections with LD you may want to use objdump -x kernel to view the full headers. The output is more readable than readelf's, IMHO. Modify your linker line to add -nostartfiles and -nostdlib: we have neither C runtime initialization nor standard library support. The command could look like this:
Code: Select all
ld -Tlinker.ld -nostartfiles -nostdlib *.o -o kernel -melf_i386
Be aware that if you are going to use C++ you will need to enhance the linker script to deal with static constructors and destructors. Your assembly code would have to loop through that data and call all the static constructors. If you ever put class objects at global scope, for example, these constructors have to be called for the objects to be initialized. Normally the startup files do that, but since we are in a freestanding environment it is up to us to do it ourselves. I believe there is a forum post or OSDev wiki page discussing this.
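The constructor loop described above could look roughly like the following. This is a hosted sketch for illustration only, not code from the project: the names ctor_fn, call_constructors, demo_state, demo_ctor_a and demo_ctor_b are all hypothetical, and the two demo functions stand in for the entries the compiler would normally generate.

```cpp
#include <cassert>
#include <cstddef>

// One entry in the constructor table: a function taking no arguments.
using ctor_fn = void (*)();

// Call every constructor in [begin, end) in order and return how many ran.
// In a kernel, begin and end would be symbols defined by the linker script
// around the constructor section (an assumption, not shown in this thread).
std::size_t call_constructors(const ctor_fn* begin, const ctor_fn* end) {
    assert(begin <= end);  // sanity check, only available in a hosted build
    std::size_t count = 0;
    for (const ctor_fn* p = begin; p != end; ++p) {
        (*p)();
        ++count;
    }
    return count;
}

// Hypothetical stand-ins for compiler-generated static constructors.
static int demo_state = 0;
static void demo_ctor_a() { demo_state += 1; }
static void demo_ctor_b() { demo_state += 2; }
```

In a freestanding kernel the same loop would be called once from the startup code, before any global object is used, with the bounds exported by the linker script.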
Re: LD writes different LMA/VMA
Posted: Tue Dec 19, 2017 9:46 am
by henje
Thanks for your help. Setting the LMA instead of the VMA gave the right result with all linkers I tested. As for the explanation, I am not so sure, because the LD manual states "LMA is set so the difference between the VMA and LMA is the same as the difference between the VMA and LMA of the last section" (from here). There are other options, but they boil down to "LMA is set to its VMA". This behaviour is especially weird because I tested a different LD (2.23.1), which had no problem.
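For reference, the quoted rule can be checked against the objdump -x listing posted earlier. The sketch below only does the subtraction on the VMA and LMA values from that listing; the struct and function names are made up for illustration, and it does not explain where the offset itself comes from.

```cpp
#include <cstdint>

// VMA and LMA of one output section, copied from the objdump -x listing.
struct Section {
    std::uint32_t vma;
    std::uint32_t lma;
};

// Difference between load address and virtual address of a section.
constexpr std::uint32_t lma_offset(Section s) { return s.lma - s.vma; }

constexpr Section sec_text{0x00100000, 0x00100000};    // .text
constexpr Section sec_rodata{0x0010009e, 0x00102ebe};  // .rodata.str1.1
constexpr Section sec_bss{0x00101000, 0x00103e20};     // .bss

// .text is loaded where it runs.
static_assert(lma_offset(sec_text) == 0, ".text has LMA == VMA");
// .rodata.str1.1 is shifted by 0x2e20 ...
static_assert(lma_offset(sec_rodata) == 0x2e20, ".rodata.str1.1 offset");
// ... and .bss carries the same offset as the section before it.
static_assert(lma_offset(sec_bss) == lma_offset(sec_rodata),
              ".bss inherits the previous section's offset");
```

So in that listing .bss has the same VMA-to-LMA difference as the section before it, which is at least consistent with the quoted sentence, while the 0x2e20 offset of .rodata.str1.1 remains unexplained.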
Also, good catch with the .rodata wildcard pattern. The old one kind of worked, but it was not what I intended.
As for the C++ part, thanks for the heads up, but in my real project I got that handled. This was just a test for a different build system and had therefore no ctor, dtor stuff.