XenOS wrote: First of all, your KERNEL_VMA should rather be 0xC0000000, not 0x100000. Further, to have your sections at a different virtual and physical location, you will need to give the physical address using the AT directive. See this linker script for an example:
http://wiki.osdev.org/Higher_Half_x86_B ... #linker.ld
I don't want to do it this way; I want my kernel to be position independent.
triple fault when use data from bss
-
- Member
- Posts: 51
- Joined: Sun Mar 01, 2015 7:58 am
Re: triple fault when use data from bss
TheLittleWho wrote: So, bss is mapped virtually; my problem is that when I access a global variable from bss, it is accessed at its physical address instead of its virtual address...
simeonz wrote: Apparently, you have to pass the "-pic" or "-pie" option to the final ld. I am confused by this, actually.
The linker can bypass the GOT when the output is ET_EXEC. The original relocation, which is R_386_GOT32, is semantically converted to R_386_GOTOFF. Then it is preapplied and elided, which is possible because the relative virtual layout is fixed in ET_EXEC and the offset of any variable from the GOT is constant. Additionally, the mov opcode in the relocated instruction has to be converted to a lea opcode at link time. This is where the lea instruction in the assembly comes from. You can check the intermediate object file and see that it is still a mov instruction before the optimization.
The problem in your case is that the optimization is used only for .data, whilst for .bss the GOT entries are computed by the linker in terms of the fixed virtual addresses, thus causing the .bss variables to use the GOT, but with fixed entries (and no corresponding R_386_GLOB_DAT relocations as with "-shared"). As a last resort you could have used "-q" to keep the relocations and processed them before switching paging on. Anyway, passing "-pic" to ld suddenly triggers GOT elision for both .data and .bss.
I also couldn't reproduce the relative addressing technique from your code. That is, with pic, normally the get_pc_thunk function is used to retrieve the current eip, because the return address is not a fixed quantity. I tried calling a static function and expected some kind of call graph analysis to show that it is called from one place, but the compiler still uses get_pc_thunk.
It's working. Thank you very much.
Re: triple fault when use data from bss
Unfortunately, I found a new problem. I have this code:
Code:
int x, y;

void print(void *addr)
{
    kprintf("%p\n", addr);
}

void f1()
{
    x = 0;
    print(&x);
}

void f2()
{
    print(&y);
}

void kernel_main()
{
    f1();
    f2();
}
For the y variable it prints the physical address, but for the x variable it prints the virtual address.
Note that paging is enabled, the kernel is mapped at 0xC0000000, and my kernel should be position independent, but in this case something is wrong. (The context is the same as before.)
Re: triple fault when use data from bss
Couldn't reproduce the problem. I confess, I modified the snippet ever so slightly, like this:
Code:
int x, y;

void print(void *addr)
{
    volatile void *out;
    out = addr;
}

void f1()
{
    x = 0;
    print(&x);
}

void f2()
{
    print(&y);
}

void _start()
{
    f1();
    f2();
}
I built it like this ("-Wl,-q" is irrelevant):
Code:
gcc -m32 -std=gnu11 -ffreestanding -O2 -Wall -Wextra -fpic -c test.c -o test.o
gcc -m32 -O2 -nostdlib -z max-page-size=0x1000 test.o -Wl,-pic -Wl,-q -o test
And that was the resulting assembly:
Code:
00000220 <print>:
220: f3 c3 repz ret
222: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi
229: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi

00000230 <f1>:
230: 53 push %ebx
231: e8 78 00 00 00 call 2ae <__x86.get_pc_thunk.bx>
236: 81 c3 ca 1d 00 00 add $0x1dca,%ebx
23c: 83 ec 18 sub $0x18,%esp
23f: 8d 83 0c 00 00 00 lea 0xc(%ebx),%eax
245: c7 00 00 00 00 00 movl $0x0,(%eax)
24b: 89 04 24 mov %eax,(%esp)
24e: e8 cd ff ff ff call 220 <print>
253: 83 c4 18 add $0x18,%esp
256: 5b pop %ebx
257: c3 ret
258: 90 nop
259: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi

00000260 <f2>:
260: 53 push %ebx
261: e8 48 00 00 00 call 2ae <__x86.get_pc_thunk.bx>
266: 81 c3 9a 1d 00 00 add $0x1d9a,%ebx
26c: 83 ec 18 sub $0x18,%esp
26f: 8d 83 10 00 00 00 lea 0x10(%ebx),%eax
275: 89 04 24 mov %eax,(%esp)
278: e8 a3 ff ff ff call 220 <print>
27d: 83 c4 18 add $0x18,%esp
280: 5b pop %ebx
281: c3 ret
282: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi
289: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi

00000290 <_start>:
290: 53 push %ebx
291: e8 18 00 00 00 call 2ae <__x86.get_pc_thunk.bx>
296: 81 c3 6a 1d 00 00 add $0x1d6a,%ebx
29c: 83 ec 08 sub $0x8,%esp
29f: e8 8c ff ff ff call 230 <f1>
2a4: e8 b7 ff ff ff call 260 <f2>
2a9: 83 c4 08 add $0x8,%esp
2ac: 5b pop %ebx
2ad: c3 ret

000002ae <__x86.get_pc_thunk.bx>:
2ae: 8b 1c 24 mov (%esp),%ebx
2b1: c3 ret
Note the accesses at code addresses 23f and 26f. They use lea. This basically tells you that the access is completely eip-relative and does not use the GOT. So, everything is fine here.
Could you compile the above code with the above gcc commands on your system and check if the accesses use lea on your end? If they do, then please paste the exact command line from your original build (that you use to build your version of the code) and the resulting f1 and f2 disassembly (from your version of the code), to check the results and options side by side. If the gcc options match, but the disassembly differs, then there are differences in gcc that are impacting the result.
Another way to check is to use "objdump -r" (which is possible due to the "-Wl,-q", that you should not otherwise use in your builds). If the result contains R_386_GOT32, then you are toast. If the accesses to x and y use R_386_GOTOFF relocations and there are no R_386_GOT32 relocations, then you should be ok.
My gcc version is 4.8.5, binutils version is 2.23 (.52.0.1). This is on a CentOS vm.
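To make the difference between the two access forms concrete, here is a small host-side model in C. It is only a sketch, not kernel code and not taken from anyone's build. The GOT and variable addresses are derived from the i686-elf listing posted later in this thread (0x100106 + 0x6172 for the GOT, plus 0x7b08 for the variable reached through lea), and the two bases are the 0x00100000 and 0xC0000000 mentioned in the thread. All function and macro names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bases mentioned in the thread: linked/loaded at 1 MiB, mapped at 3 GiB. */
#define LINK_BASE     0x00100000u
#define RUN_BASE      0xC0000000u

/* Link-time addresses derived from the posted i686-elf disassembly. */
#define GOT_LINK_ADDR 0x00106278u /* 0x100106 + 0x6172 */
#define VAR_LINK_ADDR 0x0010dd80u /* GOT_LINK_ADDR + 0x7b08 */

/* Models what get_pc_thunk plus the following add leave in %ebx:
 * the address of the GOT as seen from wherever the code is running. */
uint32_t runtime_got(uint32_t run_base)
{
    return GOT_LINK_ADDR - LINK_BASE + run_base;
}

/* Models "lea off(%ebx),%eax": the offset from the GOT is a link-time
 * constant, so the result moves together with the image. */
uint32_t addr_via_gotoff(uint32_t ebx)
{
    return ebx + (VAR_LINK_ADDR - GOT_LINK_ADDR);
}

/* Models "mov off(%ebx),%eax" or "pushl off(%ebx)": the value comes from a
 * GOT slot that the linker filled with a link-time address. If nobody
 * relocates the slot, the result does not depend on where the code runs. */
uint32_t addr_via_got_slot(uint32_t unrelocated_slot)
{
    return unrelocated_slot;
}

/* Worked check of the model with the numbers above. */
void got_model_check(void)
{
    /* GOT-relative access follows the mapping at 0xC0000000. */
    assert(addr_via_gotoff(runtime_got(RUN_BASE)) == 0xC000dd80u);
    /* An unrelocated GOT slot still yields the link-time address. */
    assert(addr_via_got_slot(VAR_LINK_ADDR) == 0x0010dd80u);
}
```

Under these assumptions the lea form yields an address in the 0xC0000000 mapping while the GOT-slot form yields the link-time address, which would look like "virtual for one variable, physical for the other" when printed.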
Re: triple fault when use data from bss
This is the code:
Code:
001000d0 <print>:
1000d0: 53 push %ebx
1000d1: e8 e2 48 00 00 call 1049b8 <linker_textEnd>
1000d6: 81 c3 a2 61 00 00 add $0x61a2,%ebx
1000dc: 83 ec 10 sub $0x10,%esp
1000df: 8d 83 ca f5 ff ff lea -0xa36(%ebx),%eax
1000e5: ff 74 24 18 pushl 0x18(%esp)
1000e9: 50 push %eax
1000ea: e8 21 44 00 00 call 104510 <kprintf>
1000ef: 83 c4 18 add $0x18,%esp
1000f2: 5b pop %ebx
1000f3: c3 ret
1000f4: 8d b6 00 00 00 00 lea 0x0(%esi),%esi
1000fa: 8d bf 00 00 00 00 lea 0x0(%edi),%edi

00100100 <f1>:
100100: 53 push %ebx
100101: e8 b2 48 00 00 call 1049b8 <linker_textEnd>
100106: 81 c3 72 61 00 00 add $0x6172,%ebx
10010c: 83 ec 14 sub $0x14,%esp
10010f: ff b3 e4 ff ff ff pushl -0x1c(%ebx)
100115: e8 b6 ff ff ff call 1000d0 <print>
10011a: 83 c4 18 add $0x18,%esp
10011d: 5b pop %ebx
10011e: c3 ret
10011f: 90 nop

00100120 <f2>:
100120: 53 push %ebx
100121: e8 92 48 00 00 call 1049b8 <linker_textEnd>
100126: 81 c3 52 61 00 00 add $0x6152,%ebx
10012c: 83 ec 14 sub $0x14,%esp
10012f: 8d 83 08 7b 00 00 lea 0x7b08(%ebx),%eax
100135: c7 00 00 00 00 00 movl $0x0,(%eax)
10013b: 50 push %eax
10013c: e8 8f ff ff ff call 1000d0 <print>
100141: 83 c4 18 add $0x18,%esp
100144: 5b pop %ebx
100145: c3 ret
100146: 8d 76 00 lea 0x0(%esi),%esi
100149: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi

00100150 <kernel_main>:
100150: 53 push %ebx
100151: e8 62 48 00 00 call 1049b8 <linker_textEnd>
100156: 81 c3 22 61 00 00 add $0x6122,%ebx
10015c: 83 ec 08 sub $0x8,%esp
10015f: e8 9c ff ff ff call 100100 <f1>
100164: e8 b7 ff ff ff call 100120 <f2>
100169: 83 c4 08 add $0x8,%esp
10016c: 5b pop %ebx
10016d: c3 ret
10016e: 66 90 xchg %ax,%ax
gcc flags: -std=gnu11 -ffreestanding -O2 -Wall -Wextra -fpic -m32
ld flags: -O2 -nostdlib -z max-page-size=0x1000 -pic
I'm using gcc and ld from here (http://wiki.osdev.org/GCC_Cross-Compiler): i686-elf-gcc and i686-elf-ld
I can't use -Wl or -m32 for ld...
Re: triple fault when use data from bss
All right. I built gcc and binutils as described in the wiki, just to be sure that we are on the same page, although it is just a more bare bones install (that lacks certain libraries) with particular target architecture enabled.
The really important difference is the versions: I used gcc 7.1.0 and binutils 2.29. Apparently changes were made. The behavior is as it was always advertised to be, but some compiler and linker options have a stricter effect. The "-fpic" and "-pic" options generate PIC code suitable for shared libraries only, whereas "-fpie" and "-pie" are for executables. It didn't use to matter so much on the older versions, but it makes sense, since shared libraries have relocated GOT entries, whilst executables do not, and the compiler has to generate appropriate code.
Anyway. You need to change "-fpic" to "-fpie" and "-pic" to "-pie".
I also wanted to mention that, as an alternative, the Linux kernel decompressor relocates itself by adding the displacement offset to the GOT entries. That is, without full processing of relocation information, this quick and dirty approach is supposed to fix the GOT by just adding a constant to every word there. The proper options above should eliminate the need to do that, as well as GOT usage in general, but if someday you get in a real fix you may try that technique.
Re: triple fault when use data from bss
Unfortunately, that is not working... but if I don't use the -O2 flag, it works. I will update gcc and binutils and hope that everything will be OK.
Re: triple fault when use data from bss
Sorry, don't bother. I have to admit that I temporarily disabled the optimizations to avoid inlining and forgot to enable them after the problem got fixed (because it manifested without optimizations as well.) You may have built another gcc for nothing.
Looking at the -O2 output (after using noinline, etc.), I can see that the executable is generated correctly, but expects dynamic relocation processing, which you don't have. In particular, there is one dynamic relocation pointing into the GOT. It should be theoretically possible to generate position independent code without fixups, but I don't know how to coerce the compiler.
I will dig some more, but if no one else posts otherwise, we can assume that you are supposed to either fully process the relocations or try the GOT trick that the Linux kernel decompressor uses. In both cases, I can see how this may not be what you originally aimed for with PIC.
Here is what I've got with O2:
Code:
# /usr/local/i686-elf/bin/i686-elf-objdump -R test-i686

test-i686:     file format elf32-i386

DYNAMIC RELOCATION RECORDS
OFFSET   TYPE              VALUE
00001280 R_386_RELATIVE    *ABS*

# /usr/local/i686-elf/bin/i686-elf-objdump -h test-i686

test-i686:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .interp       00000013  000000d4  000000d4  000000d4  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .hash         0000000c  000000e8  000000e8  000000e8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .dynsym       00000010  000000f4  000000f4  000000f4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .dynstr       00000001  00000104  00000104  00000104  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .rel.dyn      00000008  00000108  00000108  00000108  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .text         0000006b  00000110  00000110  00000110  2**4
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  6 .eh_frame     00000084  0000017c  0000017c  0000017c  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
  7 .dynamic      00000080  00001200  00001200  00000200  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  8 .got          00000004  00001280  00001280  00000280  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  9 .got.plt      0000000c  00001284  00001284  00000284  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 10 .data         00000000  00001290  00001290  00000290  2**0
                  CONTENTS, ALLOC, LOAD, DATA
 11 .bss          0000000c  00001290  00001290  00000290  2**2
                  ALLOC
 12 .comment      00000011  00000000  00000000  00000290  2**0
                  CONTENTS, READONLY
P.S.: If you are wondering what the goal of PIC on Linux is, it is to allow relocation fixup entirely outside the code segment. This enables sharing the code pages between multiple processes. However, apparently creating relocation-free code is either not a priority, or is produced with another set of options that I am not aware of.
Re: triple fault when use data from bss
I couldn't find any option for generating position independent code without relocations on i386 (or amd64). There is one for Motorola 68000 (-mpcrel), but none for x86. Generating position independent executable results in a very similar output to a shared library. The elf type is ET_DYN. So, the OP is stuck with either applying offset to the GOT entries or processing the relocations. Unless someone has a different idea.
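For the "processing the relocations" option, a minimal sketch in C of what applying R_386_RELATIVE entries could look like is shown here. It assumes the only dynamic relocations present are R_386_RELATIVE (as in the objdump -R output in this thread) and that they are stored as REL entries, which matches the .rel.dyn section shown there. The struct mirrors the layout of Elf32_Rel but is declared locally, and the names rel32 and apply_relative_relocs are hypothetical. How a kernel locates the table (for example through linker-script symbols) is left out, and a freestanding kernel would have to supply its own memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define R_386_RELATIVE 8u /* relocation number from the i386 psABI */

/* Same layout as Elf32_Rel; declared here so the sketch needs no <elf.h>. */
struct rel32 {
    uint32_t r_offset; /* link-time address of the word to patch */
    uint32_t r_info;   /* (symbol index << 8) | relocation type */
};

/*
 * image     : pointer through which the image bytes can be written right now
 * link_base : address the image was linked at
 * delta     : run-time base minus link-time base
 *
 * Returns 0 on success, or -1 as soon as an entry of another type is seen
 * (entries before it have already been patched at that point).
 */
int apply_relative_relocs(uint8_t *image, uint32_t link_base, uint32_t delta,
                          const struct rel32 *rel, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t word;
        uint8_t *where;

        if ((rel[i].r_info & 0xffu) != R_386_RELATIVE)
            return -1;
        assert(rel[i].r_offset >= link_base);

        /* REL entries keep the addend in the patched word itself. */
        where = image + (rel[i].r_offset - link_base);
        memcpy(&word, where, sizeof word);
        word += delta;
        memcpy(where, &word, sizeof word);
    }
    return 0;
}
```

Rejecting unknown relocation types instead of skipping them is a deliberate choice in this sketch: silently ignoring one would leave a stale address behind.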
Re: triple fault when use data from bss
Do you have any idea how I can do that?
Re: triple fault when use data from bss
For the easier and dirtier approach, the dynamic relocations are assumed to all be R_386_RELATIVE and to point into the GOT. This condition can probably be maintained, but it would be good to have something in your build script that verifies it (using grep and such).
My understanding is that the original GOT entries will contain the value computed with respect to the base address in the linker script (which is 0 by default for PIE, but not in your script IIRC). R_386_RELATIVE means that the displacement of the new base address from the old base address must be added to each one of them (assuming there is an R_386_RELATIVE for each one). The Linux kernel decompressor defines two symbols, _got and _egot, and iterates word-wise between them, adding this offset. A rather trivial affair.
Here is the actual code.
Edit: I was not very clear. To check whether all relocations are R_386_RELATIVE, you can run something like the following and compare the result in your build script:
Code:
objdump -R kernel | grep " R_" | cut -d' ' -f2 | sort -u
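The GOT walk described above (adding the displacement to every word between two linker-defined symbols) could look roughly like this in C. This is a sketch under assumptions, not the Linux decompressor's code: the function name fixup_got is made up, and in a kernel the two pointers would come from linker-script symbols such as the _got and _egot mentioned in the post. It also assumes that every GOT word holds a link-time address that needs the same displacement, which is what the check above is meant to confirm.

```c
#include <assert.h>
#include <stdint.h>

/* Add delta (run-time base minus link-time base) to every 32-bit word
 * in the half-open range [got, egot). */
void fixup_got(uint32_t *got, uint32_t *egot, uint32_t delta)
{
    assert(got <= egot);
    for (uint32_t *slot = got; slot < egot; slot++)
        *slot += delta;
}
```

In a real kernel this would presumably have to run before any code that reads the GOT, and the routine itself should be checked in the disassembly to make sure it does not go through the GOT.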
Re: triple fault when use data from bss
I managed to solve the problem. Thank you very much!
Re: triple fault when use data from bss
It's usually a good idea to give some description of what the solution was, so the next guy also benefits =)
Re: triple fault when use data from bss
For each GOT entry, I added an offset of 0xBFF00000, because my kernel is loaded at physical address 0x00100000 but mapped at virtual address 0xC0000000.
Commit: https://github.com/EnachescuAlin/PhiOS/ ... 4655e93455
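As a worked check of where 0xBFF00000 comes from, the arithmetic can be written out in C. The two base addresses are the ones stated in the post; the macro and function names are hypothetical and are not taken from the linked commit.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_PHYS_BASE 0x00100000u /* where the kernel is loaded */
#define KERNEL_VIRT_BASE 0xC0000000u /* where the kernel is mapped */

/* Displacement added to each GOT entry: 0xC0000000 - 0x00100000. */
#define KERNEL_GOT_DELTA (KERNEL_VIRT_BASE - KERNEL_PHYS_BASE)

/* Translate a link-time (load) address into its mapped virtual address. */
uint32_t link_to_virt(uint32_t link_addr)
{
    assert(link_addr >= KERNEL_PHYS_BASE);
    return link_addr + KERNEL_GOT_DELTA;
}
```

For example, a variable linked at 0x0010dd80 would end up at 0xC000dd80 after the displacement is applied.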