triple fault when use data from bss

TheLittleWho
Member
Posts: 51
Joined: Sun Mar 01, 2015 7:58 am

Re: triple fault when use data from bss

Post by TheLittleWho »

XenOS wrote:First of all, your KERNEL_VMA should rather be 0xC0000000, not 0x100000. Further, to have your sections at a different virtual and physical location, you will need to give the physical address using the AT directive. See this linker script for an example:
http://wiki.osdev.org/Higher_Half_x86_B ... #linker.ld
I don't want to do it this way; I want my kernel to be position independent.
TheLittleWho
Member
Posts: 51
Joined: Sun Mar 01, 2015 7:58 am

Re: triple fault when use data from bss

Post by TheLittleWho »

simeonz wrote:
TheLittleWho wrote:So, .bss is mapped at its virtual address. My problem is that when I access a global variable from .bss, it is accessed through its physical address instead of its virtual address...
Apparently, you have to pass the "-pic" or "-pie" option to the final ld. I am confused by this, actually.

The linker can bypass GOT when the output is ET_EXEC. The original relocation, which is R_386_GOT32, is semantically converted to R_386_GOTOFF. Then it is preapplied and elided, which is possible because the relative virtual layout is fixed in ET_EXEC and the offset of any variable from GOT is constant. Additionally, the mov opcode in the relocated instruction has to be converted to a lea opcode at link time. This is where the lea instruction in the assembly comes from. You can check the intermediate object file and see that it is still a mov instruction before the optimization.

The problem in your case is that the optimization is used only for .data, whilst for .bss the GOT entries are computed by the linker in terms of the fixed virtual addresses, thus causing the .bss variables to use GOT, but with fixed entries (and no corresponding R_386_GLOB_DAT relocations as with "-shared"). As a final resort you could have used "-q" to keep the relocations and processed them before switching paging on. Anyway, passing "-pic" to ld suddenly triggers GOT elision for both .data and .bss.

I also couldn't reproduce the relative addressing technique from your code. That is, with pic, normally the get_pc_thunk function is used to retrieve the current eip, because the return address is not a fixed quantity. I tried calling a static function and expected some kind of call graph analysis to show that it is called from one place, but the compiler still uses get_pc_thunk.
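To make the difference between the two access styles concrete, here is a toy model in C (not kernel code). It assumes an image that was linked at one base address but runs at another, and compares an address loaded from an unrelocated GOT slot with one computed relative to the GOT. The struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Toy model of an image: linked to run at link_base, but actually
 * executing at run_base (for example after being mapped elsewhere). */
struct image {
    uint32_t link_base;   /* address the linker assumed */
    uint32_t run_base;    /* address the code really runs at */
    uint32_t got_off;     /* offset of the GOT inside the image */
    uint32_t var_off;     /* offset of the variable inside the image */
};

/* What get_pc_thunk plus the following add yields at run time:
 * the run-time address of the GOT. */
uint32_t got_runtime_addr(const struct image *img)
{
    return img->run_base + img->got_off;
}

/* GOT-indirect access (R_386_GOT32 style): the address is loaded from a
 * GOT slot. The slot was filled at link time with link_base + var_off and
 * nobody has relocated it, so the result ignores run_base. */
uint32_t addr_via_got_slot(const struct image *img)
{
    uint32_t slot_value = img->link_base + img->var_off;
    return slot_value;
}

/* GOT-relative access (R_386_GOTOFF style, the lea form): a link-time
 * constant (variable minus GOT) is added to the run-time GOT address. */
uint32_t addr_via_gotoff(const struct image *img)
{
    uint32_t gotoff = img->var_off - img->got_off;   /* fixed at link time */
    return got_runtime_addr(img) + gotoff;
}
```

With link_base = 0x00100000 and run_base = 0xC0000000, addr_via_got_slot() keeps returning an address in the 0x001xxxxx range while addr_via_gotoff() returns the matching 0xC00xxxxx one. That is the same kind of physical-versus-virtual mismatch described in the quoted question.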
It's working. Thank you very much.
TheLittleWho
Member
Posts: 51
Joined: Sun Mar 01, 2015 7:58 am

Re: triple fault when use data from bss

Post by TheLittleWho »

Unfortunately, I found a new problem. I have this code:

Code:

int x, y;

void print(void *addr)
{
    kprintf("%p\n", addr);
}

void f1()
{
    x = 0;
    print(&x);
}

void f2()
{
    print(&y);
}

void kernel_main()
{
    f1();
    f2();
}
For the y variable it prints the physical address, but for the x variable it prints the virtual address.

Note that paging is enabled, the kernel is mapped at 0xC0000000, and my kernel should be position independent, but in this case something is wrong (the context is the same as before :D).
simeonz
Member
Posts: 360
Joined: Fri Aug 19, 2016 10:28 pm

Re: triple fault when use data from bss

Post by simeonz »

Couldn't reproduce the problem. I confess, I modified the snippet ever so slightly, like this:

Code:

int x, y;

void print(void *addr)
{
    volatile void *out;
    out = addr;
}

void f1()
{
    x = 0;
    print(&x);
}

void f2()
{
    print(&y);
}

void _start()
{
    f1();
    f2();
}
I built it like this ("-Wl,-q" is irrelevant):

Code:

gcc -m32 -std=gnu11 -ffreestanding -O2 -Wall -Wextra -fpic -c test.c -o test.o
gcc -m32 -O2 -nostdlib -z max-page-size=0x1000 test.o -Wl,-pic -Wl,-q -o test
And that was the resulting assembly:

Code:

00000220 <print>:
 220:   f3 c3                   repz ret
 222:   8d b4 26 00 00 00 00    lea    0x0(%esi,%eiz,1),%esi
 229:   8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi

00000230 <f1>:
 230:   53                      push   %ebx
 231:   e8 78 00 00 00          call   2ae <__x86.get_pc_thunk.bx>
 236:   81 c3 ca 1d 00 00       add    $0x1dca,%ebx
 23c:   83 ec 18                sub    $0x18,%esp
 23f:   8d 83 0c 00 00 00       lea    0xc(%ebx),%eax
 245:   c7 00 00 00 00 00       movl   $0x0,(%eax)
 24b:   89 04 24                mov    %eax,(%esp)
 24e:   e8 cd ff ff ff          call   220 <print>
 253:   83 c4 18                add    $0x18,%esp
 256:   5b                      pop    %ebx
 257:   c3                      ret
 258:   90                      nop
 259:   8d b4 26 00 00 00 00    lea    0x0(%esi,%eiz,1),%esi

00000260 <f2>:
 260:   53                      push   %ebx
 261:   e8 48 00 00 00          call   2ae <__x86.get_pc_thunk.bx>
 266:   81 c3 9a 1d 00 00       add    $0x1d9a,%ebx
 26c:   83 ec 18                sub    $0x18,%esp
 26f:   8d 83 10 00 00 00       lea    0x10(%ebx),%eax
 275:   89 04 24                mov    %eax,(%esp)
 278:   e8 a3 ff ff ff          call   220 <print>
 27d:   83 c4 18                add    $0x18,%esp
 280:   5b                      pop    %ebx
 281:   c3                      ret
 282:   8d b4 26 00 00 00 00    lea    0x0(%esi,%eiz,1),%esi
 289:   8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi

00000290 <_start>:
 290:   53                      push   %ebx
 291:   e8 18 00 00 00          call   2ae <__x86.get_pc_thunk.bx>
 296:   81 c3 6a 1d 00 00       add    $0x1d6a,%ebx
 29c:   83 ec 08                sub    $0x8,%esp
 29f:   e8 8c ff ff ff          call   230 <f1>
 2a4:   e8 b7 ff ff ff          call   260 <f2>
 2a9:   83 c4 08                add    $0x8,%esp
 2ac:   5b                      pop    %ebx
 2ad:   c3                      ret

000002ae <__x86.get_pc_thunk.bx>:
 2ae:   8b 1c 24                mov    (%esp),%ebx
 2b1:   c3                      ret
Note the accesses at code addresses 23f and 26f. They use lea. This basically tells you that the access is completely eip-relative and does not use GOT. So, everything is fine here.
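The arithmetic behind those lea accesses can be replayed in a few lines of C. This is only a sketch: pic_base and lea_addr are names invented here, and load_bias is an assumed parameter standing for how far the image was moved from its link-time position.

```c
#include <stdint.h>

/* Run-time value of %ebx after "call __x86.get_pc_thunk.bx; add $disp,%ebx".
 * ret_addr is the link-time address of the instruction after the call,
 * load_bias is how far the image was moved from its link-time position. */
uint32_t pic_base(uint32_t ret_addr, uint32_t disp, uint32_t load_bias)
{
    uint32_t eip = ret_addr + load_bias;  /* what the thunk reads from (%esp) */
    return eip + disp;
}

/* Address produced by "lea off(%ebx),%eax". */
uint32_t lea_addr(uint32_t base, uint32_t off)
{
    return base + off;
}
```

Plugging in the values from the listing, f1 (0x236 + 0x1dca), f2 (0x266 + 0x1d9a) and _start (0x296 + 0x1d6a) all arrive at the same base, 0x2000, so x ends up at 0x200c and y at 0x2010. With a non-zero load_bias both addresses move by exactly that amount, and no table lookup is involved.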

Could you compile the above code with the above gcc commands on your system and check whether the accesses use lea on your end? If they do, then please paste the exact command line from your original build (the one you use to build your version of the code) and the resulting f1 and f2 disassembly (from your version of the code), so we can check the results and options side by side. If the gcc options match but the disassembly differs, then there are differences in gcc that are impacting the result.

Another way to check is to use "objdump -r" (which is possible due to the "-Wl,-q", that you should not otherwise use in your builds). If the result contains R_386_GOT32, then you are toast. If the accesses to x and y use R_386_GOTOFF relocations and there are no R_386_GOT32 relocations, then you should be ok.

My gcc version is 4.8.5, binutils version is 2.23 (.52.0.1). This is on a CentOS vm.
TheLittleWho
Member
Posts: 51
Joined: Sun Mar 01, 2015 7:58 am

Re: triple fault when use data from bss

Post by TheLittleWho »

Here is the disassembly:

Code:

001000d0 <print>:
  1000d0:	53                   	push   %ebx
  1000d1:	e8 e2 48 00 00       	call   1049b8 <linker_textEnd>
  1000d6:	81 c3 a2 61 00 00    	add    $0x61a2,%ebx
  1000dc:	83 ec 10             	sub    $0x10,%esp
  1000df:	8d 83 ca f5 ff ff    	lea    -0xa36(%ebx),%eax
  1000e5:	ff 74 24 18          	pushl  0x18(%esp)
  1000e9:	50                   	push   %eax
  1000ea:	e8 21 44 00 00       	call   104510 <kprintf>
  1000ef:	83 c4 18             	add    $0x18,%esp
  1000f2:	5b                   	pop    %ebx
  1000f3:	c3                   	ret    
  1000f4:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
  1000fa:	8d bf 00 00 00 00    	lea    0x0(%edi),%edi

00100100 <f1>:
  100100:	53                   	push   %ebx
  100101:	e8 b2 48 00 00       	call   1049b8 <linker_textEnd>
  100106:	81 c3 72 61 00 00    	add    $0x6172,%ebx
  10010c:	83 ec 14             	sub    $0x14,%esp
  10010f:	ff b3 e4 ff ff ff    	pushl  -0x1c(%ebx)
  100115:	e8 b6 ff ff ff       	call   1000d0 <print>
  10011a:	83 c4 18             	add    $0x18,%esp
  10011d:	5b                   	pop    %ebx
  10011e:	c3                   	ret    
  10011f:	90                   	nop

00100120 <f2>:
  100120:	53                   	push   %ebx
  100121:	e8 92 48 00 00       	call   1049b8 <linker_textEnd>
  100126:	81 c3 52 61 00 00    	add    $0x6152,%ebx
  10012c:	83 ec 14             	sub    $0x14,%esp
  10012f:	8d 83 08 7b 00 00    	lea    0x7b08(%ebx),%eax
  100135:	c7 00 00 00 00 00    	movl   $0x0,(%eax)
  10013b:	50                   	push   %eax
  10013c:	e8 8f ff ff ff       	call   1000d0 <print>
  100141:	83 c4 18             	add    $0x18,%esp
  100144:	5b                   	pop    %ebx
  100145:	c3                   	ret    
  100146:	8d 76 00             	lea    0x0(%esi),%esi
  100149:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi

00100150 <kernel_main>:
  100150:	53                   	push   %ebx
  100151:	e8 62 48 00 00       	call   1049b8 <linker_textEnd>
  100156:	81 c3 22 61 00 00    	add    $0x6122,%ebx
  10015c:	83 ec 08             	sub    $0x8,%esp
  10015f:	e8 9c ff ff ff       	call   100100 <f1>
  100164:	e8 b7 ff ff ff       	call   100120 <f2>
  100169:	83 c4 08             	add    $0x8,%esp
  10016c:	5b                   	pop    %ebx
  10016d:	c3                   	ret    
  10016e:	66 90                	xchg   %ax,%ax
gcc flags: -std=gnu11 -ffreestanding -O2 -Wall -Wextra -fpic -m32
ld flags: -O2 -nostdlib -z max-page-size=0x1000 -pic

I'm using gcc and ld from here (http://wiki.osdev.org/GCC_Cross-Compiler) : i686-elf-gcc and i686-elf-ld

I can't use -Wl or -m32 for ld...
simeonz
Member
Posts: 360
Joined: Fri Aug 19, 2016 10:28 pm

Re: triple fault when use data from bss

Post by simeonz »

All right. I built gcc and binutils as described in the wiki, just to be sure that we are on the same page, although it is just a more bare bones install (that lacks certain libraries) with particular target architecture enabled.

The really important difference is the versions: I used gcc 7.1.0 and binutils 2.29. Apparently changes were made. The behavior is as it was always advertised to be, but some compiler and linker options have a stricter effect. The "-fpic" and "-pic" options generate PIC code suitable for shared libraries only, whereas "-fpie" and "-pie" are for executables. It didn't use to matter so much on the older versions, but it makes sense, since shared libraries have relocated GOT entries, whilst executables do not, and the compiler has to generate appropriate code.

Anyway. You need to change "-fpic" to "-fpie" and "-pic" to "-pie".

I also wanted to mention that, as an alternative, the Linux kernel decompressor relocates itself by adding the displacement offset to the GOT entries. That is, without full processing of relocation information, this quick and dirty approach is supposed to fix the GOT by just adding a constant to every word there. The proper options above should eliminate the need to do that, as well as GOT usage in general, but if someday you get in a real fix you may try that technique.
TheLittleWho
Member
Posts: 51
Joined: Sun Mar 01, 2015 7:58 am

Re: triple fault when use data from bss

Post by TheLittleWho »

Unfortunately, that is not working... but if I don't use the -O2 flag, it works. I will update gcc and binutils and hope that everything will be OK.
simeonz
Member
Posts: 360
Joined: Fri Aug 19, 2016 10:28 pm

Re: triple fault when use data from bss

Post by simeonz »

Sorry, don't bother. I have to admit that I temporarily disabled the optimizations to avoid inlining and forgot to enable them after the problem got fixed (because it manifested without optimizations as well.) You may have built another gcc for nothing.

Looking at the -O2 output (after using noinline, etc.), I can see that the executable is generated correctly, but expects dynamic relocation processing, which you don't have. In particular, there is one dynamic relocation pointing into GOT. It should be theoretically possible to generate position independent code without fixups, but I don't know how to coerce the compiler.

I will dig some more, but if no one else posts otherwise, we can assume that you are supposed to either fully process the relocations or try the GOT trick that the Linux kernel decompressor uses. In both cases, I can see how this may not be what you originally aimed for with PIC.

Here is what I've got with O2:

Code:

# /usr/local/i686-elf/bin/i686-elf-objdump -R test-i686

test-i686:     file format elf32-i386

DYNAMIC RELOCATION RECORDS
OFFSET   TYPE              VALUE
00001280 R_386_RELATIVE    *ABS*


# /usr/local/i686-elf/bin/i686-elf-objdump -h test-i686

test-i686:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .interp       00000013  000000d4  000000d4  000000d4  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .hash         0000000c  000000e8  000000e8  000000e8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .dynsym       00000010  000000f4  000000f4  000000f4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .dynstr       00000001  00000104  00000104  00000104  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .rel.dyn      00000008  00000108  00000108  00000108  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .text         0000006b  00000110  00000110  00000110  2**4
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  6 .eh_frame     00000084  0000017c  0000017c  0000017c  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
  7 .dynamic      00000080  00001200  00001200  00000200  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  8 .got          00000004  00001280  00001280  00000280  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  9 .got.plt      0000000c  00001284  00001284  00000284  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 10 .data         00000000  00001290  00001290  00000290  2**0
                  CONTENTS, ALLOC, LOAD, DATA
 11 .bss          0000000c  00001290  00001290  00000290  2**2
                  ALLOC
 12 .comment      00000011  00000000  00000000  00000290  2**0
                  CONTENTS, READONLY
P.S.: If you are wondering what the goal of PIC on Linux is, it is to allow relocation fixup entirely outside the code segment. This enables sharing the code pages between multiple processes. However, apparently creating relocation-free code is either not a priority, or is produced with another set of options that I am not aware of.
simeonz
Member
Posts: 360
Joined: Fri Aug 19, 2016 10:28 pm

Re: triple fault when use data from bss

Post by simeonz »

I couldn't find any option for generating position independent code without relocations on i386 (or amd64). There is one for Motorola 68000 (-mpcrel), but none for x86. Generating a position independent executable results in output very similar to a shared library. The ELF type is ET_DYN. So, the OP is stuck with either applying an offset to the GOT entries or processing the relocations. Unless someone has a different idea.
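For the "processing the relocations" route, a minimal sketch in C could look like the following. It assumes the kernel can locate its own .rel.dyn table and that only R_386_RELATIVE entries are expected. The function name and its parameters are hypothetical; the struct simply mirrors the standard Elf32_Rel layout so the snippet does not depend on <elf.h>.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define R_386_RELATIVE 8u                  /* value from the i386 ELF psABI */
#define ELF32_R_TYPE(info) ((info) & 0xffu)

/* Same layout as Elf32_Rel, repeated here so the sketch is self-contained. */
struct elf32_rel {
    uint32_t r_offset;   /* link-time address of the word to patch */
    uint32_t r_info;     /* symbol index and relocation type */
};

/* Apply every R_386_RELATIVE entry in rel[0..count).
 * image:     where the image currently sits in memory
 * link_base: address the image was linked at
 * delta:     (address the code will run at) - link_base
 * Returns 0 on success, -1 if another relocation type is found
 * (entries already patched are not rolled back). */
int apply_relative_relocs(uint8_t *image, uint32_t link_base,
                          const struct elf32_rel *rel, size_t count,
                          uint32_t delta)
{
    for (size_t i = 0; i < count; i++) {
        if (ELF32_R_TYPE(rel[i].r_info) != R_386_RELATIVE)
            return -1;

        uint8_t *where = image + (rel[i].r_offset - link_base);
        uint32_t word;

        memcpy(&word, where, sizeof word);   /* avoids alignment assumptions */
        word += delta;
        memcpy(where, &word, sizeof word);
    }
    return 0;
}
```

In the objdump output above there is a single such entry, at offset 0x1280 inside .got, so the loop would patch exactly one word. In a real kernel this has to run before the first access that goes through the GOT, and the code doing it must not itself depend on an unrelocated GOT slot.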
TheLittleWho
Member
Posts: 51
Joined: Sun Mar 01, 2015 7:58 am

Re: triple fault when use data from bss

Post by TheLittleWho »

Do you have any idea how I can do that?
simeonz
Member
Posts: 360
Joined: Fri Aug 19, 2016 10:28 pm

Re: triple fault when use data from bss

Post by simeonz »

For the easier and dirtier approach, the dynamic relocations are assumed to be all R_386_RELATIVE and pointing into GOT. This condition probably can be maintained, but it would be good if you have something inside your build script to verify it (using grep and such.)

My understanding is that the original GOT entries will contain the value computed with respect to the base address in the linker script (which is 0 by default for PIE, but not in your script IIRC). R_386_RELATIVE means that the displacement of the new base address from the old base address must be added to each one of them (assuming there is an R_386_RELATIVE for each one). The Linux kernel decompressor defines two symbols, _got and _egot, and iterates word-wise between them, adding this offset. A rather trivial affair.
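As a rough illustration of that word-wise loop, here is a sketch in C. It is not the decompressor's own code; the function name is invented, and in a kernel the two pointers would come from linker-script symbols such as the _got and _egot mentioned above.

```c
#include <stdint.h>

/* Add delta to every 32-bit word in [got_start, got_end).
 * In a kernel the bounds would be something like:
 *     extern uint32_t _got[], _egot[];   (symbols from the linker script)
 * and delta the distance between the run-time and link-time base. */
void got_add_offset(uint32_t *got_start, uint32_t *got_end, uint32_t delta)
{
    for (uint32_t *entry = got_start; entry < got_end; entry++)
        *entry += delta;
}
```

This only works under the condition stated earlier: every dynamic relocation is R_386_RELATIVE and points into the GOT. The routine also has to run exactly once, and before anything reads an address through the GOT.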

Here is the actual code.

Edit: I was not very clear - to check if all relocations are R_386_RELATIVE, you can do for example:

Code:

objdump -R kernel | grep " R_" | cut -d' ' -f2 | sort -u
and compare the result in your build script, etc.
TheLittleWho
Member
Posts: 51
Joined: Sun Mar 01, 2015 7:58 am

Re: triple fault when use data from bss

Post by TheLittleWho »

I managed to solve the problem. Thank you very much!
LtG
Member
Posts: 384
Joined: Thu Aug 13, 2015 4:57 pm

Re: triple fault when use data from bss

Post by LtG »

It's usually a good idea to give some description of what the solution was, so the next guy also benefits =)
TheLittleWho
Member
Posts: 51
Joined: Sun Mar 01, 2015 7:58 am

Re: triple fault when use data from bss

Post by TheLittleWho »

To each GOT entry I added an offset of 0xBFF00000, because my kernel is loaded at physical address 0x00100000 but is mapped at virtual address 0xC0000000.
Commit: https://github.com/EnachescuAlin/PhiOS/ ... 4655e93455
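For reference, the offset quoted above is just the difference between the two base addresses. A small sketch in C, with macro and function names chosen here purely for illustration (they are not taken from the linked commit):

```c
#include <stdint.h>

#define KERNEL_PHYS_BASE 0x00100000u   /* load address given in the post */
#define KERNEL_VIRT_BASE 0xC0000000u   /* mapped address given in the post */

/* Offset to add to each link-time (physical) address stored in the GOT. */
#define KERNEL_RELOC_DELTA (KERNEL_VIRT_BASE - KERNEL_PHYS_BASE)

/* Translate a link-time address into the address valid once paging is on. */
uint32_t phys_to_virt(uint32_t phys)
{
    return phys + KERNEL_RELOC_DELTA;
}
```

KERNEL_RELOC_DELTA evaluates to 0xBFF00000, so a GOT entry holding 0x00100000 becomes 0xC0000000 after the fixup.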