How to jump to an ELF64 format executable compiled with GCC

kdx7214
Member
Member
Posts: 25
Joined: Tue Jun 07, 2011 5:34 pm

How to jump to an ELF64 format executable compiled with GCC

Post by kdx7214 »

So I now have some code booting with grub2. It's a 32-bit assembly stub and some 64-bit C code. I've got the assembly working fine (no IDT yet, but one thing at a time). I can write to video memory just fine and all seems well. I had planned on filling the IDT and setting up interrupts from C but then ran into an error message when attempting to JMP to the entry point of the C code (a function called kmain).

I'm running bochs for CD emulation and get the following in the boot.log file:
00068440871e[CPU0 ] read_virtual_dword_64(): canonical failure
00068440871e[CPU0 ] interrupt(long mode): gate descriptor is not valid sys seg
00068440871e[CPU0 ] interrupt(long mode): gate descriptor is not valid sys seg
00068440871i[CPU0 ] CPU is in long mode (active)

I've read up on the canonical address "feature" that AMD put in, but I still don't understand why this happens when I jump to the C code's entry point.

I've double checked the gcc options specified and think I have them correct (including no red zone and so forth). Any advice?

Thanks!
Mike

Re: How to jump to an ELF64 format executable compiled with

Post by kdx7214 »

I've narrowed it down: the problem has something to do with stack access, but I'm still digging.
immibis
Posts: 19
Joined: Fri Dec 18, 2009 12:38 am

Re: How to jump to an ELF64 format executable compiled with

Post by immibis »

Post your code for setting up the stack.

Re: How to jump to an ELF64 format executable compiled with

Post by kdx7214 »

The code that I'm using to set up the stack follows. I'm still digging and am beginning to wonder if it's something in the ABI that I'm not doing - although digging through that document is sort of like going to the dentist, at least for me.

Code: Select all

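        mov     ax, 0x10                ; selector 0x10 = GDT entry 2
        mov     ds, ax                  ; reload the data segment registers
        mov     es, ax
        mov     fs, ax
        mov     gs, ax
        mov     rsp, _sys_stack         ; initial stack pointer

        mov     edi, 0xb8000            ; VGA text-mode buffer
        mov     rax, 0x0341034203430344 ; four char/attribute pairs: "DCBA", attribute 0x03
        mov     [edi], rax              ; write them to the first four screen cells

        xor     rax, rax

        call    kmain
        cli                             ; only reached if kmain returns
        hlt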

Re: How to jump to an ELF64 format executable compiled with

Post by immibis »

That looks fine, assuming kmain takes no arguments (I don't know much about the x64 calling convention, though).

Does _sys_stack point to the end of the stack?

Re: How to jump to an ELF64 format executable compiled with

Post by kdx7214 »

Yes, _sys_stack points to the end of a 16 KiB region reserved in the .bss section of the ELF file. I have been able to confirm that the stack works in assembly code, but as soon as it calls the C code something is getting lost.
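
As a rough sketch of what "points to the end" means here: the stack grows downward, so rsp should start one past the highest byte of the reserved region. The numbers are taken from the map file and debugger trace posted later in this thread (.bss for start.o at 0x102000 with size 0x4000, and "mov rsp, 0x106000"); the macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, copied from the map file posted in this thread:
 * start.o's .bss starts at 0x102000 and is 0x4000 bytes (16 KiB). */
#define STACK_BASE 0x102000ULL
#define STACK_SIZE 0x4000ULL

/* The stack grows downward, so the initial rsp is one past the
 * highest byte of the reserved region. */
uint64_t stack_top(uint64_t base, uint64_t size)
{
    return base + size;
}

/* The System V AMD64 ABI expects rsp to be 16-byte aligned
 * immediately before a call instruction. */
int is_16_byte_aligned(uint64_t addr)
{
    return (addr & 0xFULL) == 0;
}

/* Hypothetical self-check using the numbers above. */
void check_stack_setup(void)
{
    uint64_t top = stack_top(STACK_BASE, STACK_SIZE);
    assert(top == 0x106000ULL);      /* matches "mov rsp, 0x106000" in the trace */
    assert(is_16_byte_aligned(top));
}
```

By this check the stack pointer itself looks sane, which fits the observation that calls between assembly routines work.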

And you are correct in your assumption that kmain takes no parameters. I wanted to keep things simple to start off.

I've been able to use the bochs debugger to trace into the call, at which point the gcc generated code gets a bit strange to look at. I will try to get a snapshot of some of that and post that below as well. I've not yet tried calling an asm routine from 64-bit but will try that next and post results. Right now I suspect the problem has to do with the 64-bit ABI stuff.

Thanks!
Mike

Edit: Just confirmed that a subroutine call in asm (using nasm) in long mode works just fine so stack setup should be okay. More and more I'm thinking it's something in the ABI. Now to try to learn that stupid gas syntax to see if I can make sense of the ABI.
xenos
Member
Member
Posts: 1121
Joined: Thu Aug 11, 2005 11:00 pm
Libera.chat IRC: xenos1984
Location: Tartu, Estonia

Re: How to jump to an ELF64 format executable compiled with

Post by xenos »

Perhaps you could try using objdump to find the addresses of _sys_stack and kmain. The error message "read_virtual_dword_64(): canonical failure" indicates that you are trying to access a memory location that is not in "canonical address form". Virtual addresses on x86_64 are only 48 bits wide (with the usual 4-level paging) - the uppermost 16 bits of a 64-bit address must all be equal to bit 47 ("sign extension"). That means that addresses such as 0x0000000010000000 or 0xffff800000000000 are perfectly fine, but something like 0x1234000000000000 raises an exception when you try to use it as a virtual memory address.
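
To make the rule concrete, here is a minimal sketch of a canonical-address check in C, assuming 48-bit virtual addresses. The function names are made up for illustration; the test values are the three addresses mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if addr is canonical for 48-bit virtual addressing,
 * i.e. bits 63..48 are all copies of bit 47. */
int is_canonical_48(uint64_t addr)
{
    uint64_t upper = addr >> 47;    /* bits 63..47, 17 bits in total */
    return upper == 0 || upper == 0x1FFFFULL;
}

/* Hypothetical self-check with the addresses from this post. */
void check_canonical_examples(void)
{
    assert(is_canonical_48(0x0000000010000000ULL));
    assert(is_canonical_48(0xffff800000000000ULL));
    assert(!is_canonical_48(0x1234000000000000ULL));
}
```

Running suspicious values from the debugger (register contents used as pointers, for example) through a check like this can help spot which access triggers the fault.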

The ABI is quite well documented here:
http://www.x86-64.org/documentation.html
The most important difference from x86 is that function parameters are usually passed in registers and not on the stack (unless you want to pass large structs). Further, you should have a look at the ABI docs to see what the "red zone" is - I personally prefer to disable it with gcc's -mno-red-zone switch.

Just one more thing: At which address do you link your kernel and which memory model (-mcmodel=...) do you use?
Programmers' Hardware Database // GitHub user: xenos1984; OS project: NOS

Re: How to jump to an ELF64 format executable compiled with

Post by kdx7214 »

I link the kernel at 0x100000 (the 1 MiB mark), with the 32-bit assembly stub placed first courtesy of the ld script. I'm using -mcmodel=large for the moment, as nothing else seemed any better or worse for what I am doing. I've read through the ABI but, to be honest, I don't know AT&T (gas) syntax, so it takes me quite a while to look up what some of the demo code does.

I did use objdump as well as the kernel map file produced by ld to get the addresses, and it appears everything is in the 0x100000 - 0x104000 range that I would expect. I can't find any reference to anything inside the non-canonical range anyway. That's part of why I'm confused. I did NOT reload SS, by the way, which could be part of the problem.

Here's a screen dump of what the debugger shows when I step through the call kmain line:

Code: Select all

Next at t=71382459
(0) [0x0000000000100196] 0008:0000000000100196 (unk. ctxt): mov es, ax                ; 8ec0
<bochs:5> s
Next at t=71382460
(0) [0x0000000000100198] 0008:0000000000100198 (unk. ctxt): mov fs, ax                ; 8ee0
<bochs:6> s
Next at t=71382461
(0) [0x000000000010019a] 0008:000000000010019a (unk. ctxt): mov gs, ax                ; 8ee8
<bochs:7> s
Next at t=71382462
(0) [0x000000000010019c] 0008:000000000010019c (unk. ctxt): mov rsp, 0x0000000000106000 ; 48bc0060100000000000
<bochs:8> s
Next at t=71382463
(0) [0x00000000001001a6] 0008:00000000001001a6 (unk. ctxt): mov edi, 0x000b8000       ; bf00800b00
<bochs:9> s
Next at t=71382464
(0) [0x00000000001001ab] 0008:00000000001001ab (unk. ctxt): mov rax, 0x0341034203430344 ; 48b84403430342034103
<bochs:10> s
Next at t=71382465
(0) [0x00000000001001b5] 0008:00000000001001b5 (unk. ctxt): mov qword ptr ds:[edi], rax ; 67488907
<bochs:11> s
Next at t=71382466
(0) [0x00000000001001b9] 0008:00000000001001b9 (unk. ctxt): xor rax, rax              ; 4831c0
<bochs:12> s
Next at t=71382467
(0) [0x00000000001001bc] 0008:00000000001001bc (unk. ctxt): call .+111 (0x0000000000100230) ; e86f000000
<bochs:13> s
Next at t=71382468
(0) [0x0000000000100230] 0008:0000000000100230 (unk. ctxt): clc                       ; f8
<bochs:14> s
Next at t=71382469
(0) [0x0000000000100231] 0008:0000000000100231 (unk. ctxt): add eax, dword ptr ds:[rax] ; 0300
<bochs:15> s
Next at t=71382470
(0) [0x0000000000100233] 0008:0000000000100233 (unk. ctxt): add byte ptr ds:[rax], al ; 0000
<bochs:16> s
Next at t=71382471
(0) [0x0000000000100235] 0008:0000000000100235 (unk. ctxt): add byte ptr ds:[rax], al ; 0000
<bochs:17> s
Next at t=71382472
(0) [0x0000000000100237] 0008:0000000000100237 (unk. ctxt): add byte ptr ds:[rcx], al ; 0001
<bochs:18> s
Next at t=71382473
(0) [0x0000000000100239] 0008:0000000000100239 (unk. ctxt): add byte ptr ds:[rax], al ; 0000
<bochs:19> s
Next at t=71382474
(0) [0x000000000010023b] 0008:000000000010023b (unk. ctxt): add byte ptr ds:[rcx], cl ; 0009
<bochs:20> s
Next at t=71382475
(0) [0x000000000010023d] 0008:000000000010023d (unk. ctxt): add byte ptr ds:[rax], al ; 0000
<bochs:21> s
Next at t=71382476
(0) [0x000000000010023f] 0008:000000000010023f (unk. ctxt): add byte ptr ds:[rax], al ; 0000
<bochs:22> s
Next at t=71382477
(0) [0x0000000000100241] 0008:0000000000100241 (unk. ctxt): fdivr st(7), st(0)        ; dcf7
<bochs:23> s
Next at t=71382478
(0) [0x0000000000100243] 0008:0000000000100243 (unk. ctxt): add edi, dword ptr ds:[rdx+33570419] ; 03ba733e0002
<bochs:24> s
(0).[71382478] [0x0000000000100243] 0008:0000000000100243 (unk. ctxt): add edi, dword ptr ds:[rdx+33570419] ; 03ba733e0002
Next at t=71382479
(0) [0x00000000fffffff0] f000:fff0 (unk. ctxt): jmp far f000:e05b         ; ea5be000f0
<bochs:25>

As you can see, the trace starts where I'm just setting the segment registers to the appropriate selector. The "call .+111" is the call to kmain. I have NO idea where the instructions that cause the fault are coming from. My kmain is quite literally:

Code: Select all

void kmain(void) {

}

So there should only be the standard function prologue and epilogue.

Thanks!
Mike
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: How to jump to an ELF64 format executable compiled with

Post by Combuster »

Your kmain is not where the call thinks it is. By the looks of it, you are executing something in the data section, with all the consequences thereof. The first instruction after the call should be either a ret or a push rbp, depending on optimisations.

What are the exact command line arguments you are using? What is your linker script? What does objdump say about section and symbol locations?
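
The "first instruction" check can be sketched in C as a crude heuristic. The function name is made up for illustration, and it only knows the two single-byte opcodes mentioned above (0x55 for push rbp, 0xC3 for ret); a compiler may legitimately start a function with something else, so treat a mismatch as a hint and not as proof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPCODE_PUSH_RBP 0x55    /* push rbp */
#define OPCODE_RET      0xC3    /* near ret */

/* Heuristic: does the memory at a call target start with one of the
 * two opcodes expected at the entry of a trivial function? */
int looks_like_entry(const uint8_t *code, size_t len)
{
    if (code == NULL || len == 0)
        return 0;
    return code[0] == OPCODE_PUSH_RBP || code[0] == OPCODE_RET;
}

/* Hypothetical self-check: the bytes Bochs showed at 0x100230 in the
 * trace above (f8 03 00 ...) do not pass. */
void check_trace_bytes(void)
{
    const uint8_t seen[] = { 0xf8, 0x03, 0x00, 0x00, 0x00 };
    assert(!looks_like_entry(seen, sizeof seen));
}
```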
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]

Re: How to jump to an ELF64 format executable compiled with

Post by kdx7214 »

Okay, for completeness I'm posting the entire build script (short) and linker file that I'm using.

The build shell script:

Code: Select all

[ -f kernel.elf ] && rm kernel.elf
[ -f start.o ]    && rm start.o
[ -f kmain.o ]    && rm kmain.o

gcc -m64 -ffreestanding -nostdlib -mcmodel=large -mno-red-zone -mno-mmx -mno-sse -mno-sse2 -mno-sse3 -mno-3dnow -c -o kmain.o kmain.c

nasm -f elf64 -o start.o start.asm

ld -nostdlib -nodefaultlibs -z max-page-size=0x1000 -T kernel.ld --cref -Map kernel.map -o kernel.elf start.o kmain.o

[ -f kernel.elf ] && cp kernel.elf ~/t/iso/boot

The linker script:

Code: Select all

OUTPUT_FORMAT(elf64-x86-64)
ENTRY(start)
KERNEL_VMA = 0x00100000 ;
SECTIONS
{
    . = KERNEL_VMA;

    .mboot : ALIGN(8)
    {
       LONG(0xE85250D6) ;               /* Magic number for grub2 */
       LONG(0x00000000) ;               /* i386 architecture */
       LONG(0x00000018) ;               /* Header length = 24 bytes */
       LONG(0x17ADAF12) ;               /* checksum */
       LONG(0x00000000) ;               /* tag_end */
       LONG(0x00000008) ;               /* size of this tag */
    }

    .bootstrap : ALIGN(8)
    {
        start.o (.text)
    }

    .text : AT(ADDR(.text) - KERNEL_VMA)
    {
        _code = .;
        *(EXCLUDE_FILE(*start.o) .text)
        *(.rodata*)
        . = ALIGN(4096);
    }

   .data : AT(ADDR(.data) - KERNEL_VMA)
   {
        _data = .;
        *(.data)
        . = ALIGN(4096);
   }

   .ehframe : AT(ADDR(.ehframe) - KERNEL_VMA)
   {
       _ehframe = .;
       *(.ehframe)
        . = ALIGN(4096);
   }

   .bss : AT(ADDR(.bss) - KERNEL_VMA)
   {
       _bss = .;
       *(.bss)
        *(COMMON)
       . = ALIGN(4096);
   }

   _end = .;

   /DISCARD/ :
   {
        *(.comment)
   }
}

According to the kernel.map file that is generated (shown below), kmain is at the expected location:

Code: Select all

Allocating common symbols
Common symbol       size              file

grub2_meminfo       0x8               kmain.o

Discarded input sections

 .comment       0x0000000000000000       0x2c kmain.o

Memory Configuration

Name             Origin             Length             Attributes
*default*        0x0000000000000000 0xffffffffffffffff

Linker script and memory map

                0x0000000000100000                KERNEL_VMA = 0x100000
                0x0000000000100000                . = KERNEL_VMA

.mboot          0x0000000000100000       0x18
                0x0000000000100000        0x4 LONG 0xe85250d6
                0x0000000000100004        0x4 LONG 0x0
                0x0000000000100008        0x4 LONG 0x18
                0x000000000010000c        0x4 LONG 0x17adaf12
                0x0000000000100010        0x4 LONG 0x0
                0x0000000000100014        0x4 LONG 0x8

.bootstrap      0x0000000000100020      0x20e
 start.o(.text)
 .text          0x0000000000100020      0x20e start.o
                0x0000000000100020                start
                0x0000000000100190                cont

.text           0x0000000000100230      0xdd0 load address 0x0000000000000230
                0x0000000000100230                _code = .
 *(EXCLUDE_FILE(*start.o) .text)
 .text          0x0000000000100230        0x6 kmain.o
                0x0000000000100230                kmain
 *(.rodata*)
                0x0000000000101000                . = ALIGN (0x1000)
 *fill*         0x0000000000100236      0xdca 00

.eh_frame       0x0000000000101000       0x38 load address 0x0000000000001000
 .eh_frame      0x0000000000101000       0x38 kmain.o

.iplt           0x0000000000101038        0x0 load address 0x0000000000001038
 .iplt          0x0000000000000000        0x0 start.o

.rela.dyn       0x0000000000101038        0x0 load address 0x0000000000001038
 .rela.iplt     0x0000000000000000        0x0 start.o
 .rela.text     0x0000000000000000        0x0 start.o

.data           0x0000000000101038      0xfc8 load address 0x0000000000001038
                0x0000000000101038                _data = .
 *(.data)
 .data          0x0000000000101038        0x0 kmain.o
                0x0000000000102000                . = ALIGN (0x1000)
 *fill*         0x0000000000101038      0xfc8 00

.igot.plt       0x0000000000102000        0x0 load address 0x0000000000002000
 .igot.plt      0x0000000000000000        0x0 start.o
.ehframe        0x0000000000102000        0x0 load address 0x0000000000002000
                0x0000000000102000                _ehframe = .
 *(.ehframe)
                0x0000000000102000                . = ALIGN (0x1000)

.bss            0x0000000000102000     0x5000 load address 0x0000000000002000
                0x0000000000102000                _bss = .
 *(.bss)
 .bss           0x0000000000102000     0x4000 start.o
 .bss           0x0000000000106000        0x0 kmain.o
 *(COMMON)
 COMMON         0x0000000000106000        0x8 kmain.o
                0x0000000000106000                grub2_meminfo
                0x0000000000107000                . = ALIGN (0x1000)
 *fill*         0x0000000000106008      0xff8 00
                0x0000000000107000                _end = .

/DISCARD/
 *(.comment)
LOAD start.o
LOAD kmain.o
OUTPUT(kernel.elf elf64-x86-64)

.note.GNU-stack
                0x0000000000000000        0x0
 .note.GNU-stack
                0x0000000000000000        0x0 kmain.o

Cross Reference Table

Symbol                                            File
cont                                              start.o
grub2_meminfo                                     kmain.o
                                                  start.o
kmain                                             kmain.o
                                                  start.o
start                                             start.o

Re: How to jump to an ELF64 format executable compiled with

Post by xenos »

The linker script and the map file look fine to me... Could you try the following command?

Code: Select all

objdump -x -t -s -d kernel.elf

This should give you a lot of output (so you had better redirect it to a file or pipe it into "less"), including a disassembly of your code. Then you can check whether the code at address 0x100230 (where kmain should be) really contains what it should. The debugger output suggests that this address (and the following ones) contains just garbage - so either kernel.elf is somehow screwed up, or it is not loaded properly, or its contents get overwritten by whatever happens before calling kmain.

Re: How to jump to an ELF64 format executable compiled with

Post by kdx7214 »

Okay, here's the output of the objdump:

Code: Select all

0000000000100190 <cont>:
  100190:       66 b8 10 00             mov    $0x10,%ax
  100194:       8e d8                   mov    %eax,%ds
  100196:       8e c0                   mov    %eax,%es
  100198:       8e e0                   mov    %eax,%fs
  10019a:       8e e8                   mov    %eax,%gs
  10019c:       48 bc 00 60 10 00 00    movabs $0x106000,%rsp
  1001a3:       00 00 00
  1001a6:       bf 00 80 0b 00          mov    $0xb8000,%edi
  1001ab:       48 b8 44 03 43 03 42    movabs $0x341034203430344,%rax
  1001b2:       03 41 03
  1001b5:       67 48 89 07             mov    %rax,(%edi)
  1001b9:       48 31 c0                xor    %rax,%rax
  1001bc:       e8 6f 00 00 00          callq  100230 <kmain>
  1001c1:       fa                      cli
  1001c2:       f4                      hlt

Disassembly of section .text:

0000000000100230 <kmain>:
  100230:       55                      push   %rbp
  100231:       48 89 e5                mov    %rsp,%rbp
  100234:       c9                      leaveq
  100235:       c3                      retq
        ...

As far as I can tell, everything is in the right spot: kmain is at 0x100230 and the disassembly looks fine.

Thanks!
Mike


EDIT: Okay, I just did one last test to make sure I'm not going crazy. I did a "mov rax, kmain" followed by a hlt. In the debugger I inspected the value of the rax register at the halt and noticed that it's off by 8 bytes. Instead of the 0x100230 that I expected, it's getting 0x100238, but all of the output shows that it should be 0x100230. I'm confused :)
gerryg400
Member
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: How to jump to an ELF64 format executable compiled with

Post by gerryg400 »

Code: Select all

EDIT: Okay, just did one last test to make sure I'm not going crazy. I did a "mov rax, kmain" followed by a hlt. In the debugger I inspected the value of the rax register at the halt and noticed that it's off by 8-bytes. Instead of the 0x100230 that I expected it's getting 0x100238 but all of the output shows that it should be 0x100230. I'm confused :)

Maybe inserting the debug code moved kmain by 8 bytes.
If a trainstation is where trains stop, what is a workstation ?

Re: How to jump to an ELF64 format executable compiled with

Post by kdx7214 »

Yep, that's what did it. I updated my build script to remove the old one first, and it's now back to doing the same thing as before.

Edit: I just found something interesting using magic breakpoints. I set a breakpoint at the beginning of the 32-bit bootstrap routine and noticed that the address where kmain should reside contains the same garbage data that is there after the switch to long mode. This seems to indicate that the code is either not being loaded at all or not being loaded at the expected address. Alas, I need someone with far more experience in linker scripts to check my script above to see if it's okay.

Thanks!
Mike

Re: How to jump to an ELF64 format executable compiled with

Post by kdx7214 »

Solved!

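The problem was the linker script. It turns out the " - KERNEL_VMA" in the AT() clauses was relocating the load address of each section the way it would for a higher-half kernel, which I'm not doing. As a result, any code not found in start.asm was being loaded 0x100000 below the address it was linked to run at (the map file shows .text with a load address of 0x230), so what ran at kmain's address was garbage, and that garbage eventually touched a non-canonical address.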
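
For anyone hitting the same thing, the arithmetic behind those AT() clauses can be sketched in a few lines of C. This is only an illustration with made-up function names; the addresses are the ones from the map file earlier in the thread, and it assumes the boot loader places each section at its load address.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_VMA 0x100000ULL    /* from the linker script above */

/* What AT(ADDR(section) - KERNEL_VMA) evaluates to for a section
 * linked at the given virtual address. */
uint64_t load_address_with_at(uint64_t vma)
{
    return vma - KERNEL_VMA;
}

/* A section only runs correctly without paging tricks if it is
 * loaded at the same address it was linked for. */
int loaded_where_linked(uint64_t vma, uint64_t lma)
{
    return vma == lma;
}

/* Hypothetical self-check against the posted map file. */
void check_map_file_values(void)
{
    /* .text: linked at 0x100230, map file says load address 0x230 */
    assert(load_address_with_at(0x100230ULL) == 0x230ULL);
    assert(!loaded_where_linked(0x100230ULL, load_address_with_at(0x100230ULL)));
}
```

For a kernel that is meant to run at the address it is loaded at, the load address and the link address need to agree, which is what the check above expresses.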

Thanks for all the help folks!
Mike