binary format vs elf64 format

Posted: Thu Nov 24, 2016 12:16 pm
by citop
I have a problem after changing my kernel from flat binary format to ELF64 format. The kernel itself runs normally, but apps no longer run once the kernel is built as ELF64. To debug the issue, I tried manually writing the app code to the exact address the app is normally loaded at, but calling that address from the kernel simply fails (the machine hangs). If I do the same thing with the binary-format kernel, everything works as expected. I cannot see any subtle differences between the binary and ELF formats in terms of runtime layout. Can anyone point out possible causes for this problem?

Re: binary format vs elf64 format

Posted: Fri Nov 25, 2016 1:08 am
by Boris
Was your kernel built with -mcmodel=kernel?
Check your OS loader.
Check your app loader.
Check that the red zone is disabled! (Especially in libgcc, if you use GCC.)
How do you do system calls? ints?
Try embedding a small app in the kernel (in a special section) and running it.

Re: binary format vs elf64 format

Posted: Fri Nov 25, 2016 3:42 pm
by citop
> Was your kernel built with -mcmodel=kernel?
No.
> Check your OS loader.
> Check your app loader.
The OS loader can load both the ELF kernel and the non-ELF kernel, and both run normally, except that the ELF kernel has a problem running apps.
I have tried bypassing the app loader; please see below.
> How do you do system calls? ints?
Actually, there are no syscalls. Since there is no user space at all and everything runs in kernel mode, a syscall is just implemented with the call instruction.
> Try embedding a small app in the kernel (in a special section) and running it.
Below is the code I tried; it still has the issue.

vv_dest_addr equ 0x0000000000800000
vv_exec:
        mov rsi, vv_machine_code
        mov rdi, vv_dest_addr
        mov rcx, 8
        rep movsq
        call vv_dest_addr ; will hang up here with elf kernel, ok with non-elf version
        mov rsi, vv_success_msg
        call kernel_print_msg
        jmp kernel_command_prompt
vv_success_msg  dw 'success', 13, 0
vv_machine_code dw 0x486f, .... ; compiled code by nasm, source: mov rax 0x12345
        times 64 - ($ - vv_machine_code) db 0

Re: binary format vs elf64 format

Posted: Fri Nov 25, 2016 3:47 pm
by citop
> Check that the red zone is disabled! (Especially in libgcc, if you use GCC.)

I am not using a C compiler; everything is written in assembly, so the red zone should not be an issue, right?

Re: binary format vs elf64 format

Posted: Fri Nov 25, 2016 5:00 pm
by Schol-R-LEA
The 'application' as shown is performing a single operation, and does not return; therefore, you would expect it to hang up, and the question then becomes, why didn't it do so before?

OK, sarcasm aside, I am assuming that you simply elided the remaining code (since this opcode isn't the full 8 quadwords you are copying, and you are simply filling the rest with nulls), but it would help if you could show the entirety of it.

EDIT: The comments below were from before I had fully understood the given code, and were based on the assumption that you were loading an actual application file. I have retained them for the sake of clarity for anyone who had seen the original post before I noticed my error, but they aren't particularly relevant to the actual problem.

While you are moving the application code to the location which you are jumping to, I don't see anywhere where the code is loaded into memory from a file or I/O stream. How is it being loaded into memory in the first place?

I am assuming you mean that the kernel is in an ELF64 format file, rather than the app in question. What format is the app file itself in? I am assuming raw binary, but it would help to know just what the loader is operating on.

If you could post the code for the loader - or better still, pass us a link to your offsite repo, if you have one (and we strongly recommend that you do) - it might give us a better idea of just what it is doing.

Also, how are you exporting the labels for the kernel services to the application? Unless you are linking the 'application' into the kernel itself, you would need some means of passing the addresses of the kernel routines to it (which is one of the reasons why system calls and interrupts are usually used even in kernel mode and real mode OSes - the applications don't need to know where the OS services are located in memory, they only need the service handles or interrupt vectors for them).

As an aside, could you please let us know why you are choosing to run applications in kernel mode, and specifically, whether this is only as a temporary stage during development or a part of your final design. I don't know if it has any bearing, but it may.

Re: binary format vs elf64 format

Posted: Fri Nov 25, 2016 8:22 pm
by citop
The app is simple enough; it contains only two instructions:

Code: Select all

[BITS 64]
[ORG 0x0000000000800000]

        mov rax, 0x12345678
        ret      ; exit and return to os
And this is the part of the loader that does the ELF64 loading:

Code: Select all

        call os_print_msg

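; elf64 loader
        cmp dword [KERNEL_PAYLOAD], 0x464c457f     ; elf magic number
        jne raw_binary_loader
        cmp byte [KERNEL_PAYLOAD+4], 2             ; EI_CLASS: 2 = 64-bit
        jne raw_binary_loader                      ; only support elf64

        mov rdx, [KERNEL_PAYLOAD+0x20]             ; e_phoff: file offset of program header table
        mov rax, KERNEL_PAYLOAD
        add rax, rdx                               ; rax = address of first program header
        xor rdx, rdx
        mov dx,  [KERNEL_PAYLOAD+0x36]             ; e_phentsize: size of one program header
        xor rcx, rcx
        mov cx,  [KERNEL_PAYLOAD+0x38]             ; e_phnum: number of program headers
elf_loop:
        cmp dword [rax], 1                         ; p_type == PT_LOAD?
        jne elf_continue
        mov rsi, [rax+8]                           ; p_offset: segment offset within the file
        add rsi, KERNEL_PAYLOAD
        mov rdi, [rax+0x10]                        ; p_vaddr: destination address
        mov r8, rcx                                ; save loop counter
        mov rcx, [rax+0x20]                        ; p_filesz: bytes to copy
        add rcx, 7
        shr rcx, 3                                 ; round up to whole quadwords
        rep movsq
        mov rcx, r8                                ; restore loop counter
elf_continue:
        add rax, rdx                               ; advance to next program header
        loop elf_loop
        mov rbx, [KERNEL_PAYLOAD+0x18]             ; e_entry: entry point address
        jmp elf_start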

raw_binary_loader:
        ; raw binary load continues here
        ...

Re: binary format vs elf64 format

Posted: Fri Nov 25, 2016 8:30 pm
by citop
Schol-R-LEA wrote: As an aside, could you please let us know why you are choosing to run applications in kernel mode, and specifically, whether this is only as a temporary stage during development or a part of your final design.
The reason is simplicity. It is a temporary stage during development.

Re: binary format vs elf64 format

Posted: Fri Nov 25, 2016 8:39 pm
by citop
And below is the kernel code that tests running the app:

Code: Select all

vv_dest_addr equ 0x0000000000800000
vv_exec:
        mov rsi, vv_machine_code
        mov rdi, vv_dest_addr
        mov rcx, 8
        rep movsq
        call vv_dest_addr ; will hang up here with elf64 kernel, ok with non-elf version
        mov rsi, vv_success_msg
        call kernel_print_msg
        jmp kernel_command_prompt
vv_success_msg  dw 'success', 13, 0
vv_machine_code dw 0x78b8,0x3456,0xc312
        times 64 - ($ - vv_machine_code) db 0

Re: binary format vs elf64 format

Posted: Fri Nov 25, 2016 10:48 pm
by citop
I just added some debug output: it dumps the registers before the call to the app code and prints a success/error message on return from the app.

Re: binary format vs elf64 format

Posted: Sat Nov 26, 2016 6:14 am
by Schol-R-LEA
I'm not sure if this would help or not, but you might find it useful to declare a STRUC to hold the offsets for the ELF header and similar structures. I don't think it is related to the current problem, but it may help you avoid other problems in the future.

Code: Select all

;;  NASM structure definition for ELF file header

STRUC Elf64_Header
        .magic:                 resd 1
        .bits:                  resb 1
        .direction:             resb 1
        .version:               resb 1
        .os_abi:                resb 1
        .reserved:              resb 8
        .type:                  resw 1
        .arch:                  resw 1
        .ver_ext:               resd 1
        .text_entry:            resq 1
        .text_header:           resq 1
        .section_header:        resq 1
        .flags:                 resd 1
        .header_size:           resw 1
        .text_entry_size:       resw 1
        .text_entry_count:      resw 1
        .section_entry_size:    resw 1
        .section_entry_count:   resw 1
        .section_name_index:    resw 1
ENDSTRUC

Code: Select all

;;  NASM structure definition for ELF program section header
STRUC ELF64_Text_Header
        .segment_type:          resd 1
        .flags:                 resd 1
        .text_file_offset:      resq 1
        .text_mem_offset:       resq 1
        .reserved:              resq 1
        .text_seg_file_size:    resq 1
        .text_seg_mem_size:     resq 1
        .text_align:            resq 1
ENDSTRUC
While a NASM structure is just a way of declaring a series of offsets - it doesn't support the dot-notation which MASM and TASM use, so you still need to explicitly add the offsets to the structure base - it should at least make the purpose of the offsets clearer:

Code: Select all

; elf64 loader
        cmp dword [KERNEL_PAYLOAD+Elf64_Header.magic], 0x464c457f     ; elf magic number
        jne raw_binary_loader
        cmp byte [KERNEL_PAYLOAD+Elf64_Header.bits], 2
        jne raw_binary_loader                      ; only support elf64

        ; set the following:
        ; rax == program header address
        ; dx == program entry size
        ; cx == # of program sections
        mov rdx, [KERNEL_PAYLOAD+Elf64_Header.text_header]
        mov rax, KERNEL_PAYLOAD
        add rax, rdx
        xor rdx, rdx
        mov dx,  [KERNEL_PAYLOAD+Elf64_Header.text_entry_size]
        xor rcx, rcx
        mov cx,  [KERNEL_PAYLOAD+Elf64_Header.text_entry_count]
elf_loop:
        cmp dword [rax+ELF64_Text_Header.segment_type], 1
        jne elf_continue
        mov rsi, [rax+ELF64_Text_Header.text_file_offset]
        add rsi, KERNEL_PAYLOAD
        mov rdi, [rax+ELF64_Text_Header.text_mem_offset]
        mov r8, rcx
        mov rcx, [rax+ELF64_Text_Header.text_seg_file_size]
        add rcx, 7
        shr rcx, 3
        rep movsq
        mov rcx, r8
elf_continue:
        add rax, rdx
        loop elf_loop
        mov rbx, [KERNEL_PAYLOAD+Elf64_Header.text_entry]
        jmp elf_start

Re: binary format vs elf64 format

Posted: Sat Nov 26, 2016 7:28 am
by MichaelFarthing
Probably not the problem, but shouldn't you do a cld before the movs?

Re: binary format vs elf64 format

Posted: Sat Nov 26, 2016 8:10 am
by Schol-R-LEA
MichaelFarthing wrote:Probably not the problem, but shouldn't you do a cld before the movs?
It is advisable, yeah; assuming that the direction flag is clear or set is generally a poor practice, even if you have no reason to think it will be set/clear. However, since we only have a small part of the code on hand, it may be that the OP did that before this code started. Good advice anyway, though.
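
As a minimal sketch of that advice (an illustration only, not the OP's actual source), here is the copy from the earlier vv_exec post with an explicit cld in front of the string move. The copy_app label is hypothetical, and the six machine-code bytes are simply the byte-wise form of the words posted earlier; the sketch assumes 0x800000 is mapped and writable, as in the OP's setup.

```nasm
[BITS 64]

vv_dest_addr equ 0x0000000000800000     ; load address used in the thread

copy_app:                               ; hypothetical label, for illustration
        cld                             ; DF = 0, so rsi/rdi advance after each quadword
        mov rsi, vv_machine_code        ; source: code embedded in the kernel image
        mov rdi, vv_dest_addr           ; destination: where the app expects to run
        mov rcx, 8                      ; 8 quadwords = 64 bytes
        rep movsq
        ret

vv_machine_code:
        db 0xb8, 0x78, 0x56, 0x34, 0x12 ; mov eax, 0x12345678
        db 0xc3                         ; ret
        times 64 - ($ - vv_machine_code) db 0   ; pad the block to 64 bytes
```

If the direction flag happened to be set on entry, rep movsq would step rsi and rdi downward from their start addresses, and the destination would not end up holding the intended code.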

Re: binary format vs elf64 format

Posted: Sat Nov 26, 2016 8:35 am
by Schol-R-LEA
BTW, much of this probably isn't necessary; once you have the kernel file in memory, you should be able to execute from where it is loaded once you have located the first .text segment (using the file header and the first .text segment header) and its corresponding .data and .rodata segment. All you would need to do is set the pages holding the executable image to r/o, executable, and you should be able to run from there, assuming that there is only a single executable segment and no relocation.

Even if the first of those two assumptions doesn't hold, it should be possible to arrange for the first segment to contain the code for the GDT, IDT, and page tables (though I am going to guess that the boot loader already took care of those), then load the remaining segment headers and their corresponding segments. Relocation is another story, but since you haven't mentioned anything about it I am going to assume you aren't using any for this (it would be very unusual if you did use any at this point anyway).

For loading actual applications, if you have paging set up, you only need load the file header, use that to locate the program headers and load them into memory, then for each executable program segment use the values in each program header's .text_file_offset and your file system (if you have one in place yet) to find the sector of the memory, then mark that as the paging location for a r/o, executable page's page file. However, that will require quite a bit to be present ahead of time, so you will probably want to do something simpler at first.

Re: binary format vs elf64 format

Posted: Sat Nov 26, 2016 9:49 am
by citop
nice advice, nice code, thank you!

At present I am focusing on the kernel size. There are obviously bugs in the kernel: it is not able to call/jmp outside of the kernel's address segment, which is a big obstacle to moving forward.

Re: binary format vs elf64 format

Posted: Sat Nov 26, 2016 10:03 am
by iansjack
Single-step the code under a debugger to see what is going wrong.