Linkers and the 'Load Memory Address' (Solved)

Posted: Thu Apr 16, 2015 4:46 pm
by blacky
Hello everyone,

The Higher Half Tutorial relocates the actual kernel code to the virtual address 0xC0000000. I would like to achieve the same, but without having to repeatedly subtract the virtual base address in the bootstrap code. If I understood it correctly, unless explicitly told otherwise, the linker assumes that the load address of each section is the same as its virtual memory address. To tell the linker that the virtual address differs from the load address, one can use the optional output section attribute AT(lma) (note: lma stands for load memory address).

I have tried to do just that in my linker script. Although the linker correctly places the section .kern at the virtual address 0xC0000000, it fails to load the actual function in this section, annotated as .kerntext, at that virtual address. Hence, after page table initialization, the code tries to jump to a function that has not been loaded.

In the following, you can find my linker script:

Code: Select all

ENTRY(_start)

SECTIONS
{
    . = 0x00100000;

    .text ALIGN(0x1000) :
    {
        *(.multiboot)
        *(.text)
        *(.gnu.linkonce.t*)
    }

    .rodata ALIGN(0x1000) :
    {
        PROVIDE (_srodata = .);
        
        start_ctors = .;
        *(SORT(.ctors*))
        end_ctors = .;

        start_dtors = .;
        *(SORT(.dtors*))
        end_dtors = .;
        
        *(.rodata*)
        *(.gnu.linkonce.r*)
        
        PROVIDE (_erodata = .);
    }

    .data ALIGN(0x1000) :
    {
        PROVIDE (_sdata = .);
        
        *(.data*)
        *(.gnu.linkonce.d*)
        
        PROVIDE (_edata = .);
    }

    .bss :
    {
        PROVIDE (_sbss = .);
        
        *(COMMON)
        *(.bss)
        *(.bootstrap_stack)
        /* *(.gdt_area)*/
        *(.gnu.linkonce.b*)
        
        PROVIDE (_ebss = .);
    }

    . = 0xC0000000 + 0x100000;

    /* the following section .kern is at the virtual address 0xC0100000 */

    .kern ALIGN(0x1000) :  AT(ADDR(.kern) - (0xC0000000)) { 
        PROVIDE (_skern = .);
        *(.kerntext)
        PROVIDE (_ekern = .);
    }

    /DISCARD/ :
    {
        *(.note)
        *(.comment)
        /* *(.eh_frame) */ /* discard this, unless you are implementing runtime support for C++ exceptions. */
    }
}
It would be great if you could help me with my concern :)
Thank you very much in advance.

Best regards.

Re: Linkers and the 'Load Memory Address'

Posted: Thu Apr 16, 2015 10:20 pm
by KemyLand
Continued from an ITC (Inter-Topic Communication) socket :wink: with http://f.osdev.org/viewtopic.php?f=1&t=29216
blacky wrote:Nice, thank you very much :)
Any additional information is considered as a potential learn-base and is very helpful, as I am a beginner when it comes to OS development ;)

Edit: As I have seen it in your code as well, I have right away a follow-up question concerning linkers and the higher-half kernel.
Since my follow-up question does not concern the initial issue of this thread, the follow-up question can be found here: Linkers and the 'Load Memory Address'.
Well, first of all, let's remember something:
sortie wrote: May I suggest you don't mix multiboot and higher half? It's not pretty when code has to relocate itself. I recommend you use a bootloader (perhaps a custom one) that enables paging when loading your kernel and maps it already to virtual locations. You pass the locations of where the paging tables are mapped and the memory map to the kernel, it can then bootstrap itself and take control of paging. You then simply link the kernel at whatever location you want it.
Sortie once said that in a thread I posted long ago. I didn't actually follow his advice, as the bootloader work wasn't pretty. I preferred to spend 3 days writing (and Holy Fuckin' debugging) that 300-line-long bootstrap code. I will follow him once I can afford it, but in the meantime, I (and you) shall use those dirty opcodes.

The code I gave you uses a symbol named KVIRT_BASE, defined as 0xC0000000. It's pretty obvious that you have to...

Code: Select all

# ...
movl $pageTables, %edi    # link-time (virtual) address of the page tables
subl $KVIRT_BASE, %edi    # minus 0xC0000000 gives the physical address
# ...
...every time you reference a variable. That's how my current code works.
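
For bootstrap code written in C or C++ rather than assembly, the same subtraction can be wrapped in a helper. This is only a minimal sketch, assuming the whole image is linked at KVIRT_BASE plus its physical address; the helper names virt_to_phys and phys_to_virt are hypothetical and not taken from the code referred to above:

```c
#include <stdint.h>

/* Assumption: same value as the KVIRT_BASE symbol mentioned above. */
#define KVIRT_BASE 0xC0000000u

/* Convert a link-time (virtual) address to the physical address the
 * image was loaded at. The result is usable as a pointer only while
 * paging is off or the range is identity-mapped. */
static inline uint32_t virt_to_phys(uint32_t vaddr)
{
    return vaddr - KVIRT_BASE;
}

/* Inverse conversion, for use once the higher-half mapping is active. */
static inline uint32_t phys_to_virt(uint32_t paddr)
{
    return paddr + KVIRT_BASE;
}
```

With these values, virt_to_phys(0xC0100000) yields 0x00100000, which is what the subl above computes for each symbol.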

But... I've just thought of a new solution. All structures that __start__ references shall be linked in a section named .bootdata, which contains the Page Directory and Tables. Some strange tricks would have to be done, and the linker script would be very complex. I did try to write it while writing this post, but it was too hackish. Better to stay with subtractions :roll: .

Re: Linkers and the 'Load Memory Address'

Posted: Fri Apr 17, 2015 1:58 am
by max
Meeeh.. that sounds dirty. When you're using Multiboot, the proper way is to create two binaries: the loader and the kernel. The loader has the multiboot header and is loaded to lower memory by GRUB. It sets up initial stuff & paging and loads the kernel (which is a multiboot module) directly to higher memory. It's really low effort and worth the time; the only additional work is that you have to parse the ELF header of your kernel binary.
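
Reading the entry point out of the ELF header can be sketched in a few lines of C. This is a minimal sketch, assuming a 32-bit little-endian ELF kernel image that is already in memory (for example as a Multiboot module). The field layout follows the ELF specification; the names elf32_ehdr and elf32_entry are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* ELF32 file header as laid out by the ELF specification (52 bytes). */
struct elf32_ehdr {
    uint8_t  e_ident[16];  /* magic, class, data encoding, ... */
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint32_t e_entry;      /* virtual address of the entry point */
    uint32_t e_phoff;      /* file offset of the program header table */
    uint32_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
};

/* Returns the entry point of a 32-bit little-endian ELF image,
 * or 0 if the buffer does not look like one. */
static uint32_t elf32_entry(const void *image, size_t size)
{
    const struct elf32_ehdr *eh = image;

    if (size < sizeof *eh)
        return 0;
    if (eh->e_ident[0] != 0x7F || eh->e_ident[1] != 'E' ||
        eh->e_ident[2] != 'L'  || eh->e_ident[3] != 'F')
        return 0;
    if (eh->e_ident[4] != 1 /* ELFCLASS32 */ ||
        eh->e_ident[5] != 1 /* ELFDATA2LSB */)
        return 0;
    return eh->e_entry;
}
```

A real loader would additionally walk the program header table at e_phoff and copy each PT_LOAD segment to its p_vaddr before jumping to the returned address.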

Re: Linkers and the 'Load Memory Address'

Posted: Fri Apr 17, 2015 3:27 am
by Combuster
In old-fashioned style, you can easily create a higher-half kernel directly from GRUB. If you use the minimal amount of code needed to enable paging and jump to the higher-half address, the net result is more efficient than having a whole additional bootstrap step with its own binary.

It does mean that by default everything has separate VMA (3GB) and LMA (1MB) addresses, as that is where you want to actually run your code. You won't be able to safely convince gcc to put everything in a .kern section, but you can put the multiboot header and ~25 lines of paging bootstrap in their own section and make that the only part to operate at LMA=VMA=1MB. If done properly you never need to add offsets to addresses. This mechanism is roughly what the higher-half tutorial demonstrates, except that it foregoes a separate bootstrap section for only two address conversions. It's certainly something completely different from what your linker script does.

Of course, if you need more than a page or two of bootstrap code because you want to choose between kernel binaries or do significant amounts of other initialisation before configuring the address space, then separating the bootstrap from the kernel is much better for your sanity. And even then you'll have a transition stage where both identity and higher-half maps are required at the same time, and you'll have to do explicit address juggling. But if you take care of it properly, you can reduce that to a bare minimum.
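
The transition stage with both maps active can be sketched in C. This is a minimal sketch, assuming 32-bit non-PAE paging with 4 KiB pages and a single page table that already identity-maps the first 4 MiB of physical memory; pd_index, map_boot_window and the flag macro names are hypothetical:

```c
#include <stdint.h>

#define PAGE_PRESENT  0x1u
#define PAGE_WRITABLE 0x2u

/* Index of the page-directory entry covering a virtual address
 * (32-bit paging, 4 KiB pages: the top 10 bits of the address). */
static inline uint32_t pd_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* Point both the identity slot and the higher-half slot at the same
 * page table, so the code that enables paging (still running at its
 * load address) and the later jump target in the higher half are
 * mapped at the same time. */
static void map_boot_window(uint32_t page_dir[1024], uint32_t table_phys,
                            uint32_t load_phys, uint32_t kernel_virt)
{
    uint32_t entry = (table_phys & 0xFFFFF000u) | PAGE_PRESENT | PAGE_WRITABLE;

    page_dir[pd_index(load_phys)]   = entry;  /* identity map   */
    page_dir[pd_index(kernel_virt)] = entry;  /* higher-half map */
}
```

For a load address of 0x00100000 and a kernel base of 0xC0100000 this fills entries 0 and 768. The identity entry can be cleared again once execution continues in the higher half.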



But to get a bit of experience with linker scripts, you might want to deal with the issue that one section overwrites another because both end up at the same physical address.
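
The collision can be checked with plain address arithmetic. In the first script, .text is placed at 0x00100000 with LMA equal to VMA, while AT(ADDR(.kern) - 0xC0000000) turns the .kern VMA of 0xC0100000 into an LMA of 0x00100000 as well. A minimal sketch in C, where the function names are hypothetical and the section sizes would have to come from the actual link map:

```c
#include <stdbool.h>
#include <stdint.h>

/* LMA that AT(ADDR(.kern) - 0xC0000000) yields for a given .kern VMA. */
static uint32_t kern_lma(uint32_t kern_vma)
{
    return kern_vma - 0xC0000000u;
}

/* True if the half-open ranges [a, a + a_size) and [b, b + b_size)
 * share at least one byte. Sizes are assumed not to wrap around. */
static bool lma_overlap(uint32_t a, uint32_t a_size,
                        uint32_t b, uint32_t b_size)
{
    return a < b + b_size && b < a + a_size;
}
```

Any non-empty .text at 0x00100000 therefore overlaps .kern in the loaded image. One way out is to give .kern a load address past the end of the low sections and to map its VMA to that address when setting up the page tables.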

Re: Linkers and the 'Load Memory Address'

Posted: Fri Apr 17, 2015 2:32 pm
by KemyLand
max wrote:Meeeh.. that sounds dirty. When you're using Multiboot, the proper way is to create two binaries: the loader and the kernel. The loader has the multiboot header and is loaded to lower memory by GRUB. It sets up initial stuff & paging and loads the kernel (which is a multiboot module) directly to higher memory. It's really low effort and worth the time; the only additional work is that you have to parse the ELF header of your kernel binary.
That's good, if you still want to have that complex and slow bootstrap process for the Ghost Kernel. You said its slowness is because of C++, but I don't think so (**** you, the C-syndromes who don't understand C++ can be faster than the ugly **** (only on big projects) of C), I think it's due to this :| . Let's do benchmarking (how do you do that on a modular microkernel?)

Re: Linkers and the 'Load Memory Address'

Posted: Fri Apr 17, 2015 2:43 pm
by max
KemyLand wrote:That's good, if you still want to have that complex and slow bootstrap process for the Ghost Kernel. You said its slowness is because of C++, but I don't think so (**** you, the C-syndromes who don't understand C++ can be faster than the ugly **** (only on big projects) of C), I think it's due to this :| . Let's do benchmarking (how do you do that on a modular microkernel?)
The bootstrap loader of Ghost only takes a very very short time to load the kernel binary, and then it completely transfers control to the kernel. The performance of Ghost itself is quite good - except for the UI. It's really hard to create a proper window management system that does event handling etc. while keeping it fast, but I've done my best and already made some optimizations (will be seen in further releases).

Re: Linkers and the 'Load Memory Address'

Posted: Fri Apr 17, 2015 3:08 pm
by blacky
Thank you all for your valuable contributions :)

KemyLand:
My current bootstrapping code is a mix of basic C++ and assembler code, which makes it a little bit harder to perform the additional subtractions. That is the reason why I tried to relocate the actual kernel code to the higher half directly within the linker script, separating the bootstrapping from the remaining kernel code. I started with this concept (C++ and asm mix of the bootstrapping code), as I thought it would be a good thing to play around with :)

The idea is to switch to long mode at some point. Until then, however, I wanted to experiment with 32-bit protected mode.

Combuster:
Of course, if you need more than a page or two of bootstrap code because you want to choose between kernel binaries or do significant amounts of other initialisation before configuring the address space, then separating the bootstrap from the kernel is much better for your sanity. And even then you'll have a transition stage where both identity and higher-half maps are required at the same time, and you'll have to do explicit address juggling. But if you take care of it properly, you can reduce that to a bare minimum.
I would like to stick with the 'slightly' bigger bootstrapping code solution, as I would always be able to extend it as I want to. As stated above, I use a kind of modular approach within the bootstrapping code, moving most of the functionality into C++ methods. I really like the solution presented by the Higher Half Tutorial for its efficiency; however, I believe it is not bad for a beginner to play around with the bootstrapping code and provide potential extensions :)

Max:
I really like your solution by having two binaries: one for the bootstrapping code and one for the actual kernel. This is the approach I started to follow. It was not that hard to create two separate binaries with appropriate linking. Right now I am stuck at the point of the transition between the bootstrapping and the actual kernel code - however I am on it ;)
---
EDIT:
I know you suggested to load the kernel binary manually, however I wanted to try this method first, before I write an actual bootloader.
---

The following is my current linking solution (I need to mention that, for readability reasons, I prefix the section and symbol names of the bootimg with .bootstrap using objcopy, which can be seen in the second linker script):

Script to link the bootimg:

Code: Select all

ENTRY(_start)

SECTIONS
{
    . = 0x00100000;

    .text ALIGN(0x1000) :
    {
        *(.multiboot)
        *(.text .text.* .gnu.linkonce.t.*)
    }
    
    .rodata ALIGN(0x10) :
    {
        PROVIDE (_srodata = .);
        
        start_ctors = .;
        *(SORT(.ctors*))
        end_ctors = .;

        start_dtors = .;
        *(SORT(.dtors*))
        end_dtors = .;
        
        *(.rodata*)
        *(.gnu.linkonce.r*)
        
        PROVIDE (_erodata = .);
    }

    .data ALIGN(0x10) :
    {
        PROVIDE (_sdata = .);
        
        *(.data*)
        *(.gnu.linkonce.d*)
        
        PROVIDE (_edata = .);
    }

    .bss ALIGN(0x10) :
    {
        /*
        *(.bootstrap.bss .bootstrap.bss.* .bootstrap.gnu.linkonce.b.*) 
        */
        *(.bss .bss.* .gnu.linkonce.b.*)
        *(.bootstrap.stack)
    }

    /DISCARD/ :
    {
        *(.note)
        *(.comment)
        /* *(.eh_frame) */ /* discard this, unless you are implementing runtime support for C++ exceptions. */
    }
}
Script to link the kernel:

Code: Select all

ENTRY(_start)

SECTIONS
{
    . = 0x00100000;

    .text_bootstrap ALIGN(0x1000) :
    {
        *(.bootstrap.multiboot)
        *(.bootstrap.text .bootstrap.text.* .bootstrap.gnu.linkonce.t.*)
        *(.bootstrap.rodata .bootstrap.rodata.* .bootstrap.gnu.linkonce.r.*)
        *(.bootstrap.data .bootstrap.data.* .bootstrap.gnu.linkonce.d.*)
    }

    .bss_bootstrap ALIGN(0x10) :
    {
        *(.bootstrap.bss .bootstrap.bss.* .bootstrap.gnu.linkonce.b.*)
    }
    
    . = 0xC0000000 + 0x100000;

    .text ALIGN(0x1000) : AT(ADDR(.text) - (0xC0000000 - 0x100000))
    {
        *(.text)
        *(.gnu.linkonce.t*)
    }

    .rodata ALIGN(0x1000) : AT(ADDR(.rodata) - (0xC0000000 - 0x100000))
    {
        PROVIDE (_srodata = .);
        
        start_ctors = .;
        *(SORT(.ctors*))
        end_ctors = .;

        start_dtors = .;
        *(SORT(.dtors*))
        end_dtors = .;
        
        *(.rodata*)
        *(.gnu.linkonce.r*)
        
        PROVIDE (_erodata = .);
    }

    .data ALIGN(0x1000) : AT(ADDR(.data) - (0xC0000000 - 0x100000))
    {
        PROVIDE (_sdata = .);
        
        *(.data*)
        *(.gnu.linkonce.d*)
        
        PROVIDE (_edata = .);
    }

    .bss : AT(ADDR(.bss) - (0xC0000000 - 0x100000))
    {
        PROVIDE (_sbss = .);
        
        *(COMMON)
        *(.bss)
        *(.gnu.linkonce.b*)
        
        PROVIDE (_ebss = .);
    }

    /DISCARD/ :
    {
        *(.note)
        *(.comment)
        /* *(.eh_frame) */ /* discard this, unless you are implementing runtime support for C++ exceptions. */
    }
}

Re: Linkers and the 'Load Memory Address'

Posted: Fri Apr 17, 2015 4:37 pm
by blacky
I got it to work. However, it is a little hacky at the moment: After having mapped the pages for the kernel virtual address space, I initialize a function pointer to the virtual address of the _entry function of the kernel and subsequently call it. And the actual _entry function is not exactly placed at 0xC0100000 as initially assumed but a few bytes later (0xC0100018). Since these are two different binaries linked together, my solution seems a bit messy to me right now. Maybe it is worth considering loading and parsing the kernel binary within the bootstrapping code. This way you know exactly where the entry point is going to be.

I am wondering how to find out at which address an additional multiboot module (e.g. the kernel binary, as proposed by max) is loaded to memory?
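
As far as the Multiboot (version 1) specification describes it, the boot loader reports module addresses in the information structure whose physical address it leaves in EBX: if bit 3 of flags is set, mods_count and mods_addr describe a table of module entries. A minimal sketch in C, where the struct and function names are hypothetical and the module table pointer is passed in separately so that the caller decides how to turn the physical mods_addr into a usable pointer:

```c
#include <stdbool.h>
#include <stdint.h>

#define MB_INFO_FLAG_MODS (1u << 3)  /* mods_count / mods_addr are valid */

/* Leading fields of the Multiboot (version 1) information structure. */
struct mb_info {
    uint32_t flags;
    uint32_t mem_lower;
    uint32_t mem_upper;
    uint32_t boot_device;
    uint32_t cmdline;
    uint32_t mods_count;  /* number of entries in the module table */
    uint32_t mods_addr;   /* physical address of the module table */
};

/* One entry of the module table. */
struct mb_module {
    uint32_t mod_start;   /* physical start address of the module */
    uint32_t mod_end;     /* physical end address of the module */
    uint32_t string;      /* physical address of the module's string */
    uint32_t reserved;
};

/* Fetches the physical address range of module `index`.
 * Returns false if no module information is present or the index
 * is out of range. */
static bool mb_module_range(const struct mb_info *info,
                            const struct mb_module *mods,
                            uint32_t index,
                            uint32_t *start, uint32_t *end)
{
    if (!(info->flags & MB_INFO_FLAG_MODS) || index >= info->mods_count)
        return false;
    *start = mods[index].mod_start;
    *end   = mods[index].mod_end;
    return true;
}
```

In a 32-bit bootstrap running with paging off, or with low memory identity-mapped, mods would simply be (const struct mb_module *)(uintptr_t)info->mods_addr.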

EDIT:
Again, my post was too early: I accidentally mapped the virtual address 0xC0100000 to the physical address 0x00. This means I mapped my initial bootstrapping code to the virtual address in the higher half. As a result, it was just luck that the address 0xC0100018 had a similar test loop in it, which the kernel accessed. Mapping the virtual address 0xC0100000 to an unused physical address (e.g. 0x400000), however, resulted in a crash... I will check my memory mapping function...

UPDATE:
Now, my code seems to work. The issue was that - for some reason - after invoking a function with a pointer as argument (call-by-reference), the pointer did not keep its changes after the function returned...
Anyway, it works now. Thank you very much for your support guys :)

Re: Linkers and the 'Load Memory Address'

Posted: Sat Apr 18, 2015 2:07 am
by KemyLand
max wrote:
KemyLand wrote:That's good, if you still want to have that complex and slow bootstrap process for the Ghost Kernel. You said its slowness is because of C++, but I don't think so (**** you, the C-syndromes who don't understand C++ can be faster than the ugly **** (only on big projects) of C), I think it's due to this :| . Let's do benchmarking (how do you do that on a modular microkernel?)
The bootstrap loader of Ghost only takes a very very short time to load the kernel binary, and then it completely transfers control to the kernel. The performance of Ghost itself is quite good - except for the UI. It's really hard to create a proper window management system that does event handling etc. while keeping it fast, but I've done my best and already made some optimizations (will be seen in further releases).
That's just a suspicion. Anyway, I like Ghost overall, and would like to see those optimizations :) .