Upper half kernel

Post by vhg119 »

Hi everyone. I'm reading this tutorial (http://www.osdev.org/wiki/Higher_Half_With_GDT) and I'm trying to wrap my mind around how it works.

This is the part that I'm confused about:

Code:

start:
	; here's the trick: we load a GDT with a base address
	; of 0x40000000 for the code (0x08) and data (0x10) segments
	lgdt [trickgdt]
	mov ax, 0x10
	mov ds, ax
	mov es, ax
	mov fs, ax
	mov gs, ax
	mov ss, ax

	; jump to the higher half kernel
	jmp 0x08:higherhalf

higherhalf:
	; from now the CPU will translate automatically every address
	; by adding the base 0x40000000

	mov esp, sys_stack ; set up a new stack for our kernel

	call kmain ; jump to our C kernel ;)

The trickgdt entries place the base of the segment at 0x40000000.
The author loaded the trickgdt and used it before he even set up paging.

How is this possible? Wouldn't the processor try to access the physical address 0x40000000, which would result in some error?

Vince
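
For reference, a descriptor's base address is scattered across three fields of its 8 bytes. Below is a minimal sketch of the packing, written in Python only to check the bit layout. The thread does not show the tutorial's actual trickgdt bytes, so the limit (0xFFFFF), the access bytes (0x9A for code, 0x92 for data) and the flags (0xC) are assumptions picked as typical flat ring-0 values, and encode_descriptor is a hypothetical helper name.

```python
def encode_descriptor(base, limit, access, flags):
    """Pack a 32-bit protected-mode segment descriptor into 8 bytes."""
    if not 0 <= base <= 0xFFFFFFFF:
        raise ValueError("base must fit in 32 bits")
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit must fit in 20 bits")
    return bytes([
        limit & 0xFF,                                  # limit bits 0-7
        (limit >> 8) & 0xFF,                           # limit bits 8-15
        base & 0xFF,                                   # base bits 0-7
        (base >> 8) & 0xFF,                            # base bits 8-15
        (base >> 16) & 0xFF,                           # base bits 16-23
        access & 0xFF,                                 # access byte
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),  # flags + limit bits 16-19
        (base >> 24) & 0xFF,                           # base bits 24-31
    ])

# Assumed values: flat 4 GiB segments, ring 0, base 0x40000000.
code_seg = encode_descriptor(0x40000000, 0xFFFFF, 0x9A, 0xC)
data_seg = encode_descriptor(0x40000000, 0xFFFFF, 0x92, 0xC)
print(code_seg.hex())  # ffff0000009acf40
print(data_seg.hex())  # ffff00000092cf40
```

With a base of 0x40000000, only the last byte differs from an ordinary flat (base 0) descriptor.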

Post by jnc100 »

vhg119 wrote: Wouldn't the processor try to access the physical address 0x40000000, which would result in some error?
Yes, if you ever access offset 0x0 from your code. A higher half kernel (actually, in this case it's the upper 1GB, so higher quarter is probably a more appropriate term) is linked such that the code and data start at 0xC0000000. In a segment with base 0x40000000, offset 0xC0000000 = linear (physical, without paging) address 0x0, because 0x40000000 + 0xC0000000 = 0x100000000, which overflows a 32-bit integer and wraps around to 0x0.

Later on, of course, you accomplish the same with paging, and can set your segments bases back to 0.

Regards,
John.
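
The wraparound described above can be checked with a few lines of arithmetic. This is only a sketch of the base + offset addition truncated to 32 bits, not a model of the whole segmentation unit (no limit or privilege checks). linear_address is a hypothetical helper, and the 0x100000 load address is an assumption.

```python
MASK32 = 0xFFFFFFFF

def linear_address(segment_base, offset):
    """Return base + offset truncated to 32 bits, as the CPU forms it."""
    return (segment_base + offset) & MASK32

TRICK_BASE = 0x40000000  # segment base from the trick GDT
LINK_BASE = 0xC0000000   # address the kernel is linked at
LOAD_ADDR = 0x00100000   # assumed physical load address (1 MiB)

# A symbol linked at 0xC0000000 lands on linear address 0x0.
print(hex(linear_address(TRICK_BASE, LINK_BASE)))              # 0x0
# A symbol linked at 0xC0100000 lands on the assumed load address.
print(hex(linear_address(TRICK_BASE, LINK_BASE + LOAD_ADDR)))  # 0x100000
# With base 0 (after paging takes over) the offset passes through unchanged.
print(hex(linear_address(0x0, LINK_BASE + LOAD_ADDR)))         # 0xc0100000
```

So nothing has to be copied to 0x40000000: the segment base only cancels out the 0xC0000000 that the linker added to every address.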

Post by JAAman »

jnc100 is correct, though I have to say I don't like the 'GDT trick' method, as it is unnecessary. The only reason to use it is to reduce the amount of ASM code you have to write, but it generally requires more ASM than skipping it altogether.
vhg119 wrote: How is this possible? Wouldn't the processor try to access the physical address 0x40000000, which would result in some error?
You will never get an error from trying to read/write memory that doesn't exist. This is important to understand: you will not get valid results, but you won't get an error. However, if you miss RAM you may hit hardware, which can (although rarely) permanently destroy your computer. So make sure you know exactly where your RAM is located, and don't allow anything to write to areas if you don't know what is located there. This is why I don't allow my OS to use more than 64MB (less on older systems) on any system that doesn't support e820: all the other memory detection methods are allowed to return misleading results under certain (rare) conditions.

Post by vhg119 »

I guess my question is, how would the code in that tutorial work?

When the code does this
jmp 0x08:higherhalf
and the descriptor at 0x08 indicates that the base is at 0x40000000, wouldn't he effectively be jumping to 0x40000000 + higherhalf?

Looking at the code, it doesn't look like he copied the kernel to 0x40000000, so there isn't anything at that address.

Furthermore, paging isn't enabled yet, so 0x40000000 is mapped directly to the identical physical address.

Also, 'higherhalf' is in the same module as start and I don't see any 'org' statements anywhere, so I don't think he used a linker trick.

Post by jnc100 »

Look at the linker script again...

Regards,
John.

Post by vhg119 »

I'm looking at it again right now.

If I'm interpreting this correctly, and I probably am not, he sets the VMA of the .text section to 0xC0000000. Then he sets the LMA of the .text section to 0x100000 + sizeOfSetupSection.

I'm reading this to learn more about LD:
http://jamesthornton.com/redhat/linux/E ... tions.html

However, I couldn't find a simple (for me to understand) explanation of what the VMA and the LMA are.

Could someone please explain?

Post by vhg119 »

I just saw that the output format is ELF. I was under the impression that kernels needed to be in a binary format. Otherwise, what system is loading the kernel specified by the ELF header?

Post by AJ »


Post by vhg119 »

Crap. That makes sense. I'm using my own bootloader though.

Is there a way to make a higher half kernel using the binary output file format?

Post by AJ »

I think that if you want to do that, you will need to set up the GDT trick in your boot loader (or why not just set up paging?), then jump to your higher-half kernel.

IIRC, if you try to make a flat binary with bits linked to run at (say) 0x100000 and 0xC0000000, the assembler will attempt to pad the space in between (all 3GB worth) with zeros.

Cheers,
Adam

Post by vhg119 »

AJ wrote: if you try to make a flat binary with bits linked to run at (say) 0x100000 and 0xC0000000, the assembler will attempt to pad the space in between (all 3GB worth) with zeros.
That sucks

Post by vhg119 »

I think I'm gonna give up on the higher half kernel for now and just make it a lower half kernel.

I just don't know enough about LD and the linking process and theories yet.

Post by AndrewAPrice »

vhg119 wrote:
AJ wrote: if you try to make a flat binary with bits linked to run at (say) 0x100000 and 0xC0000000, the assembler will attempt to pad the space in between (all 3GB worth) with zeros.
That sucks
Use ELF ;)

Post by frank »

AJ wrote: I think that if you want to do that, you will need to set up the GDT trick in your boot loader (or why not just set up paging?), then jump to your higher-half kernel.

IIRC, if you try to make a flat binary with bits linked to run at (say) 0x100000 and 0xC0000000, the assembler will attempt to pad the space in between (all 3GB worth) with zeros.

Cheers,
Adam
Not true. I use a binary kernel. The first 4 KB of my kernel is an assembly stub that sets up protected mode and paging and jumps to the rest of the kernel. Here is my linker script:

Code:

ENTRY( kernel_start )

OUTPUT_FORMAT( binary )

SECTIONS
{
        . = 0x1000;
        
        .start :
        {
                *(.start)

                . = ALIGN( 4096 );
        }

        .text 0xD0000000 + SIZEOF( .start ) : AT( ADDR( .start ) + SIZEOF( .start ) )
        {
                *(.text)
                . = ALIGN( 4096 );

        }
        
        .data ADDR( .text ) + SIZEOF( .text ) : AT( LOADADDR( .text ) + SIZEOF( .text ) )
        {
                *(.data)

        }
        
        . = ALIGN( 4096 );
        
        .bss ALIGN( ADDR( .data ) + SIZEOF( .data ), 4096 ): AT( LOADADDR( .data ) + SIZEOF( .data ) )
        {
                *(.bss .bss.* .gnu.linkonce.b.*)
                *(COMMON)

        }
}

As I see it, the VMA is the address the code thinks it is going to be running at, and the LMA is the address where it will actually be loaded. Of course, most of the time they are the same. You only really need to worry about the LMA when you need to influence the placement of the sections in the output, which is useful when you are trying to avoid a 3 GB file.
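
To see why the AT( ... ) expressions keep the file small, here is a rough model of the layout that the script above describes. It is a sketch under stated assumptions, not a reimplementation of ld: the section sizes are made up, layout and image_size are hypothetical helpers, and the model assumes a flat binary spans from the lowest load address to the end of the highest-loaded section.

```python
PAGE = 4096

def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def layout(start_size, text_size, data_size, use_at=True):
    """Return {name: (vma, lma, size)} for .start, .text and .data.

    Mirrors the address expressions in the linker script; with
    use_at=False the load address is simply the link address.
    """
    start_size = align_up(start_size, PAGE)  # . = ALIGN( 4096 ) inside .start
    text_size = align_up(text_size, PAGE)    # . = ALIGN( 4096 ) inside .text
    start_vma = start_lma = 0x1000           # . = 0x1000
    text_vma = 0xD0000000 + start_size
    text_lma = start_lma + start_size if use_at else text_vma
    data_vma = text_vma + text_size
    data_lma = text_lma + text_size if use_at else data_vma
    return {
        ".start": (start_vma, start_lma, start_size),
        ".text": (text_vma, text_lma, text_size),
        ".data": (data_vma, data_lma, data_size),
    }

def image_size(sections):
    """Bytes from the lowest LMA to the end of the highest-loaded section."""
    lowest = min(lma for _, lma, _ in sections.values())
    highest = max(lma + size for _, lma, size in sections.values())
    return highest - lowest

# Hypothetical section sizes, for illustration only.
with_at = layout(0x800, 0x3000, 0x500, use_at=True)
without_at = layout(0x800, 0x3000, 0x500, use_at=False)
print(hex(image_size(with_at)))     # 0x4500
print(hex(image_size(without_at)))  # 0xd0003500
```

In this model the sections are packed back to back when AT is used, while letting the LMA follow the VMA produces an image of more than 3 GB.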

Post by vhg119 »

frank wrote:
.text 0xD0000000 + SIZEOF( .start ) : AT( ADDR( .start ) + SIZEOF( .start ) )

Thanks, Frank. I'm still trying to learn more about LD. Does the line I've quoted above basically say...

Resolve addresses for this section as if it began at 0xD0000000 + sizeof(.start), BUT produce the output with the .text section positioned right after .start? In other words, don't pad the space between .start and .text?
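
That reading matches the VMA/LMA explanation earlier in the thread, and it can be spelled out as arithmetic. The sketch below assumes .start is padded to exactly one 4 KiB page (frank describes the stub as the first 4 KB of the kernel); file_offset is a hypothetical helper and the symbol offset is invented for illustration.

```python
def file_offset(symbol_vma, section_vma, section_lma, image_base_lma):
    """Byte position in the flat binary of a symbol linked at symbol_vma."""
    return (symbol_vma - section_vma) + (section_lma - image_base_lma)

START_LMA = 0x1000                   # '. = 0x1000' in the script
START_SIZE = 0x1000                  # assumed: .start padded to one page
TEXT_VMA = 0xD0000000 + START_SIZE   # .text 0xD0000000 + SIZEOF( .start )
TEXT_LMA = START_LMA + START_SIZE    # AT( ADDR( .start ) + SIZEOF( .start ) )

# Hypothetical symbol 0x40 bytes into .text.
symbol_vma = TEXT_VMA + 0x40
print(hex(symbol_vma))                                              # 0xd0001040
print(hex(file_offset(symbol_vma, TEXT_VMA, TEXT_LMA, START_LMA)))  # 0x1040
```

The symbol is referenced in code at 0xD0001040 but sits 0x1040 bytes into the file, directly after .start, with no padding in between.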