Clarifying higher half kernel approach & the GDT trick

Posted: Sat Jun 20, 2020 11:54 am
by LukeyTheKid
I have been trying to figure out the right approach for implementing a higher-half kernel with the GDT/IDT properly set and paging enabled. My current implementation works, but I wanted to clear a few things up...and hopefully provide an answer for someone googling this question in the future :)

Just to preface - this is a lot of questions, and I'm definitely not expecting one person to provide some huge in-depth explanation. I've done a fair amount of reading the wiki and going over relevant Intel docs for these operations so I have a rough idea of how this works, but it's a lot to take in. What I'm trying to accomplish here is just get a slightly more holistic view of memory setup.

The way I've set up my paging (a-la barebones higher half) is to:
  * Link the kernel at 0xC0000000
  * Set up a page directory
  * Set up the first page table
  * Identity-map the kernel in that page table
  * Use that table to set PD entries 0 and 768 (768 being the entry that covers address 0xC0000000)
  * Enable paging
  * Reset the first page directory entry to 0, now that EIP is using virtual addresses
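The steps in the list above can be sketched in C roughly as follows. This is a hosted illustration, not boot code: the names `page_directory`, `first_page_table`, `build_boot_tables` and `drop_identity_mapping` are hypothetical, mapping the whole first 4 MiB is an assumption borrowed from the usual bare-bones approach, and the physical address of the page table is passed in as a parameter because a kernel linked at 0xC0000000 has to work that address out itself before paging is on.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_PRESENT     0x1u
#define PAGE_WRITABLE    0x2u
#define PAGE_SIZE        0x1000u
#define KERNEL_VIRT_BASE 0xC0000000u

/* Hypothetical boot-time tables; real ones must be 4 KiB aligned. */
static uint32_t page_directory[1024];
static uint32_t first_page_table[1024];

/* Each page directory entry covers 4 MiB, so the index is the top 10 bits. */
static uint32_t pd_index(uint32_t vaddr) {
    return vaddr >> 22;
}

/* Fill one page table that maps the first 4 MiB of physical memory 1:1,
 * then install it at PD[0] (identity) and PD[768] (higher half). */
static void build_boot_tables(uint32_t page_table_phys) {
    assert((page_table_phys & (PAGE_SIZE - 1u)) == 0); /* must be page aligned */

    for (uint32_t i = 0; i < 1024; i++) {
        first_page_table[i] = (i * PAGE_SIZE) | PAGE_PRESENT | PAGE_WRITABLE;
    }

    uint32_t pde = page_table_phys | PAGE_PRESENT | PAGE_WRITABLE;
    page_directory[0] = pde;
    page_directory[pd_index(KERNEL_VIRT_BASE)] = pde;
}

/* Called once EIP is running in the higher half. */
static void drop_identity_mapping(void) {
    page_directory[0] = 0;
    /* A real kernel would also flush the TLB here, e.g. by reloading CR3. */
}
```

Both directory entries point at the same table, which is why the kernel is reachable at 0x00100000 and at 0xC0100000 until the first entry is cleared.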


Here is what I have been trying to figure out:

(1) GDT & IDT Addressing

Before setting up paging, I had already implemented the GDT/IDT setup in my kernel, which at the time used linear addresses. I was under the impression that the GDTR needs a linear address (at least, that's what it seems like from the Intel Manual description, and online guides I've read say the same thing). But after I implemented paging, I wanted to see what would happen with my current setup, since I don't set the GDT and IDT until my kernel main routine -- which occurs after paging. I booted the kernel in Bochs, and sure enough, I'm loading the GDTR/IDTR with virtual addresses (e.g. for GDT, the address is 0xc0105020). But it seems to work; here is my GDT after setting the segment registers and performing a long jump:

Code:

Global Descriptor Table (base=0x00000000c0105040, limit=39):
GDT[0x0000]=??? descriptor hi=0x00000000, lo=0x00000000
GDT[0x0008]=Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, Non-Conforming, 32-bit
GDT[0x0010]=Data segment, base=0x00000000, limit=0xffffffff, Read/Write
GDT[0x0018]=Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, Non-Conforming, 32-bit
GDT[0x0020]=Data segment, base=0x00000000, limit=0xffffffff, Read/Write
This surprised me a bit, but I'm not going to complain when things work! However, I wanted to see what the general opinion is on the "right" thing to do - even though this works for now, I don't want to get into trouble down the line, and in general I'm curious to see what's going on here. I think I know the answers to some of these, but as I said, for completeness' sake I'd like to try and map out the whole picture.


(A) Does order matter in terms of setup? Should I be setting GDT and/or IDT first? Currently I set up my GDT and IDT in C (for the most part), calling them from kernel main after paging. I don't set up my stack until paging is enabled, and I don't think doing so earlier would accomplish anything -- since I link at 0xC0000000, the stack top value is 0xc0109070. I assume that calling C functions with that invalid linear stack address before enabling paging would crash. So I think moving those GDT/IDT setup calls to occur before paging could be tricky. I could re-implement my GDT setup in assembly without too much trouble, but the IDT would be tougher.

(B) After I enable paging, I suppose I could still get the physical address of the GDT/IDT if I wanted to by taking their virtual address and manually extracting PD entry --> PT entry --> physical address. If I keep my current setup order, is it better to perform this bit-shifting and use the physical address to load the registers, even though the virtual address seems to be working properly?
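The manual walk described in (B) can be sketched like this. It is only an illustration of the bit-shifting: `virt_to_phys` is a hypothetical helper, and it assumes a single page table (as in the setup above) that the caller can read directly. As the replies further down point out, the GDTR wants the linear address, so this translation is not something `lgdt` needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_PRESENT 0x1u

/* Translate vaddr through a two-level 32-bit page structure (4 KiB pages).
 * Simplification: every present PDE is assumed to refer to the single
 * table `pt`, which matches a boot setup that has only one page table. */
static bool virt_to_phys(const uint32_t pd[1024], const uint32_t pt[1024],
                         uint32_t vaddr, uint32_t *paddr) {
    uint32_t pd_idx = vaddr >> 22;            /* bits 31..22 select the PDE */
    uint32_t pt_idx = (vaddr >> 12) & 0x3FFu; /* bits 21..12 select the PTE */
    uint32_t offset = vaddr & 0xFFFu;         /* bits 11..0 stay unchanged  */

    assert(paddr != NULL);

    if (!(pd[pd_idx] & PAGE_PRESENT)) {
        return false;                         /* no page table for this 4 MiB */
    }
    if (!(pt[pt_idx] & PAGE_PRESENT)) {
        return false;                         /* page itself is not mapped */
    }

    *paddr = (pt[pt_idx] & 0xFFFFF000u) | offset;
    return true;
}
```

With PD entry 768 pointing at an identity-style table, the GDT base 0xc0105040 from the Bochs output would come out as physical 0x00105040, and the same lookup for 0x00105040 fails once PD entry 0 has been cleared.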

(C) If I do set up GDT/IDT before paging, using their linear addresses, do I need to manage them in any way after enabling paging? I should note that after I enable paging, I un-map the identity page table (PD entry 0), so that 1:1 mapping is gone -- virtual address 0x00100000 is no longer mapped to that physical memory. Will segment operations ignore paging and just go straight to the physical address loaded in the GDTR? I would think so, but currently there's a virtual address in there and that is working...
(D) On that note, since my current setup DOES work - how is the GDTR working with a virtual address? Or does the lgdt/lidt instruction itself perform the translation from virtual to physical, and just load the physical address?


(2) Does the GDT trick accomplish anything not satisfied by the approach described above?

While I'm on the subject of setting up the GDT for higher half kernels, I wanted to clear up a few thoughts on the GDT trick. Through my own searches on the subject of higher-half kernels, I keep running into references to this approach, though it seems like the actual HigherHalfGDT page, from ~2005, is gone. Is this now seen as a bad approach?

From what I gather, the idea is that by setting the GDT base of the kernel code/data segments to 0x40000000 instead of 0x0, any higher-half virtual memory references above 0xC0000000 will wrap back around because of 32-bit overflow (e.g. 0x40000000:0xC0100000 --> 0x00100000, which gives us the physical address of our kernel code). This way, we just have to do the identity-mapping of the kernel, and don't have to bother setting the PD[768] entry / going through my approach above. I just wanted to make sure I had the right idea, and see if there is any other benefit. Several people refer to it as a "hack", the page is gone - and it doesn't seem like it saves that much work/complexity - so I just wanted to see what the general consensus was in 2020.
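The wraparound arithmetic behind the trick can be checked with a few lines of C. This only models the segmentation step (linear = base + offset, truncated to 32 bits); `logical_to_linear` and `gdt_trick_example` are hypothetical names, and the sketch says nothing about how any particular CPU behaves beyond that sum.

```c
#include <assert.h>
#include <stdint.h>

/* Segmentation step only: linear = segment base + offset.
 * uint32_t arithmetic wraps modulo 2^32, which is the overflow
 * the GDT trick relies on. */
static uint32_t logical_to_linear(uint32_t segment_base, uint32_t offset) {
    return segment_base + offset;
}

static void gdt_trick_example(void) {
    /* Trick segment: base 0x40000000, kernel linked at 0xC0100000. */
    assert(logical_to_linear(0x40000000u, 0xC0100000u) == 0x00100000u);

    /* Flat segment: base 0, so the offset is already the linear address. */
    assert(logical_to_linear(0x00000000u, 0xC0100000u) == 0xC0100000u);
}
```

With the trick segment the linear address equals the physical load address, so no higher-half mapping is needed; with a flat segment the same offset stays at 0xC0100000 and has to be mapped by paging instead.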


Just to re-iterate -- I know this is a lot of questions! Just want to try and hash this out a bit. Thanks in advance, and I promise I am working on answering these myself.

Re: Clarifying higher half kernel approach & the GDT trick

Posted: Sat Jun 20, 2020 2:01 pm
by LukeyTheKid
I was under the impression that the GDTR needs a linear address
Ah, this is where a lot of confusion comes from. I was correct that it needs a linear address, but the virtual address is a linear address (I was thinking it needs a physical address).
(B) After I enable paging, I suppose I could still get the physical address of the GDT/IDT if I wanted to by taking their virtual address and manually extracting PD entry --> PT entry --> physical address. If I keep my current setup order, is it better to perform this bit-shifting and use the physical address to load the registers, even though the virtual address seems to be working properly?

(C) If I do set up GDT/IDT before paging, using their linear addresses, do I need to manage them in any way after enabling paging? I should note that after I enable paging, I un-map the identity page table (PD entry 0), so that 1:1 mapping is gone -- virtual address 0x00100000 is no longer mapped to that physical memory. Will segment operations ignore paging and just go straight to the physical address loaded in the GDTR? I would think so, but currently there's a virtual address in there and that is working...
With the above in mind, I don't think there is anything to be done -- in fact I think that providing a physical address might be the one thing that doesn't work, if paging is indeed applied to the address stored in the GDTR.

Re: Clarifying higher half kernel approach & the GDT trick

Posted: Sat Jun 20, 2020 3:29 pm
by nullplan
The address in the GDTR doesn't actually matter until a segment register gets reloaded. Such a reload will happen on external interrupt or exception, on interrupt return, and when segment registers are explicitly loaded with a value. So the meaning of the address in the GDTR at the time of loading the GDTR is immaterial. It only starts mattering once actually referenced.

When the value is referenced, it is decoded like all other linear addresses, according to the current processor mode. That is, before enabling paging, the address is used as physical address directly, and after that, it is used as virtual address.
LukeyTheKid wrote:(A) Does order matter in terms of setup? Should I be setting GDT and/or IDT first?
Order is everything: You must set up paging first. Before that, addresses are not going to mean what the linker thinks they mean.
LukeyTheKid wrote:Currently I setup my GDT and IDT in C (for the most part), calling them from kernel main after paging. I don't set up my stack until paging is enabled, and I don't think doing so earlier would accomplish anything -- since I link at 0xC0000000, the stack top value is 0xc0109070. I assume that calling C functions with that invalid linear stack address before enabling paging would crash.
Completely correct. The C code will try to access the stack and that will fail while the address is actually MMIO. And even if it isn't, you would move the stack once paging is enabled, and that is a bit of a problem.
LukeyTheKid wrote:I just wanted to see what the general consensus was in 2020.
Oof, you are asking for consensus in a web forum made of very disparate people. I personally don't think the hack accomplishes much. It allows you to set up the GDT and IDT before enabling paging, sure, but most will simply do it the other way around. Once paging is working (enough to map the kernel), there is no great need to debug it. And before that, tools like the Bochs debugger can help. However, my perspective might be biased from the fact that I'm working on a pure 64-bit OS, where nonzero base addresses are impossible, and address space is plentiful.

Re: Clarifying higher half kernel approach & the GDT trick

Posted: Sat Jun 20, 2020 3:36 pm
by thewrongchristian
LukeyTheKid wrote:
I was under the impression that the GDTR needs a linear address
Ah, this is where a lot of confusion comes from. I was correct that it needs a linear address, but the virtual address is a linear address (I was thinking it needs a physical address).
I think in x86 parlance, the virtual address is the address within a segment. The linear address is the result of the virtual address being offset with the segment base, then that linear address is converted to a physical address using the page table (if enabled.)

So strictly, the virtual address is not the linear address. The GDTR does need the linear address. Of course, in most sane environments, segments start at offset 0, so virtual address == linear address.
LukeyTheKid wrote:
(B) After I enable paging, I suppose I could still get the physical address of the GDT/IDT if I wanted to by taking their virtual address and manually extracting PD entry --> PT entry --> physical address. If I keep my current setup order, is it better to perform this bit-shifting and use the physical address to load the registers, even though the virtual address seems to be working properly?

(C) If I do set up GDT/IDT before paging, using their linear addresses, do I need to manage them in any way after enabling paging? I should note that after I enable paging, I un-map the identity page table (PD entry 0), so that 1:1 mapping is gone -- virtual address 0x00100000 is no longer mapped to that physical memory. Will segment operations ignore paging and just go straight to the physical address loaded in the GDTR? I would think so, but currently there's a virtual address in there and that is working...
With the above in mind, I don't think there is anything to be done -- in fact I think that providing a physical address might be the one thing that doesn't work, if paging is indeed applied to the address stored in the GDTR.
The reason it appears to work is because you've not reloaded any segment cache registers. The GDT is only referenced when you change a segment register to trigger reloading the segment descriptor. At that point, you'd have, I think, a triple fault as the CPU would not be able to reload its segment information.

So, early on in your higher half runtime, you'll want to reload your GDTR with the new higher half linear address of your GDT. Do it before you enable interrupts or otherwise do anything that might trigger a segment reload.

Re: Clarifying higher half kernel approach & the GDT trick

Posted: Sun Jun 21, 2020 8:39 am
by Octocontrabass
LukeyTheKid wrote:(A) Does order matter in terms of setup?
Yes, and it sounds like the way you're doing things now is fine.
LukeyTheKid wrote:(2) Does the GDT trick accomplish anything not satisfied by the approach described above?
The GDT trick makes it possible to execute code at the correct virtual address before you've enabled paging.
LukeyTheKid wrote:Is this now seen as a bad approach?
The biggest flaw I see is that it's not applicable to 64-bit mode, so when you decide you need a 64-bit kernel you have to throw it away and set up page tables before jumping to your higher-half code like everyone else.

It also has the potential to cause compatibility issues. Neither Intel nor AMD explicitly describe what happens when the virtual address translates to a linear address above 4GB.

It's also somewhat limiting. For example, if you want to have an especially large gap in your kernel's virtual address space for some reason, you need enough contiguous physical memory to cover that gap.

(And just to add insult to injury, the example inline assembly is missing a clobber.)

Re: Clarifying higher half kernel approach & the GDT trick

Posted: Sun Jun 21, 2020 9:54 am
by LukeyTheKid
I think in x86 parlance, the virtual address is the address within a segment. The linear address is the result of the virtual address being offset with the segment base, then that linear address is converted to a physical address using the page table (if enabled.)

So strictly, the virtual address is not the linear address. The GDTR does need the linear address. Of course, in most sane environments, segments start at offset 0, so virtual address == linear address.
I use GRUB as my bootloader, so when my kernel starts it does begin with a segment offset of 0 (set up by GRUB - I have not touched the GDT at this point), so I guess that makes sense that the virtual address I'm passing is in effect the linear address. I suppose if I used my own bootloader I would have to make sure I set it up this way as well.

It seems that it's rare to explicitly use a logical address when writing code, and the DS/CS/SS are just implied unless you're doing something like a long jump to load the CS. In other words, if I'm directly loading from an address -- either from C with de-referencing a pointer, or doing something like 'movl 0x1234, %eax' -- the only component I'm actually referring to is the offset, and the base will just be whatever the relevant GDT entry holds. The entry depends on whichever selector is provided based on the operation (CS for code / DS for data / SS for stack).

If this is true, then it seems like once we're in protected mode, we have a bit of a chicken-and-egg problem with the GDT - any address we refer to will have that implicit selector, so if the corresponding base is non-zero, then what we're passing to LGDT is not actually a linear address but a logical one. So how do we provide a linear address if we're working with logical addresses? Even if we DO have a zero-base in the relevant descriptor, we're still technically providing a logical address (which conveniently happens to be translated into an identical physical address).

Following that train of thought, I'm guessing the answer is something along these lines: Yes, the GDT Register needs a linear address. However, the "Load GDT" instruction takes a logical address, and this operation actually performs the step of transforming the given logical address into a linear address, which is loaded into the GDT Register.

On a side note - I guess for GRUB itself, it starts off in real mode so addresses are just physical, and it doesn't have to worry about any of this?
The reason it appears to work is because you've not reloaded any segment cache registers. The GDT is only referenced when you change a segment register to trigger reloading the segment descriptor. At that point, you'd have, I think, a triple fault as the CPU would not be able to reload its segment information.

So, early on in your higher half runtime, you'll want to reload your GDTR with the new higher half linear address of your GDT. Do it before you enable interrupts or otherwise do anything that might trigger a segment reload.
^ Just so I'm clear on what you're saying - this is in reference to the idea of using physical addresses to load the GDT, correct? Currently I set up paging first, and only after I'm in my higher half runtime do I load the GDTR with the virtual address of my GDT (which is, as you mentioned, effectively linear since it has a zero offset). Then I reload all of the segment registers. Or are you saying there's an issue with the way I'm doing it now?

Re: Clarifying higher half kernel approach & the GDT trick

Posted: Sun Jun 21, 2020 11:11 am
by Octocontrabass
LukeyTheKid wrote:So how do we provide a linear address if we're working with logical addresses?
The operand to the LGDT instruction is a logical address. The value in memory at that logical address is the linear address to put into the GDTR. (Actually there's two values in memory, since the GDTR needs both the base and the limit.)
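A minimal sketch of those two values in memory, assuming GCC or Clang (for `__attribute__((packed))`) and 32-bit protected mode. The field names follow the `gdt_ptr.size` and `gdt_ptr.offset` used in the C code later in the thread, but the struct definition and the `make_gdt_ptr` helper are assumptions, since the thread never shows them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 6-byte LGDT memory operand in 32-bit mode: 16-bit limit, then 32-bit base.
 * Packing is required, otherwise the compiler pads after `size`. */
struct gdt_ptr {
    uint16_t size;    /* limit: size of the table in bytes, minus 1 */
    uint32_t offset;  /* base: LINEAR address of the first GDT entry */
} __attribute__((packed));

static_assert(sizeof(struct gdt_ptr) == 6, "LGDT operand must be 6 bytes");
static_assert(offsetof(struct gdt_ptr, offset) == 2, "base follows the limit");

/* Build the operand for a table of `entry_count` 8-byte descriptors. */
static struct gdt_ptr make_gdt_ptr(uint32_t table_linear_addr,
                                   uint16_t entry_count) {
    assert(entry_count >= 1 && entry_count <= 8192);

    struct gdt_ptr p;
    p.size = (uint16_t)(entry_count * 8u - 1u);
    p.offset = table_linear_addr;
    return p;
}
```

For a five-entry table this gives a limit of 39, which matches the `limit=39` in the Bochs dump from the first post. The address of the struct itself is what `lgdt` receives as a logical address; only the `offset` field inside it ends up in the GDTR.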
LukeyTheKid wrote:On a side note - I guess for GRUB itself, it starts off in real mode so addresses are just physical, and it doesn't have to worry about any of this?
Real mode has segmentation all the time, just like protected mode.

Re: Clarifying higher half kernel approach & the GDT trick

Posted: Sun Jun 21, 2020 11:17 am
by LukeyTheKid
If this is true, then it seems like once we're in protected mode, we have a bit of a chicken-and-egg problem with the GDT - any address we refer to will have that implicit selector, so if the corresponding base is non-zero, then what we're passing to LGDT is not actually a linear address but a logical one. So how do we provide a linear address if we're working with logical addresses? Even if we DO have a zero-base in the relevant descriptor, we're still technically providing a logical address (which conveniently happens to be translated into an identical physical address).

Following that train of thought, I'm guessing the answer is something along these lines: Yes, the GDT Register needs a linear address. However, the "Load GDT" instruction takes a logical address, and this operation actually performs the step of transforming the given logical address into a linear address, which is loaded into the GDT Register.
After doing some more reading, I think I see where I got confused - I was thinking in terms of the address of the GDT descriptor passed as argument to the lgdt instruction, not the address of the GDT itself.
So when I call lgdt, the address of the descriptor is a logical address, but the value of the offset contained in that 6-byte structure (aka the address of the GDT) is still a linear address, which is what's important.

So in my C code, I set up my gdt_entries table, and create the descriptor, which contains the linear address of the GDT:

Code:

static void init_gdt(){
  gdt_ptr.size = (sizeof(gdt_entry_t) * 5) - 1;
  gdt_ptr.offset = (uint32_t)&gdt_entries[0];

  // Add entries to the table
  gdt_set_gate(0, 0x0, 0x0, 0x0, 0x0);             // Null entry
  gdt_set_gate(1, 0x0, 0xFFFFFFFF, 0x9A, 0xCF);  // Kernel Code
  gdt_set_gate(2, 0x0, 0xFFFFFFFF, 0x92, 0xCF);  // Kernel data
  gdt_set_gate(3, 0x0, 0xFFFFFFFF, 0xFA, 0xCF);  // User space code
  gdt_set_gate(4, 0x0, 0xFFFFFFFF, 0xF2, 0xCF);  // User space data

  gdt_flush((uint32_t)&gdt_ptr);
}
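The thread does not show `gdt_set_gate` itself. A rough sketch of the descriptor packing such a function typically performs is given below; `struct gdt_entry` and `encode_gdt_entry` are hypothetical names, the argument order mirrors the calls above minus the table index, and the packed attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-byte segment descriptor, split the way the CPU expects it. */
struct gdt_entry {
    uint16_t limit_low;    /* limit bits 15..0 */
    uint16_t base_low;     /* base bits 15..0 */
    uint8_t  base_middle;  /* base bits 23..16 */
    uint8_t  access;       /* present bit, DPL, segment type */
    uint8_t  granularity;  /* flags in the high nibble, limit bits 19..16 low */
    uint8_t  base_high;    /* base bits 31..24 */
} __attribute__((packed));

static_assert(sizeof(struct gdt_entry) == 8, "descriptors are 8 bytes");

/* Scatter base and limit across the descriptor fields. Only the low 20 bits
 * of `limit` fit; with the granularity flag set (0xC0 in `gran`) a limit of
 * 0xFFFFF pages covers the full 4 GiB. */
static struct gdt_entry encode_gdt_entry(uint32_t base, uint32_t limit,
                                         uint8_t access, uint8_t gran) {
    struct gdt_entry e;
    e.limit_low   = (uint16_t)(limit & 0xFFFFu);
    e.base_low    = (uint16_t)(base & 0xFFFFu);
    e.base_middle = (uint8_t)((base >> 16) & 0xFFu);
    e.access      = access;
    e.granularity = (uint8_t)(((limit >> 16) & 0x0Fu) | (gran & 0xF0u));
    e.base_high   = (uint8_t)((base >> 24) & 0xFFu);
    return e;
}
```

Under this encoding the kernel code entry above (base 0, access 0x9A, gran 0xCF) has all three base fields zero, whereas a GDT-trick segment with base 0x40000000 would carry 0x40 in `base_high`.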

When I call lgdt, it's interpreting the value I pass it as a logical address, but that doesn't matter, because the value of the offset of the descriptor is still linear.

Code:

gdt_flush:
	# Disable Interrupts
	cli

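	# Bochs magic breakpoint (harmless no-op outside the Bochs debugger)
	xchgw %bx, %bx

	# Load the GDT: the first argument is the address of the 6-byte descriptor
	movl 4(%esp), %eax
	lgdt (%eax)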

	# Re-enable interrupts
	#sti

	# Load Data Segment Registers
	movw $0x10, %ax
	movw %ax, %ds
	movw %ax, %es
	movw %ax, %fs
	movw %ax, %gs
	movw %ax, %ss

	# Longjump to load code register
	jmp $0x08,$.flush
.flush:
	ret
In that case, I don't even know how to pass a logical address as the GDT address. Maybe something like:

Code:

# I don't know the exact syntax, but this is the general idea

gdt_ptr:
.long $0x8, $gdt_entries
.short gdt_size
...
...
lgdt gdt_ptr

Re: Clarifying higher half kernel approach & the GDT trick

Posted: Sun Jun 21, 2020 11:23 am
by Octocontrabass
LukeyTheKid wrote:In that case, I don't even know how to pass a logical address as the GDT address.
You have to convert it to a linear address by adding the appropriate segment's base address.

Re: Clarifying higher half kernel approach & the GDT trick

Posted: Sun Jun 21, 2020 11:26 am
by LukeyTheKid
@Octocontrabass Just saw your reply, was still editing my response :) Thanks for the confirmation.
Real mode has segmentation all the time, just like protected mode.
Ah you're right, it does, just using absolute segment bases. I was just thinking in terms of it operating independently of the GDT, which is not set up at that point.