Long Mode Paging [TLB Will Not Flush/Need Hint]

Posted: Sun Sep 27, 2009 5:57 pm
by dosfan
Hi all,

I've been banging my head against a wall with this one. Hopefully someone will point out the obvious thing I'm missing.

Before entering long mode I identity map the first 4MB using 2x2MB pages. I also map the same PDPT high in the PML4 for the higher half. This works.

Once my kernel is running C code in the higher half, I parse GRUB's memory map and attempt to map the rest of the RAM, removing the 4MB of identity mappings at the same time. This bit doesn't work. By "doesn't work" I mean I can't seem to get the TLB to flush. No, I'm not using global pages, and I've even tried turning global pages off in CR4. Neither reloading CR3 nor invlpg seems to shift the mappings!! WTH? I've only tested in Bochs and QEMU.

Below is some code just to illustrate what I'm trying. Please, someone give me a "one liner" and point out how stupid I'm being. More code upon request.

Code:

extern volatile uint64 *_boot_pml4;
extern volatile uint64 *_boot_pdpt;
extern volatile uint64 *_boot_pd0;

Code:

	address = 0;
	
	//Map the first 1GB with 2MB pages 
	for(i = 0 ; i < 512 ; i++)
	{
		_boot_pd0[i] = address + 0x87;

		address += (2*1024*1024);
	}

	
	/* Remove the identity mapping used when we first enabled long mode */
	_boot_pdpt[0] = 0;
	_boot_pml4[0] = 0;


	/* reload CR3, flush TLB */
	__asm__ volatile ("movq %0, %%cr3" :: "r" ((uint64)&_boot_pml4-KERNEL_VMA): "memory");

	//amd64_invlpg(0x1ffff);
	//amd64_invlpg(0x20000);
//	__asm__ volatile("movq %cr3, %rax");
//	__asm__ volatile("movq %rax, %cr3");
//	__asm__("jmp 1f\n1:");

	uint8 *p = (uint8*)0x1;
	*p = 1; // I would expect this to fault
OT: ACPI really is a big pile of smouldering doo.
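
For reference, the 0x87 added to each entry in the loop above is a set of flag bits. The sketch below spells them out with named constants. The macro and function names are illustrative only and are not part of the original code, and it uses the standard uint64_t in place of the poster's uint64 typedef.

```c
#include <stdint.h>

/* Flag bits of a page-directory entry in long mode. */
#define PDE_PRESENT  (1ULL << 0)   /* 0x01: entry is valid */
#define PDE_WRITABLE (1ULL << 1)   /* 0x02: writes allowed */
#define PDE_USER     (1ULL << 2)   /* 0x04: accessible from ring 3 */
#define PDE_PS       (1ULL << 7)   /* 0x80: entry maps a 2 MiB page */
#define PAGE_2M      (2ULL * 1024 * 1024)

/* Build a PDE that maps one 2 MiB page at phys_base (hypothetical helper).
   The base is rounded down to a 2 MiB boundary. */
uint64_t make_pde_2m(uint64_t phys_base, uint64_t flags)
{
    return (phys_base & ~(PAGE_2M - 1)) | flags | PDE_PS;
}

/* Fill one page directory so it maps the first 1 GiB with 2 MiB pages,
   mirroring the loop in the post. */
void fill_pd_first_gib(uint64_t pd[512], uint64_t flags)
{
    for (unsigned i = 0; i < 512; i++)
        pd[i] = make_pde_2m((uint64_t)i * PAGE_2M, flags);
}
```

With PDE_PRESENT | PDE_WRITABLE | PDE_USER the entries come out with the same 0x87 low bits as in the post. Note that this includes the user bit, so the pages are also reachable from ring 3; leaving PDE_USER out gives 0x83 instead.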

Re: Long Mode Paging [TLB Will Not Flush/Need Hint]

Posted: Mon Sep 28, 2009 12:26 am
by Brendan
Hi,

I'd assume that at least one of the addresses you're using (e.g _boot_pd0, _boot_pdpt, etc) isn't correct, and your code isn't modifying the paging structures because of this (and the TLB invalidation doesn't seem to work because the old paging structures are the same as the new paging structures).

Probably the best way to figure out if the paging structures are being updated correctly is to use something like the Bochs debugger. For example, stop the emulator and see what is in CR3, then examine the physical memory at CR3 to see what the PML4 contains, then find the physical address of the PDPT from the PML4 and see what the PDPT contains, etc.
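
The manual walk described above (CR3, then PML4, then PDPT, then PD) can be sketched in plain C against a small simulated "physical memory", which makes it easy to try in user space. Everything here is an assumption made for illustration: the sim_* names, the three tables placed at fake physical addresses 0x1000, 0x2000 and 0x3000, and the restriction to 2 MiB pages. It is not the poster's code.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define PTE_PRESENT   0x001ULL               /* bit 0: entry is valid */
#define PTE_PS        0x080ULL               /* bit 7: large page */
#define PTE_ADDR_MASK 0x000FFFFFFFFFF000ULL  /* bits 51:12: physical address */
#define PAGE_2M_MASK  0x1FFFFFULL            /* offset inside a 2 MiB page */

/* Each paging level is indexed by 9 bits of the virtual address. */
unsigned pml4_index(uint64_t va) { return (unsigned)((va >> 39) & 0x1FF); }
unsigned pdpt_index(uint64_t va) { return (unsigned)((va >> 30) & 0x1FF); }
unsigned pd_index(uint64_t va)   { return (unsigned)((va >> 21) & 0x1FF); }

/* Simulated physical memory: PML4 at 0x1000, PDPT at 0x2000, PD at 0x3000. */
uint64_t sim_tables[3][512];

const uint64_t *sim_table_at(uint64_t phys)
{
    uint64_t frame = phys >> 12;
    if (frame < 1 || frame > 3)
        return NULL;                 /* not one of the simulated tables */
    return sim_tables[frame - 1];
}

/* Walk PML4 -> PDPT -> PD. Returns 1 and stores the physical address in *pa
   if va is mapped by a 2 MiB page, 0 otherwise (1 GiB pages are not handled). */
int walk_2m(uint64_t cr3, uint64_t va, uint64_t *pa)
{
    const uint64_t *table = sim_table_at(cr3 & PTE_ADDR_MASK);
    uint64_t entry;

    if (!table) return 0;
    entry = table[pml4_index(va)];
    if (!(entry & PTE_PRESENT)) return 0;

    table = sim_table_at(entry & PTE_ADDR_MASK);
    if (!table) return 0;
    entry = table[pdpt_index(va)];
    if (!(entry & PTE_PRESENT) || (entry & PTE_PS)) return 0;

    table = sim_table_at(entry & PTE_ADDR_MASK);
    if (!table) return 0;
    entry = table[pd_index(va)];
    if (!(entry & PTE_PRESENT) || !(entry & PTE_PS)) return 0;

    *pa = (entry & PTE_ADDR_MASK & ~PAGE_2M_MASK) | (va & PAGE_2M_MASK);
    return 1;
}

/* Build an identity mapping of the first 1 GiB using 2 MiB pages. */
void sim_build(void)
{
    memset(sim_tables, 0, sizeof sim_tables);
    sim_tables[0][0] = 0x2000ULL | 0x3;            /* PML4[0] -> PDPT */
    sim_tables[1][0] = 0x3000ULL | 0x3;            /* PDPT[0] -> PD   */
    for (unsigned i = 0; i < 512; i++)
        sim_tables[2][i] = ((uint64_t)i << 21) | 0x87;
}
```

After sim_build(), walk_2m(0x1000, va, &pa) succeeds for low addresses. Once sim_tables[1][0] is cleared it fails, which is the behaviour the first post expected from the real tables. If a walk of the real structures still succeeds after the "unmap", the writes never reached the tables that CR3 points at.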

This doesn't have anything to do with your current problem; but be aware that INVLPG may or may not work as expected when the page you're invalidating is the 2 MiB page at (physical address) 0x00000000. To improve performance, modern CPUs use the TLBs to find the "caching type" for the corresponding page (so when you access a virtual address the CPU looks up the TLB entry to find the physical address and the caching type, and doesn't need to check the PAT, MTRRs, etc). The area from 0x00000000 to 0x000FFFFF contains lots of different things (e.g. RAM, display memory, ROM, etc) and uses several different "caching types" (e.g. write-back, uncacheable, write-protected); and the CPU will/should automatically split this 2 MiB page into 512 4 KiB pages (so instead of using one TLB entry for a large page with unknown/mixed caching, you end up using 512 TLB entries for small pages with the correct caching for each page).

Because of this, for this physical page some CPUs might only invalidate 4 KiB and might not invalidate the entire 2 MiB. I'd also be worried about CPUs that don't automatically split the 2 MiB page up, and assume that the entire area is uncacheable or write-back (for example) when it's not (which can cause problems like poor performance, and maybe cache-coherency problems in multi-CPU systems). In general, using 2 MiB pages is a bad idea unless the entire 2 MiB uses the same caching (e.g. it's all "write-back" or it's all "uncacheable", and doesn't contain a mixture).
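
One way to act on this advice is to decide per 2 MiB slot whether a large page is safe, and fall back to 4 KiB pages otherwise. The sketch below is only a rough approximation under stated assumptions: it treats "lies completely inside one usable RAM region and is not the first 2 MiB" as a stand-in for "uniform caching type", whereas the real caching types come from the MTRRs and PAT. The struct mem_region layout and the function name are hypothetical and are not GRUB's actual memory map format.

```c
#include <stdint.h>
#include <stddef.h>

#define LARGE_PAGE_SIZE (2ULL * 1024 * 1024)

/* Hypothetical, simplified memory map entry. */
struct mem_region {
    uint64_t base;     /* physical start address */
    uint64_t length;   /* size in bytes */
    int      usable;   /* non-zero for ordinary RAM */
};

/* Returns 1 if the 2 MiB page starting at page_base can reasonably be mapped
   as one large page, 0 if 4 KiB pages would be the safer choice. */
int can_use_2m_page(uint64_t page_base,
                    const struct mem_region *map, size_t count)
{
    if (page_base % LARGE_PAGE_SIZE != 0)
        return 0;                       /* large pages must be 2 MiB aligned */
    if (page_base < LARGE_PAGE_SIZE)
        return 0;                       /* first 2 MiB holds the mixed legacy area */

    for (size_t i = 0; i < count; i++) {
        uint64_t end = map[i].base + map[i].length;
        if (map[i].usable &&
            map[i].base <= page_base &&
            page_base + LARGE_PAGE_SIZE <= end)
            return 1;                   /* whole page sits inside one RAM region */
    }
    return 0;
}
```

A mapping loop could call this for every 2 MiB step and only set the PS bit when it returns 1.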


Cheers,

Brendan

Re: Long Mode Paging [TLB Will Not Flush/Need Hint]

Posted: Mon Sep 28, 2009 3:31 am
by dosfan
Thanks for the advice Brendan, I went back and dumped all my symbol addresses and the contents of the PML4 and PDPT.

Strangely the PML4 and PDPT looked zeroed! They must have been semi-sensible when I entered long mode, so I started moving things around in case of corruption.

Turns out it was my external symbol declarations. As you said I was never actually modifying the PML4 :oops:

Code:

extern volatile uint64 *_boot_pml4;
extern volatile uint64 *_boot_pdpt;
extern volatile uint64 *_boot_pd0;
Became

Code:

extern volatile uint64 *_boot_pml4[];
extern volatile uint64 *_boot_pdpt[];
extern volatile uint64 *_boot_pd0[];


Re: Long Mode Paging [TLB Will Not Flush/Need Hint]

Posted: Mon Sep 28, 2009 3:43 am
by AJ
Hi,

Code:

__asm__ volatile ("movq %0, %%cr3" :: "r" ((uint64)&_boot_pml4-KERNEL_VMA): "memory");
Did you mean to dereference _boot_pml4? In effect, you are loading CR3 with the first PML4 entry instead of a pointer to the PML4.

Code:

extern volatile uint64 *_boot_pml4[];
extern volatile uint64 *_boot_pdpt[];
extern volatile uint64 *_boot_pd0[];
That looks a bit dodgy to me - in effect you are declaring "uint64 **", which may be why the dereferencing above is now working.

Cheers,
Adam

Re: Long Mode Paging [TLB Will Not Flush/Need Hint]

Posted: Mon Sep 28, 2009 4:36 am
by dosfan
Hi
AJ wrote: Did you mean to dereference _boot_pml4? In effect, you are loading CR3 with the first PML4 entry instead of a pointer to the PML4.

No -- Doh. I declared the _boot_xxxx symbols as pointers, which is why I had to dereference it to "make it work". Below is in fact what I intended!

AJ, spot on.

Code:

extern volatile uint64 _boot_pml4[];
extern volatile uint64 _boot_pdpt[];
extern volatile uint64 _boot_pd0[];
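
Why the pointer and array declarations behave differently for a label defined in assembly can be reproduced in user space. In the sketch below, table stands in for one of the _boot_* labels; the two function names are made up for this illustration, and it assumes a little-endian 64-bit target such as x86-64.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a page table whose label is defined outside C. */
uint64_t table[512];

/* What "extern uint64_t table[];" gives you: the address of the label
   itself, so table[i] really is entry i of the table. */
uintptr_t address_seen_as_array(void)
{
    return (uintptr_t)table;
}

/* What "extern uint64_t *table;" gives you: the compiler loads the first
   pointer-sized bytes stored at the label and treats that value as the
   pointer, so table[i] indexes through whatever the first entry holds. */
uintptr_t address_seen_as_pointer(void)
{
    uintptr_t value;
    memcpy(&value, table, sizeof value);
    return value;
}
```

With the pointer form, a store such as _boot_pml4[0] = 0 goes to the address held in the first PML4 entry (or through a null pointer if that entry is zero), not to the PML4 itself, while &_boot_pml4 still evaluates to the label's address. With the array form both expressions refer to the table.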

Code:

__asm__ volatile ("movq %0, %%cr3" :: "r" (((uint64)_boot_pml4)-KERNEL_VMA): "memory");
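
The subtraction of KERNEL_VMA in the line above converts the table's higher-half virtual address into the physical address that CR3 needs. A minimal sketch of that conversion follows. The value chosen for KERNEL_VMA is an assumption (the thread never states it), and the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed higher-half base; the real value comes from the linker script. */
#define KERNEL_VMA 0xFFFFFFFF80000000ULL

/* Kernel virtual address -> physical address, valid only for memory that is
   mapped at a fixed offset of KERNEL_VMA. */
uint64_t virt_to_phys(uint64_t virt)
{
    return virt - KERNEL_VMA;
}

/* Physical address -> kernel virtual address in the same fixed mapping. */
uint64_t phys_to_virt(uint64_t phys)
{
    return phys + KERNEL_VMA;
}

/* A PML4 must be 4 KiB aligned before its physical address goes into CR3. */
int is_valid_cr3_base(uint64_t phys)
{
    return (phys & 0xFFFULL) == 0;
}
```

In the kernel this would be used as virt_to_phys((uint64_t)_boot_pml4), with the result checked by is_valid_cr3_base() before the mov to CR3.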

Re: Long Mode Paging [TLB Will Not Flush/Need Hint]

Posted: Mon Sep 28, 2009 4:50 am
by AJ
Hi,

When you're reviewing your own code, that's quite an easy one to miss - sometimes you just need another pair of eyes! Glad that sorted it :)

Cheers,
Adam