Hi,
torshie wrote: I finally get your "self reference" trick.
Hmm - it's not really "my" trick. I first found out about it about 15 years ago, but I wouldn't be surprised if Intel planned the paging structures to allow this sort of thing back when they designed the 80386.
I should also provide some warnings though.
The first thing to be careful of is TLB invalidation. If you change a page directory entry, page directory pointer table entry or PML4 entry, then you need to invalidate anything in the area you've changed (like you normally would/should), but you *also* need to invalidate the corresponding area in the "master map of everything". In most of these cases it's usually faster to flush the entire TLB anyway. For example, if you change a page directory entry then you'd need to do "INVLPG" up to 512 times in the address space plus once in the mapping (or flush the entire TLB), and if you change a page directory pointer table entry then you'd need to do "INVLPG" up to 262144 times in the address space plus 513 times in the mapping (512 page tables plus the page directory itself), or flush the entire TLB.
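To make the "plus once in the mapping" part concrete, here's a rough sketch of how that extra address could be worked out. It assumes 4-level paging with the self-reference in PML4 slot 511; RECURSIVE_SLOT and all of the function names are made up for this example, and the actual "INVLPG" is left out so the arithmetic can be checked outside a kernel.

```c
#include <stdint.h>

/* Assumption: the PML4 entry that points back at the PML4 is slot 511. */
#define RECURSIVE_SLOT 511ULL

/* Base of the region where every page table shows up as a 4 KiB page. */
static uint64_t pt_region_base(void)
{
    uint64_t base = RECURSIVE_SLOT << 39;
    /* Sign-extend bit 47 to form a canonical address. */
    if (base & (1ULL << 47))
        base |= 0xFFFF000000000000ULL;
    return base;
}

/* Address, inside the mapping, of the page table that covers vaddr.
   This is the extra page to invalidate after changing vaddr's
   page directory entry. */
static uint64_t pt_page_in_mapping(uint64_t vaddr)
{
    uint64_t page_number = (vaddr & 0x0000FFFFFFFFFFFFULL) >> 12;
    uint64_t pte_addr = pt_region_base() + page_number * 8;
    return pte_addr & ~0xFFFULL;
}

/* First address of the 2 MiB area controlled by vaddr's page directory entry. */
static uint64_t pde_area_start(uint64_t vaddr)
{
    return vaddr & ~0x1FFFFFULL;
}
```

After changing the page directory entry for some address you'd "INVLPG" each 4 KiB page of the 2 MiB area starting at pde_area_start(), then "INVLPG" the one address returned by pt_page_in_mapping().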
The next thing to consider is the "accessed" flags. For something like a page directory entry there's one "accessed" flag in RAM, plus up to 2 copies of that "accessed" flag in TLB entries (a normal TLB entry plus a TLB entry for the mapping); and it's the operating system's responsibility to ensure that the "accessed" flags don't become out-of-sync, or to ensure that out-of-sync "accessed" flags don't cause problems for the OS (mainly for the code that decides if a page should/shouldn't be sent to swap space). If you only look at the "accessed" flag in page table entries for 4 KiB pages (and in page directory entries for 2 MiB pages) then you should be able to ignore this problem; but if you check the "accessed" flag in page directory entries, page directory pointer table entries or PML4 entries for any reason, then you may need to be careful.
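One way to stay on the safe side is to only ever consult the "accessed" flag on entries that actually map a page. A minimal sketch, assuming 4 KiB and 2 MiB pages only; the level numbering and the function names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

#define ENTRY_PRESENT  (1ULL << 0)
#define ENTRY_ACCESSED (1ULL << 5)
#define ENTRY_LARGE    (1ULL << 7)   /* PS bit in a page directory entry */

/* Hypothetical level numbering: 1 = page table, 2 = page directory,
   3 = page directory pointer table, 4 = PML4. */
static bool entry_maps_page(uint64_t entry, int level)
{
    if (!(entry & ENTRY_PRESENT))
        return false;
    if (level == 1)
        return true;                      /* 4 KiB page */
    return level == 2 && (entry & ENTRY_LARGE) != 0;   /* 2 MiB page */
}

/* Only trust the "accessed" flag on entries that map a page; entries
   that point to another paging structure are deliberately ignored. */
static bool page_was_accessed(uint64_t entry, int level)
{
    return entry_maps_page(entry, level) && (entry & ENTRY_ACCESSED) != 0;
}
```

With this, swap code never looks at the "accessed" flag of an entry that is also visible through the mapping as a page table entry.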
Finally, for a 2 MiB page the page directory entry has bit 7 set (to indicate that it is a "large" page) and bit 12 contains the highest PAT bit; but when the CPU interprets this as a page table entry (in the "master map of everything") then bit 7 becomes the highest PAT bit and bit 12 becomes part of the physical address of a page. This makes a mess. For example, imagine you've programmed the PAT like this:
PAT value   Cache type
0           Write-back (default, for compatibility)
1           Write-through (default, for compatibility)
2           Uncached (default, for compatibility)
3           Uncached (default, for compatibility)
4           Write-combining (modified by the OS and used by device drivers)
5           Write-through (default, for compatibility)
6           Uncached (default, for compatibility)
7           Uncached (default, for compatibility)
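For reference, a PAT like the one in this table could be packed into a value for the IA32_PAT MSR (index 0x277) roughly as follows. This is only a sketch: it assumes entries 2 and 6 are the power-on default "UC-" type and entries 3 and 7 are plain "UC", the function names are made up, and the WRMSR itself (which needs ring 0) is left out.

```c
#include <stdint.h>

/* Memory type encodings used in the IA32_PAT MSR. */
enum {
    PAT_UC       = 0x00,   /* uncacheable */
    PAT_WC       = 0x01,   /* write-combining */
    PAT_WT       = 0x04,   /* write-through */
    PAT_WB       = 0x06,   /* write-back */
    PAT_UC_MINUS = 0x07    /* uncached, can be overridden by MTRRs */
};

/* Each PAT entry occupies one byte of the MSR, entry 0 in the lowest byte. */
static uint64_t build_pat(const uint8_t types[8])
{
    uint64_t value = 0;
    for (int i = 0; i < 8; i++)
        value |= (uint64_t)types[i] << (8 * i);
    return value;
}

/* The table above: defaults everywhere except entry 4. */
static uint64_t example_pat(void)
{
    static const uint8_t types[8] = {
        PAT_WB, PAT_WT, PAT_UC_MINUS, PAT_UC,
        PAT_WC, PAT_WT, PAT_UC_MINUS, PAT_UC
    };
    return build_pat(types);
}
```

The only difference from the power-on value 0x0007040600070406 is the byte for entry 4, which changes from write-back to write-combining.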
Also imagine that you map some normal RAM into an address space using a 2 MiB page, and you set the PAT so that the 2 MiB page uses "write-back" caching (as you would for all normal RAM). If the 2 MiB page is at physical address 0x123000000, then you create a page directory entry that contains 0x8000000123000087 (or, "not executable, physical address = 0x0000000123000000, large page, PAT = 0 = write-back, read/write, user, present"), and when the CPU interprets this as a page table entry (in the "master map of everything") the CPU reads it as "not executable, physical address = 0x0000000123000000, PAT = 4 = write-combining, read/write, user, present". If the 2 MiB page used PAT values 4 to 7 instead, then bit 12 would be set and the CPU would also read the physical address as 0x0000000123001000. Assuming that your kernel only uses the "master map of everything" to access paging structures, then it shouldn't access this messed up page anyway, and it should be OK (but if your kernel accidentally does write to the messed up page then good luck trying to debug what happened).
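The two readings of the same 64-bit entry can be spelled out as a small decoder. This is only a sketch with made-up function names, and the entry values mentioned afterwards are chosen for this example; the PAT index is built from the PAT, PCD and PWT bits in that order.

```c
#include <stdint.h>

/* Read the entry as a page directory entry for a 2 MiB page:
   address in bits 51..21, PAT in bit 12, PCD in bit 4, PWT in bit 3. */
static uint64_t phys_as_2m_pde(uint64_t e)
{
    return e & 0x000FFFFFFFE00000ULL;
}

static unsigned pat_as_2m_pde(uint64_t e)
{
    return (unsigned)((((e >> 12) & 1) << 2) |
                      (((e >> 4) & 1) << 1) |
                      ((e >> 3) & 1));
}

/* Read the same entry as a page table entry for a 4 KiB page:
   address in bits 51..12, PAT in bit 7, PCD in bit 4, PWT in bit 3. */
static uint64_t phys_as_pte(uint64_t e)
{
    return e & 0x000FFFFFFFFFF000ULL;
}

static unsigned pat_as_pte(uint64_t e)
{
    return (unsigned)((((e >> 7) & 1) << 2) |
                      (((e >> 4) & 1) << 1) |
                      ((e >> 3) & 1));
}
```

For an entry of 0x8000000123000087 the large page reading gives PAT value 0 at 0x123000000, while the page table entry reading gives PAT value 4 at the same address. With bit 12 also set (0x8000000123001087) the page table entry reading moves to 0x123001000. Because bit 7 is always set for a large page, the page table entry reading always lands on PAT values 4 to 7.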
Cheers,
Brendan