Two ways of flushing TLB in IA32 and two different behaviors

Re: Two ways of flushing TLB in IA32 and two different behav

Post by thepowersgang »

According to the Intel Manuals, vol 3A "Paging->Caching Translation Information" (4.10.4.1 in the June 2010 edition), a write to CR3 will cause a TLB flush unless PCIDs are in use (CR4.PCIDE=1).

Re: Two ways of flushing TLB in IA32 and two different behav

Post by Brendan »

Hi,
sv75 wrote:
giovanig wrote:
ASMV("movl %cr3,%eax");
ASMV("movl %eax,%cr3");
So, it looks like writing cr3 to itself does not flush TLB at all.
It was stated explicitly in i386 manual according to http://www.rhinocerus.net/forum/lang-as ... h-tlb.html.
Of course the 80386 didn't support global pages (or the "invlpg" instruction), so the 80386 manual just says "reload CR3" and nothing else.
sv75 wrote: Any idea how to flush the entire TLB after existing PD have been changed?
Reloading CR3 does flush all TLB entries, except for entries marked as "global". If you're changing (rather than reloading) CR3 this is exactly what you want - the global pages are the same in all virtual address spaces (e.g. only used for kernel space) and therefore don't need to be flushed when you switch from one virtual address space to another.

For a more complete summary (ignoring PCID, which is a relatively new thing), there are 5 cases:
  • You need to invalidate a small number of pages, and the CPU is 80486 or later: Use INVLPG to avoid the performance cost of flushing TLB entries for no reason. For this case it doesn't matter if the pages are kernel space or user space.
  • You need to invalidate a small number of pages, and the CPU is 80386: Reload CR3 because you have no choice. For this case it doesn't matter if the pages are kernel space or user space.
  • You need to invalidate a large number of pages in user space, or switch from one virtual address space to another (e.g. task switch): Reload CR3 because none of those pages should be global. For this case it doesn't matter if the CPU supports global pages or not.
  • You need to invalidate a large number of pages in kernel space, and the CPU supports global pages (CR4.PGE, P6 family or later): Clear the PGE flag in CR4, then set it again. This will flush all TLB entries (including global entries).
  • You need to invalidate a large number of pages in kernel space, and the CPU doesn't support global pages (80386, 80486 and anything else without CR4.PGE): Reload CR3. With no global entries in the TLB, that flushes everything.
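
As a rough illustration of the cases above, here is a minimal sketch in C with GCC inline assembly. Every name in it (tlb_choose_flush, struct tlb_cpu_caps, TLB_SMALL_RANGE_PAGES and so on) is made up for this example, and the cut-off of 32 pages between "small" and "large" is only an assumption that would need measuring on real hardware. It also assumes the kernel has already worked out whether INVLPG and global pages are available, and it keys the CR4 toggle on a "has_pge" flag rather than on the CPU generation. Unlike the two separate ASMV() lines quoted earlier, the CR3 value is passed between asm statements through a C variable, so nothing depends on %eax surviving in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; they are not from any real kernel. */
enum tlb_flush_method {
    TLB_FLUSH_INVLPG,      /* invalidate page by page */
    TLB_FLUSH_RELOAD_CR3,  /* drops every non-global entry */
    TLB_FLUSH_TOGGLE_PGE   /* drops every entry, global ones included */
};

struct tlb_cpu_caps {
    bool has_invlpg;  /* 80486 or later */
    bool has_pge;     /* global pages supported and enabled in CR4 */
};

/* Assumed boundary between "a few pages" and "a lot of pages". */
#define TLB_SMALL_RANGE_PAGES 32u

/* Pure decision logic, so it can be checked in an ordinary hosted build. */
static enum tlb_flush_method
tlb_choose_flush(const struct tlb_cpu_caps *caps, unsigned page_count,
                 bool kernel_space)
{
    assert(caps);
    if (caps->has_invlpg && page_count <= TLB_SMALL_RANGE_PAGES)
        return TLB_FLUSH_INVLPG;
    if (kernel_space && caps->has_pge)
        return TLB_FLUSH_TOGGLE_PGE;
    return TLB_FLUSH_RELOAD_CR3;  /* user space, or no other choice */
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Ring 0 only: each of these faults if executed from user mode. */
static inline void tlb_invlpg(const void *vaddr)
{
    __asm__ volatile("invlpg (%0)" : : "r"(vaddr) : "memory");
}

static inline void tlb_reload_cr3(void)
{
    unsigned long cr3;
    __asm__ volatile("mov %%cr3, %0" : "=r"(cr3) : : "memory");
    __asm__ volatile("mov %0, %%cr3" : : "r"(cr3) : "memory");
}

/* Caller is assumed to have interrupts disabled, so that nothing else
 * modifies CR4 between the read and the two writes. */
static inline void tlb_toggle_pge(void)
{
    unsigned long cr4;
    __asm__ volatile("mov %%cr4, %0" : "=r"(cr4));
    __asm__ volatile("mov %0, %%cr4" : : "r"(cr4 & ~(1UL << 7)) : "memory"); /* PGE = bit 7 */
    __asm__ volatile("mov %0, %%cr4" : : "r"(cr4) : "memory");
}
#endif
```

A caller would switch on the result of tlb_choose_flush() and then either loop over the range with tlb_invlpg(), or call tlb_reload_cr3() or tlb_toggle_pge() once.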
thepowersgang wrote: According to the Intel Manuals, vol 3A "Paging->Caching Translation Information" (4.10.4.1 in the June 2010 edition), a write to CR3 will cause a TLB flush unless PCIDs are in use (CR4.PCIDE=1)
PCID complicates things more (the TLB can then hold entries for several virtual address spaces at once, so you need to add a layer of "ID management"). CPUs that also support the "INVPCID" instruction can use it to invalidate a single address for one PCID, all non-global entries for one PCID, all entries for every PCID (including global entries), or all non-global entries for every PCID.
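
To make the PCID side a bit more concrete, here is a small sketch (C with GCC inline assembly; all names are hypothetical). It assumes the layout described in the Intel manuals: with CR4.PCIDE=1 the low 12 bits of CR3 hold the PCID and bit 63 of the value written to CR3 asks the CPU not to invalidate the entries for that PCID, and INVPCID takes a type in a register plus a 128-bit descriptor in memory (PCID in bits 11:0, linear address in bits 127:64). Treat it as an outline to check against the manual, not as tested kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* INVPCID operation types, numbered as in the Intel manuals. */
enum invpcid_type {
    INVPCID_INDIVIDUAL_ADDRESS = 0, /* one address, one PCID */
    INVPCID_SINGLE_CONTEXT     = 1, /* all non-global entries of one PCID */
    INVPCID_ALL_INCL_GLOBAL    = 2, /* everything, global entries included */
    INVPCID_ALL_NON_GLOBAL     = 3  /* every PCID, global entries kept */
};

/* 128-bit descriptor; bits 63:12 of the first quadword must be zero. */
struct invpcid_desc {
    uint64_t pcid;
    uint64_t linear_addr;
};

static struct invpcid_desc invpcid_make_desc(uint16_t pcid, uint64_t linear_addr)
{
    assert(pcid < 4096);  /* PCIDs are 12 bits wide */
    struct invpcid_desc d = { pcid, linear_addr };
    return d;
}

/* Build a CR3 value for a PCID-enabled address space switch.
 * keep_tlb sets bit 63, asking the CPU not to invalidate this PCID. */
static uint64_t cr3_with_pcid(uint64_t table_phys, uint16_t pcid, bool keep_tlb)
{
    return (table_phys & ~0xFFFULL) | (uint64_t)(pcid & 0xFFFu)
         | (keep_tlb ? (1ULL << 63) : 0);
}

#if defined(__GNUC__) && defined(__x86_64__)
/* Ring 0 only, and only on CPUs that report INVPCID support in CPUID. */
static inline void invpcid(enum invpcid_type type, uint16_t pcid, uint64_t addr)
{
    struct invpcid_desc d = invpcid_make_desc(pcid, addr);
    __asm__ volatile("invpcid %0, %1"
                     : : "m"(d), "r"((uint64_t)type) : "memory");
}
#endif
```

For example, invpcid(INVPCID_SINGLE_CONTEXT, 5, 0) would drop the non-global entries tagged with PCID 5, which is roughly what a CR3 reload does for the current address space when PCIDs are off.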

I'm still a little unsure of PCID - I haven't messed with it yet (and for modern multi-CPU systems the additional "TLB shootdown" overhead it'd cause worries me).


Cheers,

Brendan