OSDev.org

Posted: **Thu Jul 28, 2011 6:13 am**

This is in reference to Intel's Software Developer’s Manual (Order Number: 325384-039US, May 2011). Section 4.10.4.4, "Delayed Invalidation", describes a potential delay in the invalidation of TLB entries, which can cause unpredictable results when accessing memory whose paging-structure entry has been changed.

The manual says: "Required invalidations may be delayed under some circumstances. Software developers should understand that, between the modification of a paging-structure entry and execution of the invalidation instruction recommended in Section 4.10.4.2, the processor may use translations based on either the old value or the new value of the paging-structure entry. The following items describe some of the potential consequences of delayed invalidation: If a paging-structure entry is modified to change the R/W flag from 0 to 1, write accesses to linear addresses whose translation is controlled by this entry may or may not cause a page-fault exception."

Let us suppose a simple case, where a paging-structure entry is modified (the R/W flag is flipped from 0 to 1) for a linear address, and the corresponding TLB invalidation instruction is executed immediately afterwards. My question is: as a consequence of delayed invalidation of the TLB, is it possible that even after the TLB invalidation has been executed, a write access to the linear address in question doesn't fault (page fault)?

Or can "delayed invalidation" cause unpredictable results only when the invalidation instruction for the linear address whose paging-structure entry has changed has not yet been issued?

Posted: **Thu Jul 28, 2011 6:44 am**

Delayed or lazy invalidation is an OS feature: when the OS knows that it has changed the page permissions in a way in which a stale TLB entry does not pose a security risk (e.g. changing a page from R to R/W, or mapping in a page), it can skip the TLB invalidation to save time. If the process then accesses the page and hits a stale entry, the #PF handler detects this situation, issues the INVLPG, and restarts the instruction.

This isn't really worthwhile on a uniprocessor machine; INVLPG is expensive but not that expensive, and most page operations for which lazy invalidation is a candidate fall into one of the following categories:

Reactive events, where the page-in/permission change is in response to a page fault. In this case it is highly likely that, after the return from the #PF, the access would fault again.
Speculative events, where the page-in/permission change happens because the kernel or OS thinks the application is likely to touch a page in the near future. If it's likely to touch the page, then lazy invalidation is counterproductive.

(Also remember that taking an interrupt or exception is expensive; AMD quotes about 150 cycles of total overhead, best case, for an INT n + IRET combo; the cost of hardware interrupts is unspecified.)

Lazy invalidation is for multiprocessor environments: When a lazy invalidation is possible, a TLB shootdown IPI to other cores running the application can be avoided. In the pessimistic case (the page is accessed and the entry is stale), you have the same number of interrupts/exceptions to process; in the optimistic case, you don't have any.

Posted: **Thu Jul 28, 2011 7:38 am**

Kunalnitin wrote: Let us suppose a simple case, where a paging-structure entry is modified (the R/W flag is flipped from 0 to 1) for a linear address, and the corresponding TLB invalidation instruction is executed immediately afterwards. My question is: as a consequence of delayed invalidation of the TLB, is it possible that even after the TLB invalidation has been executed, a write access to the linear address in question doesn't fault (page fault)?

If the R/W flag is 0 then the page is "read only", and if you change it to 1 the page becomes "read/write". If you invalidate immediately then the CPU will allow writes to the page immediately. If you don't invalidate immediately then the CPU may still think the page is "read only" when it isn't and generate a page fault if something writes to it. If the page fault handler checks what happened, realises the page fault was caused by a stale TLB entry, invalidates the TLB entry and then returns to the instruction that caused the page fault, then everything will appear to work as it should; but you've delayed the TLB invalidation until the page fault. Of course the translation might not be in the TLB when someone writes to the page, and you might not get the page fault at all (and might never actually invalidate the TLB entry because you didn't need to).

This is what Intel are talking about - delaying the TLB invalidation until the page fault occurs (and possibly delaying the TLB invalidation until "never" because it wasn't necessary in the end).
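As a rough illustration of the page fault handler check described above, here is a minimal user-space sketch. It is not taken from any real kernel: `handle_page_fault`, `tlb_invalidate_page`, and the two bookkeeping variables are hypothetical names, the handler only looks at the present and R/W bits, and the INVLPG is replaced by a stub so the code can run outside ring 0.

```c
#include <stdint.h>

/* Architectural x86 page-table entry flags. */
#define PTE_PRESENT  (1ULL << 0)
#define PTE_WRITABLE (1ULL << 1)

/* Architectural x86 page-fault error-code bit: the faulting access was a write. */
#define PF_ERR_WRITE (1u << 1)

enum pf_result { PF_SPURIOUS, PF_GENUINE };

/* Hypothetical bookkeeping so the sketch can be exercised in user space. */
unsigned invlpg_count;
uint64_t last_invalidated;

/* Hypothetical stand-in for INVLPG; a kernel would execute the
 * instruction on fault_addr here instead of counting calls. */
void tlb_invalidate_page(uint64_t linear_addr)
{
    invlpg_count++;
    last_invalidated = linear_addr;
}

/* Decide whether a fault is genuine or was caused by a stale TLB entry.
 * current_pte is the entry as it is in memory now, err is the #PF error code.
 * Simplification (assumption): user/supervisor and instruction-fetch
 * checks are left out. */
enum pf_result handle_page_fault(uint64_t current_pte, uint64_t fault_addr,
                                 uint32_t err)
{
    /* The page really is not mapped: a genuine fault. */
    if (!(current_pte & PTE_PRESENT))
        return PF_GENUINE;

    /* A write to a page that is still read-only in memory: genuine. */
    if ((err & PF_ERR_WRITE) && !(current_pte & PTE_WRITABLE))
        return PF_GENUINE;

    /* The in-memory entry allows the access, so the CPU must have used a
     * stale translation. Do the delayed invalidation now; the caller then
     * returns to the faulting instruction, which is retried. */
    tlb_invalidate_page(fault_addr);
    return PF_SPURIOUS;
}
```

A genuine fault would go on to the usual handling (demand paging, copy-on-write, killing the process, and so on), which is outside the scope of this sketch.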

The same applies whenever you grant more access to a page. E.g. if you change a page from "not present" to "present", if you change it from "no execute" to "executable", etc. In all of these cases, if the CPU uses the stale TLB information it's "safe" as the page fault handler can correct the stale TLB entry.

However, if you do the opposite and change a page from "read/write" to "read only" without invalidating, then the CPU may think the page is writable when it isn't, and someone could write to the page when they shouldn't be able to. This can lead to problems and therefore the TLB invalidation should be done immediately (and shouldn't be delayed).

The same applies whenever you remove/restrict access to a page. E.g. if you change a page from "present" to "not present", if you change it from "executable" to "no execute", etc. In these cases, if the CPU uses the stale TLB information you won't get a page fault - it's "unsafe" and you should invalidate immediately.
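The "safe" versus "unsafe" distinction can be expressed as a comparison of the old and new entry. The following is only a sketch under stated assumptions: `pte_change_can_be_lazy` is a hypothetical helper, it considers 4 KiB entries in 64-bit format, and it ignores accessed/dirty, global, and caching attributes. It also treats a change of the physical frame as unsafe, since a stale entry would keep pointing at the old frame.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86-64 page-table entry bits. */
#define PTE_PRESENT   (1ULL << 0)
#define PTE_WRITABLE  (1ULL << 1)
#define PTE_USER      (1ULL << 2)
#define PTE_NX        (1ULL << 63)
#define PTE_ADDR_MASK 0x000FFFFFFFFFF000ULL

/* Rights granted by an entry, with NX inverted so that a set bit
 * always means "more access". */
uint64_t pte_rights(uint64_t pte)
{
    return (pte & (PTE_WRITABLE | PTE_USER)) | (~pte & PTE_NX);
}

/* Hypothetical helper: returns true if a CPU that keeps using a stale
 * copy of old_pte can at worst take an extra page fault, and can never
 * get access that new_pte denies. */
bool pte_change_can_be_lazy(uint64_t old_pte, uint64_t new_pte)
{
    /* Not-present entries are not cached, so nothing can be stale. */
    if (!(old_pte & PTE_PRESENT))
        return true;

    /* Present -> not present: a stale entry still allows access. */
    if (!(new_pte & PTE_PRESENT))
        return false;

    /* Different frame: a stale entry reaches the wrong memory. */
    if ((old_pte & PTE_ADDR_MASK) != (new_pte & PTE_ADDR_MASK))
        return false;

    /* Safe only if every right the old entry granted is still granted. */
    return (pte_rights(old_pte) & ~pte_rights(new_pte)) == 0;
}
```

For example, read-only to read/write on the same frame returns true, while read/write to read-only, or executable to no-execute, returns false.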

On a single-CPU system, delaying the TLB invalidation is a waste of time in most cases - page faults are relatively expensive, and it's faster to just invalidate the page immediately. However, for multi-CPU, if the TLB needs to be invalidated in other CPUs then you can't just use "INVLPG" on one CPU - you have to do the full "multi-CPU TLB shootdown" thing, which involves sending an IPI to any/all CPUs that could be affected by the change. The "multi-CPU TLB shootdown" thing is very expensive (because of the IPI) and the more CPUs there are the worse it gets. This is where delaying the TLB invalidation (where possible) is important, because you avoid sending the IPI to any/all other CPUs. This is what is normally called "lazy TLB shootdown".

To summarise, each time you modify the paging structures (page table/s, page directory/s, etc), you'd:

invalidate the TLB on the current CPU
determine if the stale TLB entry could be used in an "unsafe" way:
- if "no", then other CPUs can rely on their page fault handlers to invalidate the TLB entries and you don't need to do anything else
- if "yes", then determine if any other CPUs could be affected by the change:
  - if "yes", then send an IPI to any/all other CPUs that could be affected, to ensure they invalidate their TLB too
  - if "no" (e.g. page belongs to a single-threaded process that can't be running on any other CPU) then you don't have to do anything else
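The steps above could be sketched roughly as follows. This only computes what needs to be done and does not touch any hardware; `cpu_mask_t`, `struct shootdown_plan`, and `plan_tlb_shootdown` are hypothetical names, and the sketch assumes at most 32 CPUs with one bit per CPU in the mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU set: bit n is set if CPU n may hold the translation
 * (e.g. it is running a thread of the affected address space). */
typedef uint32_t cpu_mask_t;

struct shootdown_plan {
    bool       invalidate_local; /* execute INVLPG on the current CPU */
    cpu_mask_t ipi_targets;      /* CPUs that must be sent a shootdown IPI */
};

/* Decide what to do after modifying a paging-structure entry.
 * stale_use_is_unsafe: the change removes or restricts access, so a
 *                      stale TLB entry cannot be left to the #PF handler.
 * cpus_using_space:    CPUs that could have the old translation cached.
 * current_cpu:         index of the CPU making the change (below 32). */
struct shootdown_plan plan_tlb_shootdown(bool stale_use_is_unsafe,
                                         cpu_mask_t cpus_using_space,
                                         unsigned current_cpu)
{
    struct shootdown_plan plan;

    /* Always invalidate on the current CPU. */
    plan.invalidate_local = true;
    plan.ipi_targets = 0;

    /* "Safe" changes: other CPUs fix themselves up in their #PF handlers. */
    if (!stale_use_is_unsafe)
        return plan;

    /* "Unsafe" changes: every other CPU that could be affected gets an IPI.
     * The mask is empty for a single-threaded process on this CPU only. */
    plan.ipi_targets = cpus_using_space & ~((cpu_mask_t)1 << current_cpu);
    return plan;
}
```

A caller would execute INVLPG locally and then, if `ipi_targets` is non-zero, send the IPI to those CPUs and wait for them to acknowledge before relying on the new permissions.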

[EDIT: Doh - seems Owen was much faster than me]

Cheers,

Brendan

Posted: **Fri Jul 29, 2011 2:18 am**

Thanks, Owen and Brendan, for your explanations. They helped clear up my doubts. I was under the impression that the CPU (Intel processor) could delay TLB invalidation even after the programmer has issued the INVLPG instruction. My bad; I misinterpreted the manual.

Can Intel processors delay TLB invalidations?
