Secondary Level address translation (SLAT) TLB structure

cianfa72
Member
Posts: 73
Joined: Sat Dec 22, 2012 12:01 pm

Secondary Level address translation (SLAT) TLB structure

Post by cianfa72 »

Hi,

I hope this is the right place to ask the following (I have posted the same question on other forums).

Consider a CPU that supports Second Level Address Translation (SLAT), i.e. Intel EPT or AMD RVI. TLB caching is used to improve translation performance, especially when SLAT is enabled.

Now consider the case in which the guest page size differs from the host page size (to fix ideas, suppose the guest page size is 4KB while the (SLAT) host page size is 2MB). In this scenario, to support guest virtual -> host physical translation, I guess TLB entries need to cache both levels of translation: guest virtual -> guest physical (GVA->GPA) and guest physical -> host physical (GPA->HPA). Otherwise, with different page sizes, how can the TLB lookup be implemented so that it matches the direct GVA->HPA mapping?

thanks.
stlw
Member
Posts: 357
Joined: Fri Apr 04, 2008 6:43 am

Re: Secondary Level address translation (SLAT) TLB structure

Post by stlw »

The processor TLB never keeps GVA->GPA mappings; it holds only GVA->HPA mappings.
The final page size is the minimum of the two, i.e. in your example the final translation is stored in a min(2M, 4K) = 4K TLB entry.
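
To make the min() rule concrete, here is a minimal Python sketch. It is only a toy model: the two dictionaries stand in for the guest page tables and the EPT/RVI tables, and the function name, frame numbers and addresses are made up for illustration. It shows that a 4K guest page backed by a 2M host page ends up as a single 4K GVA->HPA entry.

```python
PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024


def combined_tlb_entry(gva, guest_map, guest_size, host_map, host_size):
    """Return (gva_base, hpa_base, size) for the one TLB entry covering gva.

    guest_map: guest virtual page number -> guest physical frame number
    host_map:  guest physical frame number -> host physical frame number
    Both maps are toy stand-ins for the real page tables (assumption).
    """
    # Stage 1 (guest page tables): GVA -> GPA.
    gpa = guest_map[gva // guest_size] * guest_size + gva % guest_size
    # Stage 2 (EPT/RVI tables): GPA -> HPA.
    hpa = host_map[gpa // host_size] * host_size + gpa % host_size
    # Only the combined GVA -> HPA result is cached, at the smaller size.
    size = min(guest_size, host_size)
    return (gva - gva % size, hpa - hpa % size, size)


# 4K guest page backed by a 2M host page, as in the question.
entry = combined_tlb_entry(
    gva=0x12345678,
    guest_map={0x12345: 0x777}, guest_size=PAGE_4K,
    host_map={0x3: 0x40}, host_size=PAGE_2M,
)
print([hex(x) for x in entry])  # ['0x12345000', '0x8177000', '0x1000']
```

The GPA (0x777678 here) is only an intermediate value of the walk; it never appears in the cached entry.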

The page miss handler (which services TLB misses and performs the actual page walks) also has its own caches for the intermediate walk levels, such as PML4, PDPTE and PDE.
There are separate caches for regular and EPT translations, so the PML4 cache actually keeps the GVA->guest PML4 mapping while the EPML4 cache keeps the GPA->EPT PML4 mapping.
These are purely translation caches whose mappings do not point directly to an HPA, and as you can see, page size is not relevant for them.
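
As a rough illustration of why page size does not matter for these caches, here is a small Python sketch of how the walk caches could be tagged. It assumes the bit ranges the Intel SDM gives for the paging-structure caches with 4-level paging (47:39, 47:30, 47:21); the function name is made up, and real hardware details (and the EPT-side caches, which would use GPA bits instead) may differ.

```python
def walk_cache_tags(addr):
    """Upper address bits used as tags for each intermediate walk cache."""
    return {
        "PML4": (addr >> 39) & 0x1FF,       # bits 47:39
        "PDPTE": (addr >> 30) & 0x3FFFF,    # bits 47:30
        "PDE": (addr >> 21) & 0x7FFFFFF,    # bits 47:21
    }


# Two different 4K pages inside the same 2M-aligned region.
a = 0x00007F1234567000
b = a + 0x1000
tags_a = walk_cache_tags(a)
tags_b = walk_cache_tags(b)
print(tags_a == tags_b)  # True
```

The tags only depend on the upper address bits, so the same cached entries are reused whether the walk finally ends in a 4K or a 2M page.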

Stanislav
cianfa72
Member
Posts: 73
Joined: Sat Dec 22, 2012 12:01 pm

Re: Secondary Level address translation (SLAT) TLB structure

Post by cianfa72 »

Thanks Stanislav

My question was about the benefit, in terms of less frequent TLB misses, of using host-level large pages (2MB) to back guest OS pages.

In the previous example (4KB guest and 2MB host page size), given that the number of available TLB entries is fixed, I do not see any benefit. If I understand correctly, there is a benefit (from the TLB miss point of view) only when the guest OS and the hypervisor (VMM) use the same page size (in the example, only when the guest OS itself uses 2MB large pages).

Does that sound correct?
stlw
Member
Posts: 357
Joined: Fri Apr 04, 2008 6:43 am

Re: Secondary Level address translation (SLAT) TLB structure

Post by stlw »

Speaking about the TLB only, you are right.
But on a TLB miss, a large-page walk is one level shorter, and this also brings some benefit.
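
To put a number on the shorter walk, here is a small Python sketch of the usual worst-case count of memory references for a two-dimensional (nested) page walk. It assumes no TLB or walk-cache hits at all, and that every GPA touched during the walk (including the guest page tables themselves) is backed by the same host page size; the function name is made up for illustration.

```python
def nested_walk_refs(guest_levels, host_levels):
    """Worst-case memory references for one nested page walk.

    Each of the guest_levels guest table pointers, plus the final GPA,
    needs a full host walk (host_levels reads each), and each guest
    table entry itself costs one more read.
    """
    return (guest_levels + 1) * host_levels + guest_levels


walks = {
    "guest 4K / host 4K": nested_walk_refs(4, 4),
    "guest 4K / host 2M": nested_walk_refs(4, 3),
    "guest 2M / host 4K": nested_walk_refs(3, 4),
    "guest 2M / host 2M": nested_walk_refs(3, 3),
}
for name, refs in walks.items():
    print(name, refs)  # 24, 19, 19, 15 respectively
```

Under these assumptions a 2M page on either side cuts one level out of the corresponding dimension, and the host-side saving is repeated for every host walk inside the nested walk.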

BTW, another observation: the number of large-page entries cached in the TLB is usually much smaller.
For example, on early Core processors large pages were not cached in the 2nd-level TLB, which is much bigger than the 1st-level TLB.
So if accesses are sparse enough and don't exploit the locality of a 2M page, it would be even better to use 4K pages, also for TLB capacity reasons.
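
The capacity effect can be sketched with a toy LRU TLB in Python. The entry counts (512 entries for 4K pages, 32 for 2M pages) and the access pattern are hypothetical, picked only to show the effect; they do not describe any particular processor.

```python
from collections import OrderedDict

PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024


def count_misses(addresses, page_size, entries):
    """Count misses of a fully associative LRU TLB with `entries` slots."""
    tlb = OrderedDict()
    misses = 0
    for addr in addresses:
        page = addr // page_size
        if page in tlb:
            tlb.move_to_end(page)        # hit: mark as most recently used
        else:
            misses += 1
            tlb[page] = True
            if len(tlb) > entries:
                tlb.popitem(last=False)  # evict the least recently used
    return misses


# Sparse pattern: one address in each of 64 different 2M regions, 10 rounds.
trace = [region * PAGE_2M for region in range(64)] * 10

misses_4k = count_misses(trace, PAGE_4K, entries=512)
misses_2m = count_misses(trace, PAGE_2M, entries=32)
print(misses_4k, misses_2m)  # 64 640
```

With this pattern the 4K configuration only takes the 64 cold misses, while the 2M configuration misses on every access because 64 regions do not fit in 32 entries.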

Stanislav
cianfa72
Member
Posts: 73
Joined: Sat Dec 22, 2012 12:01 pm

Re: Secondary Level address translation (SLAT) TLB structure

Post by cianfa72 »

just to make sure I got it right :roll:
stlw wrote:If speaking about TLB only - you are right.
But in case of TLB miss large page walk is 1-level shorter and this also would have some benefits.
The benefit here, due to the one-level-shorter page table walk, applies to each of the two translation levels (GVA->GPA and GPA->HPA), right?
stlw wrote: BTW, some observation too - amount of large page entries cached in the TLB is much smaller usually.
For example on early Core processors large pages were not cached in the 2nd level TLB which is much bigger than 1st level.
So if accesses are sparse enough and don't utilize locality of 2M page - it would be even better to use 4K pages also for TLB capacity reasons.
Here, with sufficiently sparse memory accesses, the benefit comes just from the higher number of 4KB TLB entries available in the 1st- and 2nd-level TLBs (as opposed to when min(guest page size, host page size) = 2MB), right?

thanks
stlw
Member
Posts: 357
Joined: Fri Apr 04, 2008 6:43 am

Re: Secondary Level address translation (SLAT) TLB structure

Post by stlw »

stlw wrote:If speaking about TLB only - you are right.
But in case of TLB miss large page walk is 1-level shorter and this also would have some benefits.
the benefits here - due to the 1-level shorter page table walk - apply to each of the two translation level (GVA->GPA and GPA->HPA), right ?

Of course, it depends on where you have 2MB pages.
stlw wrote: BTW, some observation too - amount of large page entries cached in the TLB is much smaller usually.
For example on early Core processors large pages were not cached in the 2nd level TLB which is much bigger than 1st level.
So if accesses are sparse enough and don't utilize locality of 2M page - it would be even better to use 4K pages also for TLB capacity reasons.
here -in conditions of sparse enough memory accesses - the benefit is due just to the higher number of 4KB TLB entries available on 1nd and 2nd level TLB (as opposed when min (guest page size, host page size) = 2 MB), right ?

Exactly.