Paging again
Posted: Sun Nov 27, 2005 4:53 am
by B.E
For the last two days I have been looking at paging some more. There are just some things that I need to clarify:
1 The goal of memory management is to keep the number of page faults per second low.
2 The TLB is a cache of recently accessed page-table entries. If the entry for a page is not currently in the TLB, the CPU requires more cycles to access the page.
3 When you context switch, you change CR3 to point to the new process's address space. But in processor manual 3, page 3-37, it says:
All of the (non-global) TLBs are automatically invalidated any time the CR3 register is loaded
(unless the G flag for a page or page-table entry is set, as described later in this section).
If all of the above is true, does this mean that every task switch will invalidate the TLB and therefore slow the whole system down? If so, what is the point of the TLB?
Re:Paging again
Posted: Sun Nov 27, 2005 5:45 am
by falconfx
1) True/False. As far as I know it isn't a goal of the MM as such; I think you should limit page faults but not try to avoid them entirely.
2) True. That is why the CPU uses the TLB.
3) True. At every CR3 change, the CPU clears the (non-global) TLB entries and starts walking the new page directory. Though I think the slowdown is close to nil on today's CPUs.
I hope this helps. I might be wrong, though: nobody's perfect.
Re:Paging again
Posted: Sun Nov 27, 2005 6:53 am
by Brendan
Hi,
B.E wrote:1 The goal of memory management is to keep the number of page faults per second low.
I think you mean one of the goals of memory management is to be efficient while supporting desired features. For example, to reduce the number of page faults while still providing features like swap space, rather than minimizing page faults by not implementing these features.
B.E wrote:2 The TLB is a cache of recently accessed page-table entries. If the entry for a page is not currently in the TLB, the CPU requires more cycles to access the page.
Yes.
B.E wrote:3 When you context switch, you change CR3 to point to the new process's address space. But in processor manual 3, page 3-37, it says:
All of the (non-global) TLBs are automatically invalidated any time the CR3 register is loaded
(unless the G flag for a page or page-table entry is set, as described later in this section).
If all of the above is true, does this mean that every task switch will invalidate the TLB and therefore slow the whole system down? If so, what is the point of the TLB?
When you do a context switch, all of the old TLB entries will be for pages that aren't used in the new context. Therefore it makes no difference if they are invalidated or not.
The only exceptions are pages that are the same in all address spaces (typically the kernel itself). If the TLB entries for these pages were invalidated it would affect performance, because the CPU could otherwise have re-used them. That's why these pages are marked as "global", so that their TLB entries aren't flushed.
It's also why you should use the INVLPG instruction to flush specific (changed) TLB entries in your linear memory manager rather than reloading CR3.
Cheers,
Brendan
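To make the behaviour described above concrete, here is a toy C model of a TLB. It is only a sketch, not kernel code: the real CR3 load and INVLPG are privileged instructions, and a real TLB is set-associative with hardware eviction rather than a flat 8-slot array. All of the names (`toy_tlb`, `tlb_fill`, `tlb_hit`, `tlb_load_cr3`, `tlb_invlpg`) are made up for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SLOTS  8u   /* assumption: tiny size chosen for the example */
#define PAGE_SHIFT 12u  /* 4 KB pages */

/* One cached translation: virtual page number -> physical frame number. */
struct tlb_entry {
    bool     valid;
    bool     global;  /* mirrors the G flag of the page-table entry */
    uint32_t vpn;
    uint32_t pfn;
};

struct toy_tlb {
    struct tlb_entry slot[TLB_SLOTS];
};

/* Cache a translation. Returns false if no slot is free
 * (a real TLB would evict an old entry instead). */
static bool tlb_fill(struct toy_tlb *t, uint32_t vaddr, uint32_t paddr, bool global)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (!t->slot[i].valid) {
            t->slot[i] = (struct tlb_entry){
                .valid = true, .global = global,
                .vpn = vaddr >> PAGE_SHIFT, .pfn = paddr >> PAGE_SHIFT,
            };
            return true;
        }
    }
    return false;
}

/* True if the page containing vaddr has a cached translation. */
static bool tlb_hit(const struct toy_tlb *t, uint32_t vaddr)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (t->slot[i].valid && t->slot[i].vpn == (vaddr >> PAGE_SHIFT))
            return true;
    }
    return false;
}

/* Models a CR3 load: every non-global entry is dropped. */
static void tlb_load_cr3(struct toy_tlb *t)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (!t->slot[i].global)
            t->slot[i].valid = false;
    }
}

/* Models INVLPG: the entry for one page is dropped, global or not. */
static void tlb_invlpg(struct toy_tlb *t, uint32_t vaddr)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (t->slot[i].valid && t->slot[i].vpn == (vaddr >> PAGE_SHIFT))
            t->slot[i].valid = false;
    }
}
```

If you fill the model with one global kernel entry and two user entries, `tlb_invlpg` on one user page removes just that entry, and a following `tlb_load_cr3` leaves only the kernel entry cached.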
Re:Paging again
Posted: Mon Nov 28, 2005 4:49 am
by Pype.Clicker
falconfx wrote:
Though I think the slowdown is close to nil on today's CPUs.
Well, afaik, the slowdown due to a TLB flush is still perceptible: every page that is referenced right after the switch will require a page-table lookup in main memory (most likely, unless your L1 or L2 cache catches it, which I wouldn't count on since they're probably full of code and data from the old task).
And memory access is still the major bottleneck on our machines :'(
That's precisely why "global pages" (that are expected to have the same mappings in every address space) and address-specific TLB invalidation have been added.
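For reference, marking a page as global is just one bit in its page-table entry. The sketch below assumes 32-bit non-PAE paging with 4 KB pages and a higher-half kernel at 0xC0000000; `KERNEL_BASE`, `make_pte` and `cr4_with_global_pages` are hypothetical names for this example. Note that the G bit only has an effect once the PGE bit in CR4 has been set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits of a 32-bit (non-PAE) page-table entry for a 4 KB page. */
#define PTE_PRESENT  (1u << 0)
#define PTE_WRITABLE (1u << 1)
#define PTE_USER     (1u << 2)
#define PTE_GLOBAL   (1u << 8)  /* G flag: entry survives a CR3 reload */

/* CR4.PGE must be set, otherwise the G flag is ignored. */
#define CR4_PGE      (1u << 7)

/* Assumption for this sketch: kernel mapped at 3 GB in every address space. */
#define KERNEL_BASE  0xC0000000u

/* Build a PTE; kernel pages are marked global, user pages are not. */
static uint32_t make_pte(uint32_t frame_paddr, uint32_t vaddr, bool writable)
{
    uint32_t pte = (frame_paddr & 0xFFFFF000u) | PTE_PRESENT;

    if (writable)
        pte |= PTE_WRITABLE;
    if (vaddr >= KERNEL_BASE)
        pte |= PTE_GLOBAL;  /* identical mapping in all address spaces */
    else
        pte |= PTE_USER;    /* per-process page, flushed on CR3 reload */
    return pte;
}

/* Value to write back to CR4 to enable global pages. */
static uint32_t cr4_with_global_pages(uint32_t cr4)
{
    return cr4 | CR4_PGE;
}
```

In a real kernel the result of `cr4_with_global_pages` would be written to CR4 with a privileged `mov`, which is left out here so the example stays runnable in user space.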
Re:Paging again
Posted: Mon Nov 28, 2005 5:37 pm
by B.E
The problem with INVLPG is that you have to invalidate every page individually, every time. I was reading the optimization manual and it says:
The TLB miss results in a performance degradation since another memory access must be performed (assuming that the translation is not already present in the processor caches) to update the TLB. The TLB can be preloaded with the page table entry for the next desired page by accessing (or touching) an address in that page. This is similar to prefetch, but instead of a data cache line the page table entry is being loaded in advance of its use. This helps to ensure that the page table entry is resident in the TLB and that the prefetch happens as requested subsequently.
So you could just reload CR3, "touch" the code, data and stack pages, and make the kernel pages global, which would be quicker than manually invalidating the pages one by one.
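The "touching" idea from the quoted manual text could look roughly like this in C: read one byte from every page in a range so that the CPU walks the page tables and caches the translations. This is only a sketch under the assumption of 4 KB pages; `touch_pages` is a made-up helper, and whether preloading actually pays off after a task switch would have to be measured.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KB pages */

/* Read one byte from each page that overlaps [start, start + len).
 * Returns the number of pages touched. The volatile read keeps the
 * compiler from optimizing the access away. */
static size_t touch_pages(const void *start, size_t len)
{
    const volatile unsigned char *p = start;
    const volatile unsigned char *end = p + len;
    size_t touched = 0;

    while (p < end) {
        (void)*p;  /* forces a translation for this page */
        touched++;

        /* Distance from p to the start of the next page. */
        uintptr_t next = ((uintptr_t)p & ~(uintptr_t)(PAGE_SIZE - 1)) + PAGE_SIZE;
        size_t step = (size_t)(next - (uintptr_t)p);

        if (step >= (size_t)(end - p))
            break;  /* next page starts at or past the end of the range */
        p += step;
    }
    return touched;
}
```

After reloading CR3 you would call something like this on the new task's code, data and stack ranges. Each touch still costs a page-table walk, so it moves the TLB misses to a convenient moment rather than removing them.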