LRU on SMP memory manager

nexos · Post by **nexos** » Sat Jul 24, 2021 6:05 am

Hello,
I have been planning out my memory manager, but I had one question come to mind. I know many memory managers use the "clock algorithm", which is really approximate LRU. But, one question I have is how does this work on SMP? For example, if the clock hand changes the referenced bit, wouldn't that require a shootdown? So my question is, how do modern systems implement page replacement algorithms?
Thanks,
nexos

thewrongchristian · Post by **thewrongchristian** » Sat Jul 24, 2021 8:31 am

nexos wrote:Hello,
I have been planning out my memory manager, but I had one question come to mind. I know many memory managers use the "clock algorithm", which is really approximate LRU. But, one question I have is how does this work on SMP? For example, if the clock hand changes the referenced bit, wouldn't that require a shootdown? So my question is, how do modern systems implement page replacement algorithms?
Thanks,
nexos

The same way as UP, but with TLB shoot downs as required, as you say. How a page mapping is changed to clear a reference bit is very platform specific, so the implementation details are hidden behind an abstraction, but the process of scanning memory for reference bits to clear should be platform agnostic, as the referenced bit should be maintained in a central page frame database. [1]

In clock, the scan rate generally tends to be proportional to memory pressure. No memory pressure, very little scanning, and so little overhead from resulting TLB shoot downs. Under high memory pressure, you'll notice thrashing overhead more than TLB shoot down overhead, so it is worth a little bit of pain to avoid thrashing.

Of course, this very much depends on your abstractions. The SVR4 and BSD VMM have a much cleaner separation between platform independent and platform specific code, than say, Linux. Linux is more like how 4 BSD prior to 4.4 expected platform specific memory management to look VAX like, and interpret the VAX like page tables onto the local CPU. [2]

Sun threw that out for SunOS (originally 4.1 BSD based?), and replaced it with a VM subsystem that used a much more abstract interface to the MMU using HAT (Hardware Address Translation) API, and was eventually subsumed into SVR4.

4.4 BSD also threw that out, replacing it with the Mach VM, again with a much more abstract interface to the MMU using the pmap API.

But in both the SunOS/SVR4 and BSD/MachVM cases, the page scanning is very much done in the platform independent portion of the VMM, and both have since scaled quite happily SMP wise.

[1] Of course, how the reference bit makes it from the platform MMU to the platform independent page frame database is the crux of the problem. My own personal code doesn't export that information from the hardware reference bit, and instead is filled in by software in response to page faults. Thus, no reference bit is required in hardware, so I should be fine when I port to VAX

[2] Linux and Windows provide an idealised multi-level page table in which to communicate mappings, including maintenance of reference information. I'm not sure if Linux/Windows scans these page table structures directly for reference information, or whether that information is also mirrored to the central page database.

nexos · Post by **nexos** » Sat Jul 24, 2021 8:51 am

Ok thanks! I guess I'll be looking into lazy shootdown algorithms

Korona · Post by **Korona** » Sun Jul 25, 2021 5:39 am

The clock algorithm (at least the one described on Wikipedia) requires a reverse map (from physical to virtual addresses) though, if the CPU has virtually-mapped access bits. That sounds quite inconvenient. How do OSes that use such an algorithm typically solve this issue?

thewrongchristian · Post by **thewrongchristian** » Tue Jul 27, 2021 1:44 pm

Korona wrote:The clock algorithm (at least the one described on Wikipedia) requires a reverse map (from physical to virtual addresses) though, if the CPU has virtually-mapped access bits. That sounds quite inconvenient. How do OSes that use such an algorithm typically solve this issue?

MachVM/BSD provides pmap_is_referenced, which takes a vm_page structure.

Rather than cycling through the physical page database, the BSDs link the vm_page objects onto LRU queues, so it doesn't look like even the BSDs require a reverse map. Forward mapping is enough.

I may be reading it wrong though, I was going from the NetBSD code here:

uvmpdpol_selectvictim

OSDev.org

LRU on SMP memory manager

LRU on SMP memory manager

Re: LRU on SMP memory manager

Re: LRU on SMP memory manager

Re: LRU on SMP memory manager

Re: LRU on SMP memory manager