memory management with multiple cores

z0rr0
Member
Posts: 64
Joined: Sun Apr 03, 2005 11:00 pm
Location: Grenoble, France

Re: memory management with multiple cores

Post by z0rr0 »

In my case, I am trying to understand whether this memory model can be used to speed up MPI applications by running them on a unikernel, with each core running one instance of the MPI application. I presented a first PoC of this at https://fosdem.org/2023/schedule/event/ ... ernel_mpi/.
azblue
Member
Posts: 147
Joined: Sat Feb 27, 2010 8:55 pm

Re: memory management with multiple cores

Post by azblue »

nexos wrote: No, that leads to asymmetric (i.e., your OS will be AMP not SMP) treatment of the CPUs, which will turn into a bottleneck. Instead, I would start out by having global memory management structures (i.e., your physical memory slab / bitmap / free list, and your virtual memory structures) that all CPUs work with equally, as if there were one CPU.

One catch though: these global structures will need to have locks. If you haven't looked into locking, now's the time to do it - it comes up everywhere in multi-processor systems.

The other option would be to have per-CPU memory management structures - that would be lockless (which is faster), but would be a pain to implement right, as memory is typically a global resource, not a per-CPU resource.

One great guide to memory management is on the wiki: https://wiki.osdev.org/Brendan%27s_Memo ... ment_Guide
One idea I had recently was to combine the two:
Each core allocates a few megabytes from the global memory structure (which requires taking the lock), then allocates out of that local memory locklessly. The slower, locked structures only need to be accessed when the local memory is used up.
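To make the idea concrete, here is a minimal sketch in C, written as a hosted program rather than kernel code. All names and sizes (`NCPUS`, `TOTAL_PAGES`, `BATCH`, `page_alloc`, the lock counter) are made up for illustration, and it assumes the caller already keeps interrupts and preemption away from the per-core cache while it allocates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NCPUS       4     /* hypothetical core count */
#define TOTAL_PAGES 1024  /* hypothetical size of the global pool, in pages */
#define BATCH       16    /* pages moved to a core on each refill */

/* Global pool: a stack of free page frame numbers guarded by a spinlock. */
static atomic_flag global_lock = ATOMIC_FLAG_INIT;
static uint32_t global_free[TOTAL_PAGES];
static size_t global_count;
static unsigned long global_lock_acquisitions; /* statistic, updated under the lock */

/* Per-core cache: touched only by its own core in this sketch. */
struct cpu_cache {
    uint32_t pages[BATCH];
    size_t count;
};
static struct cpu_cache cpu_caches[NCPUS];

static void global_lock_acquire(void) {
    while (atomic_flag_test_and_set_explicit(&global_lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
    global_lock_acquisitions++;
}

static void global_lock_release(void) {
    atomic_flag_clear_explicit(&global_lock, memory_order_release);
}

void pool_init(void) {
    global_count = 0;
    for (uint32_t pfn = 0; pfn < TOTAL_PAGES; pfn++)
        global_free[global_count++] = pfn;
    for (size_t i = 0; i < NCPUS; i++)
        cpu_caches[i].count = 0;
    global_lock_acquisitions = 0;
}

/* Returns a page frame number, or -1 when the global pool is exhausted.
 * Assumption: interrupts/preemption are disabled on the calling core. */
int64_t page_alloc(unsigned cpu) {
    assert(cpu < NCPUS);
    struct cpu_cache *c = &cpu_caches[cpu];
    if (c->count == 0) {
        /* Slow path: refill a whole batch under the global lock. */
        global_lock_acquire();
        while (c->count < BATCH && global_count > 0)
            c->pages[c->count++] = global_free[--global_count];
        global_lock_release();
        if (c->count == 0)
            return -1;
    }
    /* Fast path: no global lock taken. */
    return c->pages[--c->count];
}
```

With these numbers, a core takes the global lock once per 16 allocations. Freeing is left out on purpose, since a page can be freed on a different core than the one that allocated it.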
nullplan
Member
Posts: 1780
Joined: Wed Aug 30, 2017 8:24 am

Re: memory management with multiple cores

Post by nullplan »

azblue wrote: One idea I had recently was to combine the two:
Each core allocates a few megabytes from the global memory structure (which requires taking the lock), then allocates out of that local memory locklessly. The slower, locked structures only need to be accessed when the local memory is used up.
Sounds a bit like jemalloc. Essentially, you allocate a malloc arena for each thread separately; only, in your case, it would be for each CPU. OK, that is possible, but you have to be careful about throwing terms such as "lockless" around. Since a block allocated by one CPU may be freed by another, each block must record which arena it came from, and multiple CPUs can still access the same arena at the same time. And even on a single CPU, you must prevent interrupts and preemption while allocating memory, or else the data structures can be left in an inconsistent state for a long time. The lock is going to be rarely contended, and most modern lock implementations only issue a single atomic instruction in that case, but you do need a lock nonetheless.
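A rough sketch of what I mean, in hosted C rather than kernel code. The names (`struct arena`, `arena_alloc`, `arena_free`) and the fixed sizes are invented for illustration, and this is not how jemalloc itself is implemented. In a real kernel the lock helpers would also have to disable interrupts and preemption, which is not shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NCPUS            4   /* hypothetical core count */
#define BLOCKS_PER_ARENA 8   /* hypothetical arena capacity */
#define BLOCK_PAYLOAD    64  /* hypothetical fixed block size, in bytes */

struct arena;

struct block {
    struct arena *owner;  /* arena this block came from, set once at init */
    struct block *next;   /* free-list link, meaningful only while free */
    unsigned char payload[BLOCK_PAYLOAD];
};

struct arena {
    atomic_flag lock;     /* rarely contended, but still required */
    struct block *free_list;
    size_t free_count;
    struct block storage[BLOCKS_PER_ARENA];
};

static struct arena arenas[NCPUS];

static void arena_lock(struct arena *a) {
    while (atomic_flag_test_and_set_explicit(&a->lock, memory_order_acquire))
        ; /* uncontended case: a single atomic instruction */
}

static void arena_unlock(struct arena *a) {
    atomic_flag_clear_explicit(&a->lock, memory_order_release);
}

void arenas_init(void) {
    for (size_t i = 0; i < NCPUS; i++) {
        struct arena *a = &arenas[i];
        atomic_flag_clear(&a->lock);
        a->free_list = NULL;
        a->free_count = 0;
        for (size_t j = 0; j < BLOCKS_PER_ARENA; j++) {
            struct block *b = &a->storage[j];
            b->owner = a;
            b->next = a->free_list;
            a->free_list = b;
            a->free_count++;
        }
    }
}

/* Allocate from the calling CPU's own arena; returns NULL if it is empty. */
void *arena_alloc(unsigned cpu) {
    assert(cpu < NCPUS);
    struct arena *a = &arenas[cpu];
    arena_lock(a);
    struct block *b = a->free_list;
    if (b != NULL) {
        a->free_list = b->next;
        a->free_count--;
    }
    arena_unlock(a);
    return b != NULL ? b->payload : NULL;
}

/* May be called from any CPU: the header says which arena to return to. */
void arena_free(void *p) {
    struct block *b =
        (struct block *)((unsigned char *)p - offsetof(struct block, payload));
    struct arena *a = b->owner;
    arena_lock(a);
    b->next = a->free_list;
    a->free_list = b;
    a->free_count++;
    arena_unlock(a);
}

size_t arena_free_count(unsigned cpu) {
    assert(cpu < NCPUS);
    struct arena *a = &arenas[cpu];
    arena_lock(a);
    size_t n = a->free_count;
    arena_unlock(a);
    return n;
}
```

Note that `arena_free` takes no CPU argument at all. The block header decides where the memory goes, which is exactly why the owning CPU cannot assume it has its arena to itself.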
Carpe diem!