Questions on memory management / buddy allocation

I've gotten to the point where I want to add some basic memory management, so I've been reading up on all sorts of ways to implement things. The buddy algorithm seems to make the most sense to me, but I have a few questions.
I like to think of a system in terms of layers, so I envision that allocating memory would consist of a few layers where each layer calls the one below:
malloc/kmalloc etc. (handles management of the allocated pages, and would be implemented in user-space as well)
page mapping (handles page tables and mapping of physical pages to virtual ones)
page allocation (handles allocation of physical pages, which is where the buddy algorithm would go)
I'd like to know whether this is a sensible division, or whether I'm missing something or making things needlessly complicated.
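In code, the division I picture looks roughly like this (a sketch only; every name below is made up):

[code]
#include <stddef.h>
#include <stdint.h>

/* Layer 3: heap allocator - manages objects inside already-mapped pages
 * (the same idea can be implemented as malloc in user space). */
void *kmalloc(size_t size);

/* Layer 2: page mapping - maintains page tables, wiring one physical
 * page to one virtual address. */
void vmm_map_page(uintptr_t virt, uintptr_t phys);

/* Layer 1: page allocation - hands out physical pages in blocks of
 * 2^order; this is where the buddy algorithm would live. */
uintptr_t pmm_alloc_pages(unsigned order);
void pmm_free_pages(uintptr_t phys, unsigned order);
[/code]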
Secondly, I noticed that the buddy algorithm lets you specify an order for the allocation size, where 2^order is the number of pages allocated. But since you can normally just re-map noncontiguous physical pages to contiguous virtual ones, why would you ever need an order other than 0? For example, if something asks for 16 kB, couldn't you just allocate 4 kB four times and use the page mapping to string the pages together in the virtual address space, rather than demanding that they're also physically contiguous?
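For instance, reusing the hypothetical calls from the sketch above:

[code]
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

/* Hypothetical layer calls from the sketch above. */
extern uintptr_t pmm_alloc_pages(unsigned order);
extern void vmm_map_page(uintptr_t virt, uintptr_t phys);

/* 16 kB that is contiguous only virtually: four separate order-0
 * (single-page) physical allocations, mapped back-to-back at
 * consecutive virtual addresses. */
void *alloc_16k(uintptr_t virt_base)
{
    for (size_t i = 0; i < 4; i++) {
        uintptr_t phys = pmm_alloc_pages(0);          /* one 4 kB page */
        vmm_map_page(virt_base + i * PAGE_SIZE, phys);
    }
    return (void *)virt_base;
}
[/code]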
Re: Questions on memory management / buddy allocation
1. Nothin' is wrong with your layers.
2. It reduces external fragmentation.
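To illustrate: with one free list per order, an order-n request can split a bigger free block instead of hunting for 2^n adjacent free pages. A minimal sketch of the split side (hypothetical, not production code):

[code]
#include <stdint.h>

#define PAGE_SIZE  4096u
#define MAX_ORDER  10                      /* biggest block: 2^10 pages */
#define BLOCK_BYTES(o) ((uintptr_t)PAGE_SIZE << (o))

/* One free list per order; the list node lives inside the free block. */
typedef struct block { struct block *next; } block_t;
static block_t *free_list[MAX_ORDER + 1];

/* Allocate a block of 2^order pages, splitting a larger block if no
 * block of exactly that order is free. */
void *buddy_alloc(unsigned order)
{
    unsigned o = order;
    while (o <= MAX_ORDER && free_list[o] == NULL)
        o++;                               /* smallest big-enough block */
    if (o > MAX_ORDER)
        return NULL;                       /* out of memory */

    block_t *b = free_list[o];
    free_list[o] = b->next;

    while (o > order) {                    /* split down to the request */
        o--;
        block_t *upper = (block_t *)((uintptr_t)b + BLOCK_BYTES(o));
        upper->next = free_list[o];        /* free the upper half */
        free_list[o] = upper;
    }
    return b;
}
[/code]

Freeing works in reverse: a block's buddy is found by XOR-ing its address with the block size, and free buddies of equal size merge back up the orders, which is what keeps external fragmentation down.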
OS-LUX V0.0
Working on...
Memory management: the Pool
Re: Questions on memory management / buddy allocation
Ok, fair enough. But what if you ask for a large block and there is no single contiguous block of that size, while there are smaller sizes? Maybe a fallback into mapping several smaller regions?
Also, I believe Linux has both the physically-contiguous kmalloc and the virtually-contiguous vmalloc. Is there a reason for having both, and when would you prefer using one over the other?
Re: Questions on memory management / buddy allocation
Hi,
CodeCat wrote:
Ok, fair enough. But what if you ask for a large block and there is no single contiguous block of that size, while there are smaller sizes? Maybe a fallback into mapping several smaller regions?

The only reason you should need a large contiguous block is for some device drivers (e.g. floppy driver DMA buffer). Everything else (including most PCI devices, using "scatter-gather" bus mastering) only ever needs single pages. For normal processes, paging makes non-contiguous pages look like contiguous pages.
There is one minor issue - caches. In some cases caches work more efficiently if "contiguous" pages are used; but in this case the pages only need to look contiguous to the cache and don't actually need to be contiguous. For pages to look contiguous to the cache only some bits of the page's address need to be contiguous. For example, the addresses 0x00001000, 0x00002000 and 0x00003000 look contiguous, but so might the addresses 0x00001000, 0x12342000 and 0x43213000, because the cache might not care what the highest 16 bits of the address are.
To make things look contiguous to the cache, most good OSs implement "cache coloring"; but to do that properly you need to know how the caches are configured (cache size and associativity are enough, which is information you can get from CPUID on all P6 and newer CPUs). In practice, though, it's enough to simply have 256 cache colors without caring how the caches are configured: using more cache colors than you need costs little, and in the rare cases where 256 colors isn't enough you'll still get almost the same improvement (e.g. instead of a 5% performance boost with the right number of page colors you might get a 4.9% boost with 256 page colors).
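As a sketch of how the bookkeeping might look (per-color free lists and all names here are assumptions; the shift and modulo stand in for real cache geometry):

[code]
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define NUM_COLORS  256         /* the fixed, generous choice from above */

/* A page's color: the physical-address bits just above the page offset,
 * i.e. the bits that index the cache. */
unsigned page_color(uintptr_t phys)
{
    return (unsigned)((phys >> PAGE_SHIFT) % NUM_COLORS);
}

/* Hypothetical per-color free lists: returns a free physical page of
 * the given color, or 0 if none is available. */
extern uintptr_t pop_page_of_color(unsigned color);

/* Give the n-th page of a virtual region the (base + n)-th color, so
 * the region looks contiguous to the cache even though the physical
 * pages themselves are scattered. */
uintptr_t alloc_colored_page(unsigned base_color, size_t n)
{
    return pop_page_of_color((base_color + (unsigned)n) % NUM_COLORS);
}
[/code]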
I should point out that cache coloring makes the pages look contiguous in all cases (and doesn't get stuffed up by things like swapping individual pages to/from disk), while allocating a huge slab of RAM doesn't work in a lot of cases (e.g. when you're increasing the size of a previously allocated area, when you're using allocation on demand, when you've memory mapped a file, etc).
Note: AFAIK cache coloring is used by the Windows NT kernel (and derivatives), Solaris, FreeBSD, NetBSD and other OSs, while Linux is the only OS that uses a buddy allocator for physical memory. AFAIK the Windows NT kernel, Solaris, FreeBSD and NetBSD were designed and then implemented; IMHO Linux was only implemented (and some things are just too hard to change once a large amount of code relies on them)...
CodeCat wrote:
Also, I believe Linux has both the physically-contiguous kmalloc and the virtually-contiguous vmalloc. Is there a reason for having both, and when would you prefer using one over the other?

AFAIK "vmalloc()" maps non-contiguous physical pages into a contiguous area of the linear address space, while "kmalloc()" maps contiguous physical pages into a contiguous area of the linear address space.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Questions on memory management / buddy allocation
Thanks for the explanation. I hadn't heard of cache colouring yet, so I'll give it a read.
Brendan wrote:
AFAIK "vmalloc()" maps non-contiguous physical pages into a contiguous area of the linear address space, while "kmalloc()" maps contiguous physical pages into a contiguous area of the linear address space.

That's what I said, but what I want to know is when you'd need to use one or the other. Most kernel stuff in Linux seems to use kmalloc, while vmalloc seems to be able to do the job fine in most cases too. So is it just a matter of people preferring kmalloc (with no benefit), or is there another reason they absolutely want contiguous physical pages?
Re: Questions on memory management / buddy allocation
Hi,
CodeCat wrote:
Thanks for the explanation. I hadn't heard of cache colouring yet, so I'll give it a read.
That's what I said, but what I want to know is when you'd need to use one or the other. Most kernel stuff in Linux seems to use kmalloc, while vmalloc seems to be able to do the job fine in most cases too. So is it just a matter of people preferring kmalloc (with no benefit), or is there another reason they absolutely want contiguous physical pages?

IIRC, originally Linux was designed to map most of the physical address space into kernel space. In this case, it'd probably be faster for "kmalloc()" to return a pointer to an area within this mapping than to allow fragmented physical pages to be used (where you'd need to find somewhere in the kernel's linear address space to map the non-contiguous physical pages, and do TLB invalidation, etc). This would improve performance (especially on 80386 where there was no INVLPG instruction, and multi-CPU machines where you need to send an IPI to every CPU to do "TLB shootdown").
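Roughly, the fast path being described (the fixed-offset mapping and all names are hypothetical, 32-bit style):

[code]
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: all physical RAM is permanently mapped at a fixed
 * offset in kernel space. */
#define KERNEL_DIRECT_MAP_BASE 0xC0000000u

/* Physical-to-virtual is one addition - no page-table edits, no
 * INVLPG, no TLB-shootdown IPIs. */
static inline void *phys_to_virt(uintptr_t phys)
{
    return (void *)(phys + KERNEL_DIRECT_MAP_BASE);
}

/* Hypothetical buddy call returning a physical address (0 on failure). */
extern uintptr_t pmm_alloc_pages(unsigned order);

/* kmalloc-style fast path: take physically contiguous pages and return
 * their address inside the permanent mapping. */
void *kmalloc_pages(unsigned order)
{
    uintptr_t phys = pmm_alloc_pages(order);
    return phys ? phys_to_virt(phys) : NULL;
}
[/code]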
The problem with this would be physical memory fragmentation - if you want to allocate a large area then "kmalloc()" would probably puke, especially if the OS has been running for a while (physical memory would probably become fragmented to oblivion after running any decent workload for about 15 minutes).
Of course IMHO there are other problems with the "map the world into kernel space" approach - the physical address space can be larger than usable kernel space (e.g. a 32-bit computer with 1 GiB of kernel space and 8 GiB of RAM), it can suck from a fault tolerance/redundancy point of view, it can prevent some NUMA optimizations, and you couldn't send kernel data to swap space or use any other virtual memory tricks.
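In Linux terms the trade-off looks roughly like this; kmalloc()/vmalloc() are the real kernel APIs, but the guidance in the comments is only the rule of thumb under discussion, and the snippet assumes a kernel-module context:

[code]
#include <linux/slab.h>      /* kmalloc(), kfree() */
#include <linux/vmalloc.h>   /* vmalloc(), vfree() */

static int allocation_example(void)
{
    /* Physically contiguous: what a device doing DMA needs, and the
     * cheap fast path for small allocations - but vulnerable to
     * physical fragmentation at large sizes. */
    void *dma_buf = kmalloc(4096, GFP_KERNEL);

    /* Only virtually contiguous: fine for large, CPU-only buffers,
     * and immune to physical fragmentation (at the cost of mapping
     * work and TLB pressure). */
    void *big_table = vmalloc(4UL * 1024 * 1024);

    if (!dma_buf || !big_table)
        goto out;
    /* ... use the buffers ... */
out:
    kfree(dma_buf);          /* both tolerate NULL */
    vfree(big_table);
    return 0;
}
[/code]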
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.