
Kernel Memory allocation - Conceptual Issues

Posted: Wed Oct 04, 2006 11:59 pm
by elaverick
I'm having some conceptual problems with memory management in the kernel.
I've been looking through a number of people's sources to see how they've tackled the problem, and although there is obviously no one right way to implement things, I have noticed that all the sources I have seen so far have a separate paging system for user-land programs and a heap allocator for kernel space. Why is this?
It would seem obvious to me that if you have paging set up (and for higher-half kernels this is of course a necessity) you might as well use it both for the kernel and for user programs. However, as it's clear that this is not "the done thing", there must be something I'm missing.

So, why do we keep the kernel heap and the page tables separate? Why not simply request a page or two for the kernel and leave them around for it to play with (presumably they would keep the same memory protection as is available to user-land programs)?

Assuming that paging is not the way to go with this, I'm thinking of going with either a doubly linked list or possibly a binary tree (with memory chunks sorted by size). Is there a reason to favour one method over the other?

The problem with the openness of design choices for kernel dev seems to be that almost everything comes down to personal choice. It might be nice if the Wiki could include tables with pros & cons comparisons of a few methods. Just a thought.

Re:Kernel Memory allocation - Conceptual Issues

Posted: Thu Oct 05, 2006 1:37 am
by octavio
You don't allocate 4 KB if you just need 12 bytes; that's the purpose of malloc, which uses the heap. Page allocation is related to virtual memory management: a program can have a memory space of 4 GB used for the heap but only 40 KB of physical RAM in use. When the program tries to use more, the system allocates more physical RAM using the paging mechanism.
For a heap, a linked list is slow, and so is a tree if it isn't balanced. I use various lists sorted by block size, but there are a lot of other algorithms; google for malloc.
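A minimal sketch of the "various lists sorted by block size" idea (segregated free lists) might look like the following. All names and sizes here are hypothetical, and a real allocator would also need block splitting, coalescing, and handling for requests bigger than the largest class:

```c
#include <stddef.h>

/* Segregated free lists: one free list per power-of-two size class.
 * A static arena stands in for pages obtained from the virtual memory
 * manager; everything here is a simplified sketch. */

#define NUM_CLASSES 7              /* classes: 16, 32, 64, ... 1024 bytes */
#define ARENA_SIZE  (64 * 1024)

typedef struct FreeBlock { struct FreeBlock *next; } FreeBlock;

static FreeBlock *freeList[NUM_CLASSES];
static unsigned char arena[ARENA_SIZE];
static size_t arenaUsed;

static size_t classSize(int c) { return (size_t)16 << c; }

/* Smallest class whose blocks can hold 'size' bytes, or -1 if too big. */
static int classFor(size_t size) {
    for (int c = 0; c < NUM_CLASSES; c++)
        if (size <= classSize(c)) return c;
    return -1;
}

void *smalloc(size_t size) {
    int c = classFor(size);
    if (c < 0) return NULL;
    if (freeList[c]) {                     /* reuse a freed block */
        FreeBlock *b = freeList[c];
        freeList[c] = b->next;
        return b;
    }
    if (arenaUsed + classSize(c) > ARENA_SIZE) return NULL;
    void *b = &arena[arenaUsed];           /* carve a fresh block */
    arenaUsed += classSize(c);
    return b;
}

void sfree(void *p, size_t size) {         /* caller passes the alloc size */
    int c = classFor(size);
    FreeBlock *b = (FreeBlock *)p;
    b->next = freeList[c];
    freeList[c] = b;
}
```

Allocation and freeing are both O(1) here, which is the main attraction over a single sorted list.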

Re:Kernel Memory allocation - Conceptual Issues

Posted: Thu Oct 05, 2006 2:01 am
by elaverick
Ok, but I understood that paging is generally set up to take the entire 4GB of potential space. If the heap allocator doesn't touch the paging tables isn't there a risk that the paging algorithm will attempt to allocate physical memory that is in use by the heap?
I was thinking that you might allocate two or three pages and then sub-divide them with the kernel heap, but that doesn't seem to be the approach anyone else takes to this.

Re:Kernel Memory allocation - Conceptual Issues

Posted: Thu Oct 05, 2006 9:24 am
by Colonel Kernel
I think you might be confused about what others are really doing...
elaverick wrote:Ok, but I understood that paging is generally set up to take the entire 4GB of potential space. If the heap allocator doesn't touch the paging tables isn't there a risk that the paging algorithm will attempt to allocate physical memory that is in use by the heap?
No, because the kernel heap allocator should not allocate physical memory -- that's the job of the physical memory manager.
I was thinking that you might allocate two or three pages and then sub-divide them with the kernel heap, but that doesn't seem to be the approach anyone else takes to this.
It sounds to me like what most kernels do, except that I think they use more than two or three pages.

This is the way it works, at least in my design: Kernel heap manager and user-space memory-related system calls both use the virtual memory manager to allocate and free pages. The virtual memory manager uses the physical memory manager to allocate physical memory for those pages, and also for the page tables themselves. In my case, the kernel heap manager always gets pages mapped to physical memory, while user-space allocations might not.

Other microkernels I know of (Minix, BCOS, and possibly L4) don't have a kernel heap manager at all. Instead they have arrays of structures for each kernel resource (thread control blocks, IPC messages, etc.). I'm not sure about Minix and L4, but I believe BCOS uses sparse arrays by only allocating physical memory for those pages of the arrays that are actually in use (Brendan, please correct me if I got this wrong...). If your microkernel is simple enough, this technique might work for you.

Re:Kernel Memory allocation - Conceptual Issues

Posted: Thu Oct 05, 2006 2:46 pm
by elaverick
Ok, so am I right in thinking the heap allocator grabs a bunch of pages and then subdivides them, and the kernel can then allocate these chunks off the heap for its own internal uses?

Secondly, the paging system is just a mapping device, correct? Not actually a physical allocator (bearing in mind that it can also represent pages that exist virtually on hard drives, etc.)? If that's right, then I think I need to see a physical allocator to get an idea of how that works.

Could anyone point me in the right direction for those (assuming I've not got this completely round my neck just yet :) )

Re:Kernel Memory allocation - Conceptual Issues

Posted: Fri Oct 06, 2006 2:01 am
by Brendan
Hi,
elaverick wrote:Could anyone point me in the right direction for those (assuming I've not got this completely round my neck just yet :) )
In general there are three completely different "levels".

At the lowest level is the "physical memory manager", which keeps track of free physical pages of RAM and nothing else. It only ever works on multiples of 4 KB (it can give you 4 KB of RAM or 8 KB of RAM, but not 100 bytes), and doesn't really care what the RAM is used for. Often this is built into the kernel. For a very simple system the physical memory manager might have 2 functions that are never used directly (e.g. by applications, etc):
  • void *allocPhysPage();
    void freePhysPage(void *page);
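A common way to implement those two functions is a stack of free page frames. The sketch below is hypothetical (a real physical memory manager would build its stack from the firmware memory map); a small static pool stands in for physical RAM:

```c
#include <stddef.h>
#include <stdint.h>

/* Physical memory manager as a stack of free 4 KB page frames.
 * 'fakeRam' stands in for real physical memory for illustration. */

#define PAGE_SIZE 4096
#define NUM_PAGES 16

static uint8_t fakeRam[NUM_PAGES][PAGE_SIZE];
static void *freeStack[NUM_PAGES];
static int   freeTop;                 /* number of free pages on the stack */

void pmmInit(void) {
    freeTop = 0;
    for (int i = 0; i < NUM_PAGES; i++)   /* push every frame as free */
        freeStack[freeTop++] = fakeRam[i];
}

void *allocPhysPage(void) {
    if (freeTop == 0) return NULL;        /* out of physical memory */
    return freeStack[--freeTop];
}

void freePhysPage(void *page) {
    freeStack[freeTop++] = page;
}
```

Both operations are O(1), and the stack itself only costs one pointer per free page.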
On top of this there's a "linear memory manager" or "virtual memory manager", which uses the physical memory manager to allocate and free physical pages of RAM. It uses these pages to create and maintain page directories, page tables, etc, and also handles things like swap space, memory mapped files and other "tricks" (shared memory, allocation on demand, etc). It also only ever works on multiples of 4 KB. Often this is built into the kernel. For a very simple system the linear/virtual memory manager might have 2 functions that applications, etc can use:
  • int allocPage(void *address);
    void freePage(void *address);
In addition, it'd provide some functions intended for the rest of the kernel to use:
  • int createAddressSpace(void);
    int destroyAddressSpace(void *addressSpace);
Lastly, there's something to manage which parts of an address space are being used for what. This is the "heap manager" (and might include obstacks, garbage collection, etc.). The heap manager allows you to ask for any number of bytes of space and free it later. Often this isn't built into the kernel, but is implemented in a high-level language library (e.g. malloc(), free(), new, delete, etc.). This allows different applications to have completely different heap management code and reduces the number of times the application needs to call the kernel.

For "simple" UNIX-style memory management, you'd have an application (with code, data, bss) and a "top of address space" variable. When the application calls the malloc() or new library function, the library will check to see if there's any unused space below the "top of address space" variable; if there isn't enough space, it'll increase the "top of address space" variable and ask the kernel's linear/virtual memory manager to map more pages of RAM between the old top of address space and the new top of address space.
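That scheme can be sketched roughly as below. Here 'moreCore' plays the role of the system call that maps more pages at the top of the address space (sbrk() on UNIX), but hands out pieces of a static arena instead; all names are hypothetical, and a real malloc() would also track and reuse freed blocks:

```c
#include <stddef.h>

/* "Top of address space" allocation: the library bumps a pointer and asks
 * the kernel for more mapped pages only when it runs past the current top. */

#define HEAP_LIMIT (32 * 1024)
#define PAGE_SIZE  4096

static unsigned char heap[HEAP_LIMIT];  /* pretend address space */
static size_t topOfAddressSpace;        /* how much is currently mapped */
static size_t heapUsed;                 /* how much is handed out */

/* Map enough whole pages at the top to cover 'bytes' more of heap. */
static int moreCore(size_t bytes) {
    size_t pages = (bytes + PAGE_SIZE - 1) / PAGE_SIZE;
    if (topOfAddressSpace + pages * PAGE_SIZE > HEAP_LIMIT) return -1;
    topOfAddressSpace += pages * PAGE_SIZE;
    return 0;
}

void *simpleMalloc(size_t size) {
    size = (size + 15) & ~(size_t)15;   /* keep allocations 16-byte aligned */
    if (heapUsed + size > topOfAddressSpace &&
        moreCore(heapUsed + size - topOfAddressSpace) != 0)
        return NULL;                    /* kernel refused to map more pages */
    void *p = &heap[heapUsed];
    heapUsed += size;
    return p;
}
```

Note that the kernel is only involved once per page (or group of pages), not once per malloc() call, which is exactly the point of doing heap management in the library.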

Now imagine you're writing a simple application to count the number of times the word "the" occurs in a file. In this case you could use "malloc()" to dynamically allocate a file buffer, or you could statically allocate this space (e.g. "char fileBuffer[BUFFER_SIZE]"). If all of the space you need to use is statically allocated (e.g. in the ".bss" section) then you'd never need functions like "malloc()", "free()", etc. and you could write your application without any heap management at all.

Next, imagine a larger application with no heap management - all of the space it could want is statically allocated in the ".bss" section. This should work, but it's a waste of good RAM (especially if parts of the .bss aren't actually used sometimes). To fix this, your linear/virtual memory manager could use "allocation on demand". For example, the entire ".bss" section could be marked as "not present" in the page tables. When the application tries to access a page for the first time it'd generate a page fault. The page fault handler could allocate a page and insert it at the address that was accessed, and then return to the application. In this case the application would think its entire ".bss" section has RAM when it doesn't.
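The fault-handler logic can be simulated in user space. In the hypothetical sketch below, 'touchPage' plays the role of the page fault handler: on the first access to a page it "maps" a frame (in a real kernel this is where allocPhysPage() and a page-table update would go), and repeated accesses to the same page cost nothing extra:

```c
#include <stddef.h>
#include <stdint.h>

/* Simulation of allocation on demand for a ".bss" section. */

#define PAGE_SIZE 4096
#define BSS_PAGES 8

static int pagePresent[BSS_PAGES];   /* stand-in for page-table present bits */
static int framesAllocated;          /* physical frames actually handed out */

/* "Page fault handler": map a frame the first time a page is touched. */
static void touchPage(uintptr_t addr) {
    size_t page = addr / PAGE_SIZE;
    if (page < BSS_PAGES && !pagePresent[page]) {
        pagePresent[page] = 1;       /* real code: alloc a frame + map it */
        framesAllocated++;
    }
}

/* Simulated write to an address inside the .bss. */
void bssWrite(uintptr_t addr) {
    touchPage(addr);
}

int framesInUse(void) { return framesAllocated; }
```

The application "owns" all BSS_PAGES pages of address space, but physical RAM is only consumed for the pages it actually touches.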

The problem with this is that the application might use some RAM once (e.g. a 1 MB buffer) and never use it again. With statically allocated space there isn't a good way to free this RAM after it's been used.

[continued]

Re:Kernel Memory allocation - Conceptual Issues

Posted: Fri Oct 06, 2006 2:12 am
by Brendan
[continued]

Now let's look at a micro-kernel. Just like an application it needs something to manage the memory it uses. The kernel would still rely on the physical memory manager and the linear/virtual memory manager. The kernel might also use some sort of "heap" management to dynamically allocate space, but it could statically allocate space too (just like the application using its ".bss"). In this case you have the same problems as the application (wasting RAM). Allocation on demand fixes half of this, so the only real concern is freeing RAM when it's not needed anymore. For an application this can be painful, but a micro-kernel is a little different.

If you look at the type of data a micro-kernel needs to store, you'll notice that there really isn't much that ever needs to be freed, and often you'll find that even when something isn't being used it's better to keep the RAM in the kernel's space ready for the next time it's needed (to save the overhead of re-allocating later). For anything that's left over the kernel can free the pages manually. The only real problem here is that things need to be aligned on 4 KB boundaries so that things can be freed without messing up other things. This isn't much of a problem - most things take up one page anyway or can be done in groups (2 things per page, 4 things per page, etc.).

As an example, imagine some kernel code that keeps track of some details for each task, stored in a "Thread Data Structure" (or TDS for short), and assume that each thread has a 4 KB kernel stack. With heap management you might do something like:

Code: Select all

void *allocTDS(void) {
    void *stack;
    TDSstruct *TDS;

    stack = kalloc( 4096 );
    if( stack == NULL ) return NULL;
    TDS = kalloc( sizeof( TDSstruct ) );
    if( TDS == NULL ) {
        kfree( stack );
        return NULL;
    }
    TDS->kernelStack = stack;
    return TDS;
}

void freeTDS(TDSstruct *TDS) {
    kfree( TDS->kernelStack );
    kfree( TDS );
}
With static allocation, you might make sure that each TDS is 4 KB in size, and each thread might have a unique "thread ID". In this case you'd use the thread ID as an index into statically allocated arrays. For example:

Code: Select all

TDSstruct TDStable[MAX_THREADS];
unsigned char KernelStackTable[MAX_THREADS][4096];
In this case, the address of a thread's TDS is "&TDStable[ threadID ]" and the address it uses for its kernel stack is "KernelStackTable[ threadID ]". It's much faster than dynamic allocation (i.e. no searching for a free area to use), and if you want you could still free the RAM with something like:

Code: Select all

   freePage(&TDStable[threadID]);
   freePage(KernelStackTable[threadID]);
In general, using statically allocated space in a micro-kernel gives the kernel more control over when RAM is allocated and freed, is faster, and can be simpler to implement. For monolithic kernels things get more complicated, because you have to worry about which device drivers are being used and what they need. Because of this you can't really statically allocate space for everything in a monolithic kernel, and need to provide heap management (although you could probably still use a mixture of statically and dynamically allocated space).

Hope this made sense...


Cheers,

Brendan

Re:Kernel Memory allocation - Conceptual Issues

Posted: Fri Oct 06, 2006 1:20 pm
by elaverick
Thanks for the explanation, it's one of the clearest things I've read on the topic in the last few days.
Just so I'm sure I've got this right... what you're suggesting is that, due to the permanence of the data structures you find in a micro-kernel (and because a micro-kernel is so centred on generic jobs and doesn't stray into the realms of device drivers), you are often better off using static allocation, as it will be faster because you don't have to go searching for the areas of memory you need at run time.

Also, if I've got this right, you're saying that if you keep your data structures at 4 KB they will naturally fill a page of memory, and because we can ascertain the address at which that page starts from the variable, you can still free this page in the usual manner for the paging system to re-use at run time if you find that it is needed/desirable.

Does this mean then that if you have statically allocated data structures that are smaller than the 4 KB alignment (say a single char, for example) they will be expanded to take the full 4 KB of a page, essentially wasting 4095 bytes of memory? Or is the compiler clever enough to realise this and only 4 KB-align variables above a certain size threshold?

Re:Kernel Memory allocation - Conceptual Issues

Posted: Sat Oct 07, 2006 1:57 am
by Brendan
Hi,
elaverick wrote:Just so I'm sure I've got this right... what you're suggesting is that, due to the permanence of the data structures you find in a micro-kernel (and because a micro-kernel is so centred on generic jobs and doesn't stray into the realms of device drivers), you are often better off using static allocation, as it will be faster because you don't have to go searching for the areas of memory you need at run time.
For me and my OS design, the permanence of the data structures, the additional control the kernel gets over when allocation and de-allocation of pages occurs, and the performance improvements it can give for allocations make static allocation the best method for me. Of course this doesn't necessarily apply to other people and/or other OS designs.... ;)
elaverick wrote:Does this mean then that if you have statically allocated data structures that are smaller than the 4 KB alignment (say a single char, for example) they will be expanded to take the full 4 KB of a page, essentially wasting 4095 bytes of memory? Or is the compiler clever enough to realise this and only 4 KB-align variables above a certain size threshold?
The compiler itself doesn't know anything about paging or memory management. All it does is put things where the programmer tells it to put things. Sometimes the compiler puts things on the stack (function arguments and most local variables), sometimes it puts things in the ".data" section ("char foo = 'A';"), sometimes it puts things in the ".bss" section ("char foo;") and sometimes it puts things at an address it knows nothing about ("*((char *)0xB8000) = 'A'" or "*pointer = 'A'").

For a single byte, the compiler will store it where-ever you tell it to store it. You could tell the compiler to use an entire 4 KB page, but you'd hopefully design the kernel so that this isn't necessary.

It's probably a little hard to understand how statically allocated space can work without having a good understanding of the types of data a micro-kernel needs to store and how this data can be arranged to suit statically allocated space. As an example, here's how I typically do a micro-kernel's memory management:

[tt]Size Key Description
4 KB !,A Kernel API table, 4 bytes for up to 1024 kernel API functions
4 KB !,k System information block, misc. data structure containing global data
4 KB !,P Kernel panic data page
4 KB !,p Page allocation map, 1 byte for each page below 16 MB
4 KB !,p Free page stacks, 8 bytes per free page stack for memory between 16 MB and 4 GB
8 KB !,p Free page stacks, 16 bytes per free page stack for memory above 4 GB
1024 KB @,k CPU data table, 4 KB for up to 256 CPUs
256 KB #,i I/O port owner table, 4 bytes per I/O port
128 KB #,I IRQ receiver table, a list of thread IDs that receive each IRQ (up to 256 IRQs)
256 MB #,s Process data table, one page for each process' data (up to 65536 processes)
512 MB #,s Thread data table, one page for each thread's data (up to 131072 threads)
512 KB #,s Thread state table, one 4 byte entry for each thread's state (up to 131072 threads)
512 MB #,L Page directory table, one page directory for each thread (up to 131072 threads)
64 KB #,l Kernel log (one big ASCIIZ string)

Index
! All RAM in the area is allocated during boot and never freed
@ Only needed RAM in the area is allocated during boot and never freed
# RAM is allocated on demand and unused pages may be freed at any time
A Area is used by the kernel API
k Area is used by the kernel for a variety of things
p Area is used by the physical memory manager
i Area is used by the I/O port security code
I Area is used by the IRQ handling code
s Area is used by the scheduler
L Area is used by the linear memory manager
l Area is used by the kernel log
P Area is used by the kernel's critical error handler[/tt]

[continued]

Re:Kernel Memory allocation - Conceptual Issues

Posted: Sat Oct 07, 2006 2:01 am
by Brendan
[continued]

There are a few things missing from this list. The linear memory manager uses the highest 4 MB of every address space to map paging data structures. Also, for IPC I use message queues, and a large area is used for this.

Each message's data is variable length (between 4 and 1024 bytes long). Each thread has a few variables in its "thread data table" entry which point to the head and tail of its message queue. When a message is added to the queue, the kernel looks at the queue head and decides if the message will fit in an existing page. If the message fits, it stores the message in the existing page and changes the "head" variable. If it doesn't fit, the kernel gets a new page, puts a link to the new page in the previous page, stores the message in the new page and then changes the "head" variable. The memory management for this is specially designed. If a "messaging page" becomes unused, it's added to a stack of messaging pages. If a new messaging page is needed, the kernel checks this stack; only if the stack is empty does it allocate a new page. If the kernel is running out of free RAM, it can free pages that are on the stack of messaging pages. The end result is that it keeps track of unused pages but doesn't free them unless it needs to, which reduces the overhead of allocating and freeing pages for message queue data.
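The page-caching part of that scheme can be sketched as below. This is not Brendan's actual code - malloc()/free() stand in for the real page allocator, and all names are hypothetical - but it shows the "keep unused pages on a stack, only really free them under memory pressure" idea:

```c
#include <stddef.h>
#include <stdlib.h>

/* Cache of unused messaging pages, kept on a stack instead of being
 * returned to the physical memory manager straight away. */

#define PAGE_SIZE  4096
#define CACHE_MAX  32

static void *pageCache[CACHE_MAX];
static int   cacheTop;

void *getMsgPage(void) {
    if (cacheTop > 0)                    /* reuse a cached page */
        return pageCache[--cacheTop];
    return malloc(PAGE_SIZE);            /* otherwise allocate a new one */
}

void putMsgPage(void *page) {
    if (cacheTop < CACHE_MAX)
        pageCache[cacheTop++] = page;    /* keep it for next time */
    else
        free(page);                      /* cache full: really free it */
}

/* Called when the kernel is running short of RAM: drain the cache. */
void reclaimMsgPages(void) {
    while (cacheTop > 0)
        free(pageCache[--cacheTop]);
}

int cachedMsgPages(void) { return cacheTop; }
```

In the common case a message page is recycled without ever touching the physical memory manager, which is where the overhead saving comes from.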

For "allocation on demand", the page fault handler only does allocation on demand in kernel space if the address is within a specific zone. All of the kernel's data that uses allocation on demand is within this zone. This means that if the kernel tries to touch a "not present" page in the message queue area or anywhere else outside of the "allocation on demand" zone, it will cause a critical error (or kernel panic).

I use assembly, and have an include file that defines which part of the kernel's address space is used for what (e.g. "%define SMEMkernelAPITable 0xFEEDB000", "%define SMEMsystemInfoBlock 0xFEEDA000", etc.). In C you could use a linker script instead, or just use the kernel's ".bss" section. For example (GCC):

Code: Select all

void *SMEMkernelAPITable[1024] __attribute__ ((aligned (4096)));
struct SIB SMEMsystemInfoBlock __attribute__ ((aligned (4096)));
It might take some fiddling to get it right though...

I should also admit that my most recent kernel doesn't do this, because the kernel itself is made up of "kernel modules", where each kernel module is a separate binary and is meant to be replaceable (e.g. so that you can use a completely different scheduler with its own completely different data structures without recompiling anything). Because of this I'm no longer using purely statically allocated space for everything. Instead I've got a "top of kernel space" variable, and (during boot) each kernel module decreases this variable to reserve space for its data structures. It's still very similar to statically allocated space though.

Cheers,

Brendan