Virtual memory management questions

Freanan

Virtual memory management questions

Post by Freanan »

I would like to start setting up a virtual memory manager on top of my physical one now. I read Tim Robinson's mm tutorial, part 2, and vaguely understand the outline of how it works, but there are heaps of open questions and confusion in my head.
Below is a "list" of what I believe I understand, with questions attached where they occurred to me.
Chunk allocation:
   Allocate space for a new chunk-list entry
      (How? Maybe have a list/stack of free memory regions? How do I keep this from becoming fragmented when something is freed?)
   Attach it to the end of the chunk list
      (How? Maybe a linked list? Or make it a large fixed-size array? Or a stack approach as in the physical mm?)
   Obtain the virtual offset of the chunk
      (How? Wouldn't I need a stack to keep track of free virtual pages, adding a third mm?)
   Record the virtual offset, size and access rights of the chunk in the chunk-list entry (roughly what the sketch below this list shows)

Page fault occurs:
   Search for the chunk entry corresponding to the faulting address
   Allocate physical pages and map them to the memory that belongs to the chunk entry; mark them present
   That's it?

Chunk freeing:
   Unmap and free the physical memory corresponding to the selected chunk entry
   Delete the chunk entry
      (How do I prevent fragmentation and endless growth of the list here - if someone frees an entry from the middle of the list, how do I efficiently tell the system that this list entry can be re-used, and how do I efficiently re-use it?)
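To make the above concrete, this is roughly the kind of chunk-list entry and page-fault lookup I have in mind - just a sketch, and all the names are my own invention:

Code: Select all

#include <stddef.h>
#include <stdint.h>

/* Sketch of a chunk-list entry; field names are made up. */
struct chunk {
    uint32_t virt_base;   /* virtual offset of the chunk               */
    uint32_t size;        /* size in bytes, page-aligned               */
    uint32_t flags;       /* access rights (writable, user, ...)       */
    struct chunk *next;   /* simple singly linked chunk list           */
};

/* On a page fault: find the chunk that contains the faulting address. */
struct chunk *find_chunk(struct chunk *list, uint32_t fault_addr)
{
    for (struct chunk *c = list; c != NULL; c = c->next) {
        if (fault_addr >= c->virt_base && fault_addr - c->virt_base < c->size)
            return c;   /* allocate and map physical page(s) here        */
    }
    return NULL;        /* not inside any chunk: a real protection fault */
}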
Please help me to understand this or point me to more detailed tutorials.
I tried to read something about allocation and deallocation, but it was rotated by 90 degrees, so I refrained from reading it...
Pype.Clicker

Re:Virtual memory management questions

Post by Pype.Clicker »

Linux uses a binary tree to manage virtual regions, and as far as I'm concerned, I have found nothing more efficient than that -- except a _double_ binary tree (one using sizes as keys, and the other one using addresses as keys).
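Roughly, and only as a sketch (this is not actual Linux code, and the names are invented), each region sits in both trees at once:

Code: Select all

#include <stddef.h>
#include <stdint.h>

/* A virtual region that is a node of two binary trees at once:
   one keyed by base address (for lookups on fault/free) and one
   keyed by size (for "find a free hole of at least N bytes").   */
struct vregion {
    uintptr_t base;                            /* start of the region */
    size_t    size;                            /* length in bytes     */

    struct vregion *addr_left, *addr_right;    /* tree keyed by base  */
    struct vregion *size_left, *size_right;    /* tree keyed by size  */
};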

Oh, and about the "90 degrees" stuff: just open it with a tool like "ghostview" - it can show you content in both landscape and portrait orientation ... or just print it :)
Solar

Re:Virtual memory management questions

Post by Solar »

Or do violence to your monitor. ;)
Every good solution is obvious once you've found it.
Freanan

Re:Virtual memory management questions

Post by Freanan »

I just read about a method that uses linked lists to keep track of which memory areas are free...
Each free memory area starts with two dwords or so that indicate its size and the starting address of the next free area.
I assume the binary tree version is somehow akin to that?
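If I got that right, the header at the start of each free area would look something like this (just a sketch with my own names):

Code: Select all

#include <stdint.h>

/* Sketch: a header stored in the first bytes of every free area.
   The "two dwords": the size of this area and the start address
   of the next free area.                                         */
struct free_area {
    uint32_t          size;   /* size of this free area in bytes */
    struct free_area *next;   /* start address of next free area */
};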

But now I am even more confused... What do I need a heap for then, if my allocated memory regions are actually located at some given virtual address somewhere inside the address space, which does not have to be part of the memory that is assigned as the heap?

But I understand some things better now (though my new understanding is quite different from what I read in the first tutorial): I will have one linked list (or tree, or whatever) per address space to show what is free, plus another kind of list to tell me what is allocated, so that I can map physical pages to allocated virtual areas when a page fault occurs inside them?
Solar

Re:Virtual memory management questions

Post by Solar »

Perhaps it gets clearer when you remember that (in Linux at least) [tt]malloc()[/tt] and [tt]free()[/tt] are user-space functions that operate on a memory pool that has already been requested from the kernel with [tt]sbrk()[/tt].

If you request more memory from malloc() than could be satisfied from that pool, malloc() calls the kernel to assign more memory pages to the address space.

If you release significant amounts of memory, free() calls the kernel to give back some pages to keep the (assigned but unused) pool small.

The idea, of course, is that you can ask for malloc( 4 ) without initiating a system call...
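A very stripped-down sketch of that layering (this is not real libc code - just a dumb bump allocator over sbrk(), with no free list and only basic error handling):

Code: Select all

#include <stddef.h>
#include <unistd.h>     /* sbrk() */

static char *pool_cur = NULL;   /* next free byte in the pool */
static char *pool_end = NULL;   /* end of the assigned pool   */

void *toy_malloc(size_t size)
{
    size = (size + 15) & ~(size_t)15;           /* keep blocks aligned    */
    if (pool_cur == NULL || pool_cur + size > pool_end) {
        size_t grow = (size > 65536) ? size : 65536;
        void *p = sbrk(grow);                   /* system call: grow pool */
        if (p == (void *)-1)
            return NULL;
        if (pool_cur == NULL || (char *)p != pool_end)
            pool_cur = p;                       /* pool not contiguous:   */
        pool_end = (char *)p + grow;            /* start fresh at p       */
    }
    void *block = pool_cur;                     /* toy_malloc(4) ends     */
    pool_cur += size;                           /* here: no system call   */
    return block;
}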
Every good solution is obvious once you've found it.
Freanan

Re:Virtual memory management questions

Post by Freanan »

Okay, but I might need a kernel version of the malloc function later on anyway, and I will need one when I want to keep track of the allocated large chunks from my chunk allocator.
As far as I have implemented the chunk allocator, it uses a linked list for storing the free chunks, but it does not keep track of allocated chunks, as Tim Robinson's tutorial suggests.
This might work out if my page fault handler only commits the single physical page in which the address causing the fault lies.
But if I want to commit the whole large chunk to which the address belongs, I need to keep track of the chunks and save them in an allocation list with entries of {offset; size; flags}, and then I will need kalloc to allocate space for new entries in that list - or did I get this wrong?

I also wonder what you actually mean by enlarging a process's address space...
Doesn't each address space span all 4 GB (or 2 GB, if the upper 2 GB are kernel space) of virtual memory anyway - only that some of the pages are marked as non-present (which will be handled by the page fault handler whenever the process tries to access those pages)?
I thought the chunk allocator was not used for enlarging an address space, but for keeping track of which parts of the 4 or 2 GB address space are free for use...
(Edit:) But this task could actually be performed by the user tasks themselves too - just like alloc and free.
So why do we bother to keep any virtual memory management in the kernel at all (except the page fault handler, which has to make sure that there actually is physical memory attached to the needed parts of the 2 GB of virtual memory)?
Freanan

Re:Virtual memory management questions

Post by Freanan »

I have completed the part of virtual memory management that I talked about above, letting the page fault handler just commit one page per fault and eliminating the need for an allocation list (though I don't know yet if this is a good decision).

Now I have thought more about the things inside the "Edit" of my last post, and it seems to me like a good approach.
But I would like to have some criticism/advice/judgement on it:
Is it feasible to let each process access all of its non-kernel 2 GB of virtual memory from the beginning? The process could then do all memory management, except the handling of page faults and, of course, the initial creation of its address space, by itself, without any sbrk() or morecore() system calls. This would of course happen inside stdlib functions like alloc(), so that the normal application programmer would not notice any of it.
AR

Re:Virtual memory management questions

Post by AR »

The process owns the whole 2 GB allocated to it; allowing allocate-on-demand instead of sbrk() would probably allow for subtle bugs (pointer manipulation that goes horribly wrong but doesn't crash, because wherever it ended up is demand-allocated).

The sbrk/morecore stuff isn't that complicated; all it does is allocate/deallocate more pages (which the process cannot do itself - if it could, then it could map in other programs' address spaces or gain access to kernel data).

In my opinion, demand allocation wouldn't really be any faster because:
Page fault > Kernel context switch > read fault address > map page > return
sbrk > Kernel context switch > map requested pages > return
The sbrk is more efficient as the page fault only informs the kernel of one page whereas sbrk can allocate a lot more than that.
Brendan

Re:Virtual memory management questions

Post by Brendan »

Hi,
AR wrote:In my opinion, demand allocation wouldn't really be any faster because:
Page fault > Kernel context switch > read fault address > map page > return
sbrk > Kernel context switch > map requested pages > return
The sbrk is more efficient as the page fault only informs the kernel of one page whereas sbrk can allocate a lot more than that.
I think the main idea behind demand allocation is to prevent the allocation of RAM until it's necessary (rather than being fast), and to reduce the hassle of explicit allocation.

Consider a ".bss section", where the program might have several large buffers that may or may not be used. For e.g.:

Code: Select all

char first_buffer[256*1024];
short second_buffer[256*1024];
void *third_buffer[256*1024];
Would you allocate 1792 KB of RAM when the program is started, hoping that these buffers are used (and that the RAM won't be wasted)? I guess you could expect the programmer to use "malloc()" instead, but then (if the buffer is re-used) you might end up with something like:

Code: Select all

   if(address_of_first_buffer == NULL) {
      address_of_first_buffer = malloc(SIZE_OF_FIRST_BUFFER);
      if(address_of_first_buffer == NULL) {
         printf("ERROR: Failed to allocate...\n");
         exit(EXIT_FAILURE);
      }
   }
   store_something_in_first_buffer();
Instead of just:

Code: Select all

   store_something_in_first_buffer();
The other problem is that the buffers may be partially used. Imagine something like:

Code: Select all

   while( (c = getc(file)) != EOF) {
      first_buffer[c] = 1;
      second_buffer[c]++;
      third_buffer[c] = address_of_file_name;
   }
For small files or ASCII data, each buffer would be used a little bit, but there'd be a huge amount of RAM wasted.

Then there are programmers like me, who consider "malloc" to be optional. For the prototype 80x86 emulator I was playing with, I allocated memory with code like this:

Code: Select all

            absolute 0x40000000
VMCPUCACHE_data:            ;Cache line data (max 1 GB)
            absolute 0x80000000
VMCPUTRANS_firstBlock:            ;Translated instruction block data (max 512 MB)
            absolute 0xA0000000
VMCPU_messageBufferAddress:   resd 1
The maximum amount of RAM used for each buffer depended on a "configuration message" sent from a different thread, and the amount of RAM actually used depended on the code being emulated.

I guess what I'm saying is that the application programmer should be able to decide to use demand allocation or not, or to use malloc or not, and perhaps be able to specify which areas of the address space are used for either and be able to combine both or use neither.

IMHO it'd also be nice if the program's header could be used to determine if the ".text" and/or ".data" sections should be memory mapped or not, and if the ".bss section" should be allocated on demand or pre-allocated.

Of course it would be traditional for all of these options to be hidden from HLL programmers by the required startup library, so that people like me can laugh at them. ;)


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
JoeKayzA

Re:Virtual memory management questions

Post by JoeKayzA »

Hi!

You could incorporate a system call through which a userspace app can query and set the 'allocation status' of any page in its address space (by passing the page base address). Everything that has to do with heap management should then go into userspace. So when a page does not belong to your heap at all, you set its status to 'unmapped', and the kernel will raise a critical page fault when you try to access it. When the page belongs to an allocated heap region, you set its status to 'allocate on demand', and the kernel will allocate it on a page fault.

The advantage over the 'sbrk()' method is that the allocated heap space does not need to be contiguous: when there is a _big_ hole in the middle (> PAGE_SIZE), you can easily free it, and thus improve fault detection, because you can't accidentally access it anymore without raising a fault ;).
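As an interface sketch (all names invented, nothing standard):

Code: Select all

/* Sketch of such a per-page interface - names are invented. */
enum page_status {
    PAGE_UNMAPPED,       /* any access is a critical fault     */
    PAGE_ON_DEMAND,      /* kernel allocates a frame on fault  */
    PAGE_COMMITTED       /* a physical frame is already mapped */
};

/* Hypothetical system calls, acting on the page containing page_base. */
int              set_page_status(void *page_base, enum page_status status);
enum page_status get_page_status(void *page_base);
The user-space heap would then mark the pages it hands out as 'allocate on demand' and mark them 'unmapped' again when it frees a big hole.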

cheers Joe
Brendan

Re:Virtual memory management questions

Post by Brendan »

Hi,
JoeKayzA wrote: You could incorporate a system call through which a userspace app can query and set the 'allocation status' of any page in its address space (by passing the page base address). Everything that has to do with heap management should then go into userspace. So when a page does not belong to your heap at all, you set its status to 'unmapped', and the kernel will raise a critical page fault when you try to access it. When the page belongs to an allocated heap region, you set its status to 'allocate on demand', and the kernel will allocate it on a page fault.

The advantage over the 'sbrk()' method is that the allocated heap space does not need to be contiguous: when there is a _big_ hole in the middle (> PAGE_SIZE), you can easily free it, and thus improve fault detection, because you can't accidentally access it anymore without raising a fault ;).
This is similar to what my OS does...

For my OS, there's a flag in the program's header that determines whether demand allocation should be used for the entire address space or not. If it's not, then any area can be set to demand allocation with a system call. The heap manager (malloc/free) has to be initialized before it's used (i.e. you need to tell it which area of the address space to manage) with something like "status = init_heap(heap_start_address, heap_end_address)".

I don't use "sbrk()"; instead, pages are allocated for heap space when the heap grows (up to "heap_end_address"), unless the entire address space is using demand allocation, in which case this isn't necessary. The difference is that "sbrk" can be changed at any time (up to some maximum), while "heap_end_address" can only be set once and never changed after that.

The heap manager itself has 2 versions of "malloc()" - "malloc_AOD()" for allocating heap space that uses demand allocation (RAM may or may not be pre-allocated), and "malloc_RAM()" for allocating heap space that definitely has RAM pre-allocated. Both of these versions of malloc work the same regardless of whether allocation on demand is enabled for the entire address space or not (with the flag in the program's header).

The "free()" function always releases RAM pages back to the system when possible to minimize allocated physical RAM.

IMHO this scheme works well, but it's not perfect either. The "free()" function should be more intelligent, keeping some pages instead of releasing them to reduce the number of times the kernel is needed (and there should be a "free_as_much_RAM_as_possible" function for when the kernel is about to run out of free physical RAM).
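Roughly, the interface the heap manager ends up exposing looks like this (simplified sketch - the real prototypes have more parameters and flags):

Code: Select all

#include <stddef.h>

/* Simplified sketch of the heap interface described above. */
int   init_heap(void *heap_start_address, void *heap_end_address);

void *malloc_AOD(size_t size);   /* heap space using allocation on demand */
void *malloc_RAM(size_t size);   /* heap space with RAM pre-allocated     */
void  free(void *ptr);           /* gives whole pages back to the system
                                    whenever possible                     */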


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Freanan

Re:Virtual memory management questions

Post by Freanan »

Okay, I understand the problems:
The process should not be able to map any memory by itself, so it either needs demand allocation (on page fault) or a system call (sbrk) to get parts of its virtual memory mapped and marked present (on allocation).
Demand allocation might, as you said, be slow because of the many page faults that it causes.
Sbrk/morecore would eliminate those faults by "pre-mapping" the needed memory on allocation.

But wouldn't a system call, called "commit" for example, which is only responsible for the mapping/marking present, but not (like traditional sbrk/morecore) for the actual allocation/accounting business, also solve the page-faulting problem of demand paging?

The user-level mm functions could take care of which parts of virtual memory are free or in use, but use commit(virt_address) to get something mapped to the virtual address that they allocated (so that no fault is caused when the memory is referenced).
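In C-ish pseudocode, this is what I have in mind - commit() and the two bookkeeping helpers are hypothetical names of mine:

Code: Select all

#include <stddef.h>

#define PAGE_SIZE 4096

/* Hypothetical pieces: the commit() system call plus the user-space
   bookkeeping of free virtual ranges.                               */
extern int   commit(void *virt_address, size_t num_pages);
extern void *reserve_virtual_range(size_t num_pages);
extern void  release_virtual_range(void *virt, size_t num_pages);

void *user_alloc(size_t size)
{
    size_t pages = (size + PAGE_SIZE - 1) / PAGE_SIZE;

    void *virt = reserve_virtual_range(pages);   /* user-space bookkeeping  */
    if (virt == NULL)
        return NULL;

    if (commit(virt, pages) != 0) {              /* pre-map: no page faults */
        release_virtual_range(virt, pages);
        return NULL;
    }
    return virt;
}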
Brendan

Re:Virtual memory management questions

Post by Brendan »

Hi,
Freanan wrote: But wouldn't a system call, called "commit" for example, which is only responsible for the mapping/marking present, but not (like traditional sbrk/morecore) for the actual allocation/accounting business, also solve the page-faulting problem of demand paging?


First, forget about heap management for now (I get to it further down).

Using a system call like "commit" to allocate N pages at once would be faster than using allocation on demand (with N page faults) if every page allocated is actually used, because there's much less overhead.

Using a system call like "commit" to allocate one page at a time wouldn't be faster than using allocation on demand, because they'd both be doing roughly the same thing. In this case, allocation on demand would be easier to write software for because you don't need to write the code to allocate each page, but "commit" would catch more bugs (a messed up pointer would cause a page fault rather than an allocation).

Using a system call like "commit" to allocate N pages at once would waste RAM compared to allocation on demand if not all pages are actually used. For example, if only one page is actually used then "N - 1" are wasted. Because of this, "commit" can also increase the chance that the OS will run out of RAM and swap space will be needed (in this case "commit", or the additional swap space, can have a horrible effect on performance). Even if swap space doesn't become needed, it would still affect the amount of RAM that's left over for things like file system caches (wasting RAM can affect performance in other ways).

Therefore, for areas that are actually used it's better for performance to use "commit" (pre-allocate RAM), but for areas where some pages might not actually be used it's probably better to use allocation on demand.

Now, how can the kernel/OS determine which pages an application may or may not actually use in advance? It can't - often the application doesn't even know (compare "gcc --help" to "gcc myHugeProject"). The only way to get the best "efficiency" (ie. the best compromise between wasted RAM and speed) is to allow the application to control how pages are allocated for each area at run time.

This means you'd need a linear memory manager that allows the application to specify the "page types" used for each area.
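As a sketch of what I mean (names invented for illustration, not an actual API):

Code: Select all

#include <stddef.h>

/* Per-area "page types" the application can ask for. */
enum page_type {
    AREA_NOT_PRESENT,      /* reserved address space, no RAM, fault = bug */
    AREA_ALLOC_ON_DEMAND,  /* RAM allocated on first access               */
    AREA_PRE_ALLOCATED     /* RAM allocated up front ("commit")           */
};

/* Hypothetical linear memory manager calls. */
int alloc_area(void *start, size_t size, enum page_type type);
int change_area_type(void *start, size_t size, enum page_type new_type);
int free_area(void *start, size_t size);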

For everything above, the term "application" actually means the entire application, including any libraries that it uses. For example, "allow the application to control how pages are allocated" actually means "allow the application and/or the libraries that it uses to control how pages are allocated".

As for the heap manager (malloc(), realloc(), free(), etc), it should be built on top of the linear memory manager (typically as part of a standard library in user space), and should use the facilities provided by the linear memory manager. For functions like brk() and sbrk(), implement them on top of the linear memory manager too (if they're needed), just like the heap memory manager itself.

This allows different heap managers to be used that are "tuned" to the application, or heap managers that support extra features. It also allows other memory management things to be implemented (like "obstacks"), or lets the application do its own memory management without any of these - e.g. using the linear memory manager's interface directly without any "assistance" (which can be important for highly tuned applications like large databases).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Freanan

Re:Virtual memory management questions

Post by Freanan »

OK, so it seems my idea to use commit() instead of sbrk/morecore might not be that bad after all, because it enables an application either to commit ("pre-map") the memory that it needs (in case it already knows that it will need it, or in case it wants to catch the subtle bugs that demand allocation may hide),
or not to commit it and just use the demand allocation mechanism (in case fine-grained allocation fits better and it does not care about the subtle bugs because it is stable code).
Thanks for your explanations!
Crazed123

Re:Virtual memory management questions

Post by Crazed123 »

Pype.Clicker wrote: Linux uses a binary tree to manage virtual regions, and as far as I'm concerned, I have found nothing more efficient than that -- except a _double_ binary tree (one using sizes as keys, and the other one using addresses as keys).
I've got a question about that binary search tree thing... I've done some reading on algorithms to determine when to swap out pages, and which ones to swap out, but they seem to require linear linked lists rather than trees. Other than threading a linked list through your tree (I'm doing that, but it seems rather inefficient to add a new node that way), what can be done to use, say, an aging algorithm on a tree of pages?
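For what it's worth, my threaded node looks roughly like this (simplified sketch):

Code: Select all

#include <stdint.h>

/* A page node that lives in a binary tree (keyed by address) and is
   also linked into a doubly linked list that the aging/replacement
   algorithm walks.                                                   */
struct page_node {
    uintptr_t         virt_addr;
    unsigned char     age;                     /* aging counter          */

    struct page_node *left, *right;            /* binary tree by address */
    struct page_node *lru_prev, *lru_next;     /* list used by the ager  */
};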