Journey to a Memory Manager

AaronMiller
Member
Posts: 81
Joined: Thu Mar 06, 2008 1:26 pm
Location: Roseville, California (USA)

Journey to a Memory Manager

Post by AaronMiller »

Hullo all.

The following is for paging.

I've been working on a memory manager for a while now (most of the work so far has been gathering various bits of information on the matter and looking into multitasking, for consideration purposes). Anyway, I've seen the following methods mentioned for allocating pages:

* Flat bitmap (search linearly through an array of DWORDs for a free page; a minimal sketch follows this list)
* Super bitmap (search linearly through a shorthand array of DWORDs for a free page area, then search that area for a free page)
* Stack (from my understanding, keep a fixed array of free pages; pop a page off the stack to allocate it and push it back on when you're done with it. I don't quite understand this entirely [note-a])
* Linked list (keep a linked list of free/used pages, read from and write to the list, etc. [note-b])
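As I understand it, the flat bitmap boils down to something like this (a minimal sketch only; one bit per 4 KB page with a set bit meaning "used", and bitmap/NUM_WORDS are placeholder names):

Code: Select all

u32 alloc_page_bitmap(void)
{
	for (u32 i = 0; i < NUM_WORDS; i++) {
		if (bitmap[i] == 0xFFFFFFFF)
			continue;							// all 32 pages in this DWORD are used
		for (u32 b = 0; b < 32; b++) {
			if (!(bitmap[i] & (1u << b))) {
				bitmap[i] |= (1u << b);			// mark the page used
				return (i * 32 + b) * 0x1000;	// physical address of the page
			}
		}
	}
	return 0;									// no free page found
}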

Question 1
My idea is to use a combination of the stack and a super bitmap. Is this a better idea than using a super bitmap alone?

[note-a]: What happens if I need more room in the stack? I can't simply reallocate it -- That's the reason I need a memory manager! I can't just set its array size to all of the possible free pages (That'd make a pretty big kernel).

[note-b]: Linked lists are very memory-sensitive by their nature. Problems I can see arising from using "dynamic memory to manage dynamic memory" are stability, efficiency, and the quality of the final result. Stability, because you might think it's solid, but there could be one bug hiding that you never knew of; then, when you think you're done and move on to, say, the file system, multitasking, or device management, you find something wrong and end up debugging the wrong system -- not very good. Efficiency, because you'll have to keep an array of all the list items (unsorted of course, this is just for book-keeping); what happens if that array is used up? See [note-a] for more details on what I mean. Final quality, because of the problems with trying to make it efficient: you'll likely lose performance keeping memory from becoming fragmented, or else take on a larger overhead.



The following is for a malloc/free.

The language I'm working on (Aex.Uni Basic) uses a pool system to keep track of allocated strings (ASCII and Unicode). The pool system has several structures in it:
(Main)
* A pool chunk (an 8-bit unsigned count of free slots, an 8-bit unsigned count of used slots, and an array of 32-bit void pointers as the slots)
* A pool (Has various characteristics and manages all the various pool chunks)

(Creation)
* A pool descriptor (describes a constructor to be called when memory is allocated, a destructor for when it's freed, and the default size for allocated data; for strings that's 512 characters, which takes either 512 or 1024 bytes depending on ASCII or Unicode)

I plan on rewriting the pool system to make it more efficient before I release it to the public for free. In terms of a memory manager, though, I figured "Hey, a pool system wouldn't be that bad!" and set out on designing one that could be used as a memory manager. I'm not sure the final result can still be called a pool system, but I have this:

Code: Select all

// Fixed-width types for a 32-bit target; the byte sizes in the comments below
// assume the structures are packed (no padding).
typedef unsigned char	u8;
typedef unsigned short	u16;
typedef unsigned int	u32;

// A single allocation (6-bytes)
struct sAllocation
{
	u16				size;				// How big is the allocation? 0 if not allocated
	void*			ptr;				// Pointer to the allocation
};
// A region of allocations (1537-bytes)
struct sRegion
{
	u8				numFree;			// How many free allocations are there? If you do 256-numFree you get how many are used.
	sAllocation		allocs[256];		// The allocation slots
};
// A region pool designed to fit within one page (3075-bytes - 1 page = 4096 bytes)
struct sPool_4kb
{
	u8				numFree;			// How many regions are free?
	sRegion			regions[2];			// Can fit 512 allocations
};
// A region pool designed to fit within two pages (7686-bytes - 2 pages = 8192 bytes)
struct sPool_8kb
{
	u8				numFree;			// How many regions are free?
	sRegion			regions[5];			// Can fit 1280 allocations
};
// A region pool designed to fit within four pages (15371-bytes - 4 pages = 16384 bytes)
struct sPool_16kb
{
	u8				numFree;			// How many regions are free?
	sRegion			regions[10];		// Can fit 2560 allocations
};
// A region pool designed to fit within eight pages (32278-bytes - 8 pages = 32768 bytes)
struct sPool_32kb
{
	u8				numFree;			// How many regions are free?
	sRegion			regions[21];		// Can fit 5376 allocations
};
// The memory manager designed to fit in one page
struct sMemoryManager_4kb
{
	sPool_4kb		pool;				// The memory manager pool
	// Fill in with other attributes
};
// The memory manager designed to fit in two pages
struct sMemoryManager_8kb
{
	sPool_8kb		pool;				// The memory manager pool
	// Fill in with other attributes
};
// The memory manager designed to fit in four pages
struct sMemoryManager_16kb
{
	sPool_16kb		pool;				// The memory manager pool
	// Fill in with other attributes
};
// The memory manager designed to fit in eight pages
struct sMemoryManager_32kb
{
	sPool_32kb		pool;				// The memory manager pool
	// Fill in with other attributes
};
I could probably add a few more attributes to the sRegion and sPool structures. I have free space left over, so I *should* be able to make searching through those pools MUCH more efficient. The design goal here is simply to be able to allocate and deallocate memory PER application; this isn't intended to cover the WHOLE of the OS (the number of available allocations would be pretty low if it were!).
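For illustration, claiming a slot from one of these pools might look roughly like this (a sketch only; it assumes size == 0 marks a free slot and doesn't show where the backing memory behind ptr comes from):

Code: Select all

sAllocation* pool4k_claim_slot(sPool_4kb* pool, u16 size)
{
	for (u32 r = 0; r < 2; r++) {
		sRegion* region = &pool->regions[r];
		if (region->numFree == 0)
			continue;							// this region is completely used up
		for (u32 i = 0; i < 256; i++) {
			if (region->allocs[i].size == 0) {
				region->allocs[i].size = size;	// mark the slot as used
				region->numFree--;
				return &region->allocs[i];
			}
		}
	}
	return 0;									// no free slot left in this pool
}
Freeing a slot would be the reverse: set its size back to 0 and increment numFree.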

I'd simply allocate the appropriate number of pages PER application. I'm still deciding how many pages to give each application, though. Since device drivers and the kernel will share a memory manager, I'll be giving the kernel 8 pages:

Code: Select all

typedef sMemoryManager_32kb	sKernelMemoryManager;
I'm not sure if the absolute maximum of 5376 allocations is enough for today's programs. I'm generally pretty sure some may even reach 65536 allocations! So:

Question 2
Is there a better memory management system I could implement? Is there any way to make this one better (In terms of maximum number of allocations -- speed isn't a concern as I have a few ideas to improve speed)?


I've "formated" this post to make it easier to read through (eg: bold, italic, and underlining various sections rather than a large clump of plain text).

Thanks in advance, your input (Including criticisms and random notes) will help. :)


Cheers,

-naota
Paw
Member
Posts: 34
Joined: Fri Mar 07, 2008 11:20 am

Post by Paw »

AaronMiller wrote:[note-a]: What happens if I need more room in the stack? I can't simply reallocate it -- That's the reason I need a memory manager! I can't just set its array size to all of the possible free pages (That'd make a pretty big kernel).
I'm just going to tackle this one, because I'm currently trying to wrap my head around a couple of memory management problems as well.
In my kernel I created a stack-based physical memory manager which creates its entries (physical page addresses) right after the kernel. This memory position can be retrieved through a linker script (excerpt):

Code: Select all

SECTIONS {
  /* Some sections... */
  end = .;
}
The end-symbol can be referenced from your C(++) code via:

Code: Select all

extern void* end;

void stack_setup() {
    unsigned int* stack_offset = (unsigned int*) &end;
    // fill stack...
}
That way you don't have to include a prepared stack in your kernel image. You can't know the amount of physical memory present beforehand anyway...
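For what it's worth, the fill step could then look roughly like this (a sketch only; total_memory stands for whatever amount of RAM the bootloader reports, and reserved regions are ignored):

Code: Select all

extern void* end;                        // provided by the linker script above

static unsigned int* page_stack;         // stack entries live right after the kernel
static unsigned int  page_count;

void stack_setup(unsigned int total_memory)
{
    page_stack = (unsigned int*) &end;
    page_count = 0;

    // Skip the kernel and the stack itself: one 4-byte entry per 4 KB page.
    unsigned int stack_bytes = (total_memory / 4096) * sizeof(unsigned int);
    unsigned int first_free  = ((unsigned int) &end + stack_bytes + 0xFFF) & ~0xFFFu;

    for (unsigned int page = first_free; page < total_memory; page += 4096)
        page_stack[page_count++] = page; // push every remaining page as free
}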

The stack-based approach works for physical memory management because, as far as I understand, the MMU can map contiguous virtual addresses to arbitrary physical addresses in any order, so in most cases you don't have to keep track of physical memory sequences.

The only problem with the stack approach occurs if you want to allocate contiguous physical memory for DMA purposes (correct me if I'm talking garbage here :-))
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: Journey to a Memory Manager

Post by Combuster »

AaronMiller wrote:Question 1
My idea is to use a combination of the stack and a super bitmap. Is this a better idea than using a super bitmap alone?
The advantage of a bitmap is that it's easy to determine whether a page is in use or free, but it potentially requires browsing the entire bitmap to find a free page.
With a stack, you can find a free page by popping the top item and add free memory by pushing items onto the stack. However, to determine whether a page is allocated, you would have to traverse all the items in the stack.

If you use both, you can allocate free memory in fixed time:
- pop a page off the stack
- update that position in the bitmap
and free:
- push a page onto the stack
- update the bitmap
and determine allocated state:
- check the bit in the bitmap.

If you expect both operations to occur frequently, then the combination would be a good choice. If you don't expect or allow probing whether a piece of memory is allocated, then not having a bitmap saves you space and execution time.
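For illustration, the combination boils down to something like this (a minimal sketch; stack, top and bitmap are placeholder names, with one bit per page and a set bit meaning "allocated"):

Code: Select all

u32 alloc_page(void)
{
    if (top == 0)
        return 0;                                        // stack empty: out of memory
    u32 page = stack[--top];                             // pop a free page (physical address)
    bitmap[page >> 17] |= 1u << ((page >> 12) & 31);     // mark it allocated
    return page;
}

void free_page(u32 page)
{
    stack[top++] = page;                                 // push it back onto the stack
    bitmap[page >> 17] &= ~(1u << ((page >> 12) & 31));  // clear its bit
}

int page_is_allocated(u32 page)
{
    return (bitmap[page >> 17] >> ((page >> 12) & 31)) & 1;
}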
AaronMiller wrote:[note-a]: What happens if I need more room in the stack? I can't simply reallocate it -- That's the reason I need a memory manager! I can't just set its array size to all of the possible free pages (That'd make a pretty big kernel).
- With a stack, you need 4K to maintain 4MB of memory, in other words 0.1% of available memory. As the amount of memory in use grows, the stack shrinks, so the stack's own pages that no longer hold entries can themselves be reused. When you free memory and the stack is full, make that piece of memory part of the stack, essentially letting the memory manager manage itself.
AaronMiller wrote:[note-b]: Linked lists are very memory-sensitive by their nature. Problems I can see arising from using "dynamic memory to manage dynamic memory" are stability, efficiency, and the quality of the final result.
Stability: That's what unit testing is for. Even in a simple memory manager there may be latent bugs.
Efficiency: Start with measuring the algorithmic complexity of the algorithm itself, and look for algorithms with a better complexity if it's unacceptable. The bitmap and the stack aren't the only alternatives.
Output: Fragmentation only becomes a problem when you are dealing with DMA buffers. For normal usage, paging lets fragmented physical pages appear consecutive again in the virtual address space.
(...)

I'm not sure if the absolute maximum of 5376 allocations is enough for today's programs. I'm generally pretty sure some may even reach 65536 allocations! So:

Question 2
Is there a better memory management system I could implement? Is there any way to make this one better (In terms of maximum number of allocations -- speed isn't a concern as I have a few ideas to improve speed)?
Hardcoding a maximum size is usually a bad thing. Since you mentioned Basic: I've written a few programs that went past that limit very easily, and I have declared several arrays that are over 64k in size (read: several MB). So while a pool may be good for a specific case, in the average case it wouldn't suffice at all.

You may want to look up DLMalloc (which is pretty much the standard implementation of malloc/free).
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
AaronMiller
Member
Posts: 81
Joined: Thu Mar 06, 2008 1:26 pm
Location: Roseville, California (USA)

Post by AaronMiller »

Thanks all.

After posting I went home and built a linked list program. As a performance test I had it allocate a lot of memory, and realised Windows could handle quite a lot of it: my program allocated 0x15000 chunks of memory in under 300 milliseconds. So there must be an "unlimited" way, as I can see from your posts.

I recently realised that if I allocate new pages, I can use them to manage an "infinite" list of memory slots. I'll look up DLMalloc and friends as mentioned. :)


Cheers,

-naota
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Journey to a Memory Manager

Post by Brendan »

Hi,

A few quick notes...

1) I always use a bitmap to manage some RAM (so that I can allocate contiguous pages for DMA/bus-mastering), but for most of RAM I use free page stacks (so that I can allocate/free pages fast). For example, for a computer with 512 MB of RAM I'd manage the lowest 16 MB with a bitmap and the remaining 496 MB with free page stacks. I only allocate single pages from the bitmap if there aren't any free pages on the free page stacks (so allocations are almost always fast). This does mean that I can't allocate contiguous pages from most of RAM, but that doesn't matter because there's very few reasons to need contiguous pages.

2) For free page stacks, you can store the management data in the free pages themselves. For example, the first dword in each free page can contain the physical address of the next free page on the stack. This costs you a total of 4 bytes per free page stack (for the physical address of the first free page on the stack) and avoids the need to use a variable sized area of the linear address space.
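For illustration, the allocation split from point 1 boils down to something like this (free_stack_pop and bitmap_alloc_page are placeholder names for the two allocators):

Code: Select all

u32 alloc_physical_page(void)
{
    u32 page = free_stack_pop();     // fast path: most of RAM, from the free page stacks
    if (page == 0)
        page = bitmap_alloc_page();  // fall back to the bitmap-managed low memory
    return page;                     // 0 if both are exhausted
}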


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Ready4Dis
Member
Posts: 571
Joined: Sat Nov 18, 2006 9:11 am

Re: Journey to a Memory Manager

Post by Ready4Dis »

Brendan wrote:Hi,

A few quick notes...

1) I always use a bitmap to manage some RAM (so that I can allocate contiguous pages for DMA/bus-mastering), but for most of RAM I use free page stacks (so that I can allocate/free pages fast). For example, for a computer with 512 MB of RAM I'd manage the lowest 16 MB with a bitmap and the remaining 496 MB with free page stacks. I only allocate single pages from the bitmap if there aren't any free pages on the free page stacks (so allocations are almost always fast). This does mean that I can't allocate contiguous pages from most of RAM, but that doesn't matter because there's very few reasons to need contiguous pages.

2) For free page stacks, you can store the management data in the free pages themselves. For example, the first dword in each free page can contain the physical address of the next free page on the stack. This costs you a total of 4 bytes per free page stack (for the physical address of the first free page on the stack) and avoids the need to use a variable sized area of the linear address space.


Cheers,

Brendan
I pretty much do the same thing as Brendan, although I only keep a bitmap for the first 1MB of RAM, for legacy devices to share. I use a linked list of free pages, with the info stored in the free page itself (so, unlike a bitmap, it wastes NO memory at all).

The kernel has a variable called NextFreePage, which points to the head of the list (the most recently freed page). So, when an allocate-page request comes in, it goes like so:

Code: Select all

void *PhysicalPageAlloc(void)
{
   BaseSize Addr = NextFreePage;
   Virtual2Physical(0x1000, Addr); //Map it into a free virtual address, can be anywhere
   NextFreePage = *(BaseSize*)0x1000;  //Read the link to the next free page
   UnmapVirtual(0x1000);  //Optional, I reserve a specific virtual address just for this, so I can simply leave it mapped
   return (void*)Addr;  //Return the physical address
}
And for freeing:

Code: Select all

void PhysicalPageFree(void *ptr)
{
   Virtual2Physical(0x1000, (BaseSize)ptr);  //Map the page being freed so we can write into it
   *(BaseSize*)0x1000 = NextFreePage;  //Store the current head of the list in the page we are freeing
   NextFreePage = (BaseSize)ptr;  //Point the head at this newest free page
}
The cool thing is, you can create this list by calling PhysicalPageFree for all the pages in physical memory that you want to add (very similar to a stack, but also like a list because it stores links). You don't need to pre-allocate any memory for a table of any sort, or worry about a list running out of space, since you are using memory that isn't being used, and once a page is allocated its link can be overwritten and it doesn't matter :). This method is much better than the other approaches because the amount of memory doesn't matter and it doesn't need any memory of its own to work. Once it runs out, NextFreePage would be NULL (which you'd want to check for, so you return NULL rather than mapping in address 0x0000 and writing over the real-mode IVT). I won't get into malloc/free, this post is big enough, and there are plenty of examples around for implementing those (JamesM has a pretty good intro/tutorial on a simple allocator), or Doug Lea's code base is a good starting point if you want something mostly pre-built with lots of testing behind it (I believe it's used in the standard C lib of most Linux distros).
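For illustration, building the list that way could look like this (a sketch only; first_free and total_memory are placeholders for whatever the bootloader reports as usable):

Code: Select all

void PhysicalPageInit(BaseSize first_free, BaseSize total_memory)
{
   NextFreePage = 0;                    // empty list to start with
   for (BaseSize page = first_free; page < total_memory; page += 0x1000)
      PhysicalPageFree((void*)page);    // each freed page links to the previous head
}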
AaronMiller
Member
Posts: 81
Joined: Thu Mar 06, 2008 1:26 pm
Location: Roseville, California (USA)

Post by AaronMiller »

I have my bitmap set up alongside a super bitmap; together they only cost about 128 KB:

Code: Select all

u32 g_pageUsageBitmap[32768];
u32 g_superPageBitmap[32];
I have it set up so that AT MOST I go through 96 iterations, by taking advantage of the 32-bit processor's ability to compare a whole 32 bits at a time. The super page bitmap keeps track, per bit, of whether a whole group of pages still has anything free, so I can skip whole sections at once.
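For what it's worth, the search would look roughly like this (a sketch; it assumes a set bit in g_pageUsageBitmap means "used" and a set bit in g_superPageBitmap means "this group still has a free page", with each super bit covering 32 DWORDs, i.e. 1024 pages, which is what the array sizes imply):

Code: Select all

u32 find_free_page(void)
{
	for (u32 s = 0; s < 32; s++) {					// up to 32 super DWORDs
		if (g_superPageBitmap[s] == 0)
			continue;								// no free pages in any of these groups
		for (u32 b = 0; b < 32; b++) {				// up to 32 bits per super DWORD
			if (!(g_superPageBitmap[s] & (1u << b)))
				continue;
			u32 word = (s * 32 + b) * 32;			// first usage DWORD of this group
			for (u32 w = 0; w < 32; w++) {			// up to 32 DWORDs per group
				if (g_pageUsageBitmap[word + w] == 0xFFFFFFFF)
					continue;						// every page in this DWORD is used
				for (u32 bit = 0; bit < 32; bit++)
					if (!(g_pageUsageBitmap[word + w] & (1u << bit)))
						return (word + w) * 32 + bit;	// free page index
			}
		}
	}
	return 0xFFFFFFFF;								// no free page anywhere
}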

So I guess page allocation isn't that much of a problem for me. I wanted to design a file system early on, though (for swap files later in the memory management). The only thing I have to worry about now is malloc/free. For the first release I guess I can just calculate how many whole pages to allocate for each malloc, as that'd be sufficient for early use. Later on I'll have to make it a bit better (and add in garbage collection as well -- it's not that difficult, I've already built garbage collectors).
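The interim "whole pages per allocation" idea would be as simple as something like this (a sketch; alloc_pages is a placeholder for whatever the page allocator ends up being called):

Code: Select all

void* kmalloc_simple(u32 size)
{
	u32 pages = (size + 0xFFF) >> 12;	// 4 KB pages needed, rounding up
	return alloc_pages(pages);			// hand back that many freshly mapped pages
}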

I thank you all. :) I guess the only real question I have now is "What do you do for your malloc/free?".

Cheers,

-naota