OSDev.org

Posted: **Wed Dec 05, 2007 2:37 pm**

I have read several tutorials on memory management and i understand the different principles, but none of them seem to touch on allocation of very small pieces of memory. I've easily setup page clarity allocation, but i need to dynamically allocate and free very small pieces (a few bytes) and i can't think of a way that doesn't use too much memory. Also the pieces are often of different sizes. It doesn't matter if it takes a while to allocate or free the small pieces, but it must just be fast to access, no transversing through linked lists or similar. Sorry if i missed something vital and this is a trivial question.

Posted: **Wed Dec 05, 2007 3:21 pm**

What you want is a malloc implementation, like this: http://g.oswego.edu/dl/html/malloc.html

Posted: **Wed Dec 05, 2007 4:34 pm**

Thanx. I had a quick look over that link and it seems to be what i was looking for. Not entirely sure, but i'll give it a good look over first, either way i'm sure it'll help.

Posted: **Wed Dec 05, 2007 10:56 pm**

I don't think you should worry about over optimizing malloc for this situation.

You have a couple different sizes of memory allocation:

large blocks -
medium blocks
small blocks

malloc() and friends are best with medium blocks. When discussing small blocks you probably want to try one of:

- using the stack and keeping it temporary
- using static variables
- slab allocator

It really sounds like what you want here is a slab allocator. see -
http://en.wikipedia.org/wiki/Slab_allocation

You should really work out a good malloc implementation first though. Even the "fast" malloc implementations are fairly slow (or very wasteful). My general train of thought when kernel programming.

Memory allocations are very expensive in kernel land. They can also be extremely dangerous. Think about this scenario:

You run out of physical memory, and need to swap, so you start the swap routine, which talks to the block layer, but the block layer needs to allocate memory before it can complete the write, so you can't swap out to disk...

A malloc() call in kernel land is a very different animal than in user space, treat it carefully and use it with respect, or it will bite you...

Edit: missed the reply, if you haven't implemented malloc, you'll need to do that first before thinking about slab allocation - but realize that you might not want to spend too much time optimizing for "small memory" allocations.

Posted: **Thu Dec 06, 2007 2:38 am**

I wrote a tutorial about this. It uses a modified (simplified) version of Doug Lea's malloc (used in glibc).

http://www.jamesmolloy.co.uk/tutorial_h ... 0Heap.html

Let me know if you find it useful.

Fate:

- using static variables

ARRRGHHHHH!!! NOOO! Don't say that! Just don't!

Plus I had a look at the slab wiki page (because I was interested) but it looks extremely poorly written and doesn't explain much. It seems to be written in lawyer-style "this must/ this shall" speech and is rather difficult to read.

/me greps for a better one.

EDIT: /me finds one!

http://citeseer.ist.psu.edu/compress/0/ ... k94slab.ps

Posted: **Thu Dec 06, 2007 5:42 am**

The true reason i need this working is for my pipes ipc. I need a table to store pointers to the pipes but i don't quite know how to structure it so that i can issue pipe ids to processes which can quickly be looked up to a pipe buffer pointer. I can do the rest. I can't have a static table as that would need too much memory to allow for the system i want to use.

I was thinking of allocating small blocks but now i just realised it won't work as i'm gonna have to link them like a linked list and that would be too slow.

I can't just make the pipe id the pointer as i need to somehow be sure that is really a pipe and not some other protected memory. I need to also be able to free the pipe if the process terminates.

Posted: **Thu Dec 06, 2007 6:04 am**

You want an implementation of a dictionary. A dictionary in the computer science sense is any data structure that can store key/value pairs and look up the value given a key.

It sounds like you want a dictionary keyed by pipe id giving a pointer back.

Common implementations of dictionaries:

* Linked list: Simple, but O(n) in the average case (linear lookup time).
* Array: Simple, and O(1) (constant lookup time) for time, but O(n) for space.
* Search tree: Several algorithms around, look for red-black trees or AVL trees. Much more complex, and well beyond what you need, they generally provide ~O(log2 n) time functions for lookup, insert and remove.
* Hashtable: This is what I would recommend - it's half way between a linked list and an array (when using chained hashing, which is generally used more than probing).

Hashtables

A hashtable consists of a number of "buckets". Initially these are NULL, but when items are added they become linked lists. So essentially you have an array of linked lists.

The array is statically sized - it doesn't get increased when more items are added (this is not always true - if many items are added it may be worthwhile resizing the array to improve lookup times).

When inserting or looking up an item the item is assigned a bucket by use of a hash function, hash(key). The linked list at that bucket is then traversed / added to.

Interestingly, hashtables can perform as arrays or as linked lists depending on the hash function: if it always hashes every key to the same bucket you will end up with one long linked list (with O(n) time lookups/inserts).
If it always hashes every key to a different bucket and there are less keys than the number of buckets you end up with an array (albeit with one level of indirection).

So it is VERY important to get the hash function correct - an ideal hash function will distribute the set of possible keys perfectly evenly over the available buckets. Normal hash functions look something like:

Code: Select all

#define HASH(k) k%size

Which should do a half-decent job. Also interesting to note is that hashtables work much better (with this sort of hash function) if the number of buckets is prime. I'll leave you to work out yourself why that is (hint: it uses a modulo (%)).

Summary

I think a hashtable is what you want

Posted: **Thu Dec 06, 2007 6:45 am**

Yes that looks good. It would mean that i can dynamically expand to it, but it won't take too long to transverse and it also won't take up loads of memory like an array would when nothing is allocated. That actually solves a lot of problems, thanx.

I just want to check that i've got the right concept. Let's say the table has 4 entries and i try access the pointer with id of 9. Then it would be the third element of the second linked list if nothing has been freed?

This would work very well as each pipe is allocated at least one page. This stores the pipe info and the buffer. These would now also just include a pointer to the next pipe. And then i have no need to allocate small blocks of memory.

Posted: **Thu Dec 06, 2007 7:46 am**

liquid.silver wrote: I just want to check that i've got the right concept. Let's say the table has 4 entries and i try access the pointer with id of 9. Then it would be the third element of the second linked list if nothing has been freed?

Very close: you were right about everything but the "third element" part. *if* you ordered the linked lists by key then yes, that would be correct. But most hashtable implementations leave them unordered, so you would have to try and traverse that entire linked list before you gave up and decided it wasn't there. (I.e. it could be anywhere in the second linked list.)

Posted: **Thu Dec 06, 2007 8:30 am**

Okay fair enough. But it's easy enough to order them and it would allow early exits in the algorithm. I want to favour quick lookups over setup times, so i will probably order them.

I briefly read through your tutorials and i really liked them. My os is only in assembly, and i have already passed most of the subjects, but i liked the way in which you explained the subjects. Keep it up

Posted: **Thu Dec 06, 2007 1:50 pm**

JamesM wrote:I wrote a tutorial about this. It uses a modified (simplified) version of Doug Lea's malloc (used in glibc).

My malloc is also based on the ideas of dlmalloc, though I keep a single freelist allocating first-fit, essentially always split, always coalesce if possible, never attempt to give memory back to the system. It runs fast and well enough that I've been able to LD_PRELOAD it on an xterm, then start ViM from that xterm, then work on my kernel some hours (doing all compiles from ViM) and never notice much of a difference with the default Linux malloc.

"time make clean all" for my current kernel tree, running some default version of glibc on my old linux box runs about 8.5sec realtime, 7.5sec user time. The same thing with LD_PRELOAD my current version of malloc, runs about 0.3 seconds slower (about 7.8 sec usertime). So full build (with gcc) of my OS tree using my own malloc is about 4% slower than the default (counting usertime ofcourse). [edit]After posting this I thought of adding -O3 to my GCC commandline, and the difference degenerated into 0.1 seconds or less.. nevermind..

I'd be curious to hear how yours performs in comparison to some reference..

[edit] Oh forgot to mention the malloc is available somewhere on this forum, though I could just as well attach the latest version in to this post.. here goes.. not a threadsafe version unless compiled for my kernel... to compile a shared library for Linux, use:

Code: Select all

$ gcc -o libmalloc.so -shared malloc.c

OSDev.org

Small Memory Management

Small Memory Management