OSDev.org

Posted: **Thu Nov 25, 2004 10:37 am**

I've been thinking about a technique for dynamic memory allocation in my kernel that is kind of interesting... I thought I would throw it out there to see if I'm not completely off my rocker. It depends on a few key assumptions:

We're talking about a really minimalistic microkernel -- no kernel-mode drivers or modules of any kind. This means all the code is written/maintained/known by one team (currently one person). It should also mean, in theory, that all control paths are well-known and behave predictably, especially in terms of memory usage.
There are two kinds of object lifetimes in such a microkernel -- those that must persist between kernel invocations and those that do not (examples below).
The kernel programming style used lends itself to small, transient, dynamically allocated objects (this is a big assumption that may be invalidated before I'm through).

Before explaining the technique, I'll give examples of what I mean by "object lifetimes". Examples of objects that must persist between kernel invocations include things like thread context info and page tables. Examples of transient objects that might only exist within a single kernel invocation include:

Dynamically allocated strings (rare, I know, but possible -- perhaps used when formatting output for debugging).
Polymorphic objects that encapsulate behaviour within the kernel that varies at run-time. It's hard to think of a concrete example right now -- I'm early in coding and haven't fully forseen the need for such things. I'll just throw out one possibility -- imagine something similar to std::ostream in C++ that could output to either the screen (vgastream?) or a serial port (serialstream?), depending on whether a remote kernel debugger is active.

It would be nice to think of an example that doesn't have to do with debugging, but I can't right now. Any takers?

Anyway, here's the idea: Each thread normally has a large contiguous block of memory for its kernel stack. This block is generally of a fixed size (unlike user-mode stacks, which are usually allowed to grow dynamically by allocating more pages). I'm generalizing here, but bear with me. In order to support the more "persistent" kernel objects, the kernel must have a more or less traditional heap manager (kmalloc/kfree). But, what if the chunk used for the kernel stack could also be used to allocate small temporary objects? A kind of thread-local fast-allocating mini-heap if you will. So for any given thread running in the kernel, its kernel stack would grow downward from the top and its "mini-heap" would grow up from the bottom. Then, when the thread is about to return to user-mode, the "top of heap" pointer is reset to the bottom and all the temporary objects are instantly deallocated (I'm assuming no need for "destructor" or "finalizer"-like constructs here). It's kind of like really cheap special-purpose garbage collection.

First, before the usual avalanche of responses from the seasoned veterans asking whether I'm insane

-- allow me to explain how I arrived at this idea. I'm implementing my kernel in C, but I'm designing it in a very OO manner. This is the way I design in "real life", so I want to see how applicable the techniques are in kernel-land. Part of the general pattern of OO implementations is a tendency towards many small dynamically-allocated objects. They are usually dynamically allocated because they are polymorphic and the decision of what their actual type should be is deferred to some factory object. They are usually small because most of the objects in a system are fine-grained "utility classes" that contain very little state (but perhaps a lot of polymorphic behaviour). I don't have enough of an idea yet about the forces that will shape the rest of my kernel's design to know if these traits will be common or not. They're certainly common in middleware, but that's a different world altogether...

So, I have three questions:

Does anyone think this is workable?
Does anyone think it is useful?
If so, can you think of any more example situations in which it would be useful?

Posted: **Thu Nov 25, 2004 11:41 am**

Transient objects that are not for debugging:

buffers for disk interfacing
packet buffers for all things that send packets (mouse, network, keyboard, etc)
IPI storage
stuff like that

the general point I think you are trying to make is that you thought up persistent objects, that remain after a reboot. Am I correct or are you implying something else?

Posted: **Thu Nov 25, 2004 12:56 pm**

Candy wrote: Transient objects that are not for debugging:

buffers for disk interfacing
packet buffers for all things that send packets (mouse, network, keyboard, etc)
IPI storage
stuff like that

The only one that might be relevant in my design would be IPI storage. Everything else would be handled by drivers in user-space.

the general point I think you are trying to make is that you thought up persistent objects, that remain after a reboot. Am I correct or are you implying something else?

No, I used the term "persistent" somewhat loosely. The more important point is that there may be some things you might need to dynamically allocate for a very short period of time (i.e. -- for the duration of a single syscall, interrupt, or exception), and for these, the technique I described above would potentially be very fast and cheap to implement (if it works under all cases).

Posted: **Thu Nov 25, 2004 1:08 pm**

Colonel Kernel wrote: No, I used the term "persistent" somewhat loosely. The more important point is that there may be some things you might need to dynamically allocate for a very short period of time (i.e. -- for the duration of a single syscall, interrupt, or exception), and for these, the technique I described above would potentially be very fast and cheap to implement (if it works under all cases).

For allocating small objects for a short period of time (during a single call, interrupt or exception) you can very well use the stack. Allocating from the stack on non-windows is order-1 in complexity, you substract what you want from ESP and it's yours.

Posted: **Thu Nov 25, 2004 3:33 pm**

Candy wrote: For allocating small objects for a short period of time (during a single call, interrupt or exception) you can very well use the stack. Allocating from the stack on non-windows is order-1 in complexity, you substract what you want from ESP and it's yours.

That's true, but it implies a certain programming model. Here's a user-space example... I'm still lacking a good kernel example, so this will have to suffice:

Code: Select all

void someFunc()
{
    // someFunc() decides what the types of foo and bar must
    // be -- not very flexible.
    FooSomething foo;
    BarSomething bar;

    foo.doSomething();
    bar.doSomething();
}


void someOtherFunc( int objType )
{
    // someOtherFunc() does not decide what the actual type
    // of the object is -- this is the caller's decision to make.
    ISomething* something = createSomething( objType );
    something->doSomething();
}

someOtherFunc() is more flexible than someFunc(), but the problem is that it requires heap allocation to create the right kind of object.

In C, there is another reason to favour "heap-like" allocation schemes -- it is easier to encapsulate the details of the structs you define.

Posted: **Fri Nov 26, 2004 4:26 pm**

Ok, I probably confused people, so here's my line of reasoning, in order:

I like OO design techniques, but I want to implement my ukernel in C, not C++ for reasons I've stated in other threads.
There are techniques for polymorphism and encapsulation in C, but they typically require objects to be allocated on the heap or statically, then referenced only by pointer.
Traditional heap management is expensive for small objects. I also don't want functionality that must be very reliable to have to rely on such a complex sub-system.
General-purpose garbage collection is too expensive to implement in the kernel and would be big overkill. This means traditional heap management ala malloc/free is still necessary, at least for objects that must live for a long time (i.e. -- between kernel entry/exits).
For short-term objects, it is possible to dynamically allocate them from a "mini-heap" in the same block of memory as the current thread's kernel stack, according to the scheme I've outlined above.
The possible problem with this scheme is that the "mini-heap" and the kernel stack share the same memory block, which is finite in size.
The finite size might be ok, given that we have a very small, closed system (ukernel with minimal features).

I hope this makes more sense. Comments...?

Posted: **Sat Nov 27, 2004 12:13 am**

I think Candy is trying to get across that you could expand the stackframe manually using inline assembly and then cast a pointer into that space as the object/struct, then you can pass that pointer on as a parameter to the next (sub-)function that needs it

Code: Select all

MyObj *SomeObj;
__asm__ volatile ("movl %%esp, %%eax": "=a"(SomeObj)); //Get pointer to top of stack frame
__asm__ volatile ("subl %%eax, %%esp": : "a"(sizeof(MyObj))); //expand the stack frame
InitMyObj(SomeObj); //Set up the struct, or call placement new

The above is theoretical, I'm really bad at AT&T syntax assembly. I also haven't quite got the grasp of how ESP relates to the last item on the stack [Points to it, or at the end of it?]

This seems to be exactly what you want, the auto cleanup added at the end of the function by GCC (movl %%ebp, %%esp) will automatically deallocate the object like garbage collection.

Posted: **Sat Nov 27, 2004 10:33 am**

if you swap the mov and the sub it should work...

Posted: **Sat Nov 27, 2004 2:59 pm**

AR wrote:This seems to be exactly what you want

Not really... First of all, the use of inline asm doesn't make the code any prettier.

Secondly, I don't want the object to be deallocated at function exit. If I wanted that, I'd just create it on the stack. What's the difference between your code and this?

Code: Select all

MyObj someObjStorage;
MyObj* SomeObj = &someObjStorage;
InitMyObj( SomeObj );

What I want is something that looks like this:

Code: Select all

MyObj* SomeObj = createMyObj();
...
// Somewhere else...
MyObj* createMyObj( void )
{
    MyObj* newObj = (MyObj*) kfastmalloc( sizeof( MyObj ) );
    ... // init fields of newObj
    return newObj;
}

...except that it's more efficient and simpler than regular kmalloc/kfree because:

There is no kfastfree() to be called -- all deallocation happens for you when the kernel returns to user space; and
It doesn't use the run-of-the-mill heap management algorithms, it just uses a stack-like structure instead, which is really, really fast.

If you're familiar with Mark() and Release() in Turbo Pascal, what I'm proposing is quite similar in spirit.

Posted: **Sun Nov 28, 2004 2:16 am**

Colonel Kernel wrote:

Code: Select all

MyObj someObjStorage;
MyObj* SomeObj = &someObjStorage;
InitMyObj( SomeObj );

What I want is something that looks like this:

Code: Select all

MyObj* SomeObj = createMyObj();
...
// Somewhere else...
MyObj* createMyObj( void )
{
    MyObj* newObj = (MyObj*) kfastmalloc( sizeof( MyObj ) );
    ... // init fields of newObj
    return newObj;
}

As for me being a c++ proponent, try

Code: Select all

MyObj SomeObj();
...
// Somewhere else...
MyObj::MyObj( void )
{
    // init fields of this, the pointer is called "this" here...
}

This is equivalent to what your code does, but has even less fluff, is allocated on the stack and automatically calls a destructor too (so you can use heap-allocated stuff etc. in the object, and be signalled when you go out of scope).

How is your solution better than this?

Posted: **Sun Nov 28, 2004 2:40 pm**

Candy wrote: As for me being a c++ proponent, try
Code: Select all
MyObj SomeObj();
...
// Somewhere else...
MyObj::MyObj( void )
{
    // init fields of this, the pointer is called "this" here...
}
This is equivalent to what your code does, but has even less fluff, is allocated on the stack and automatically calls a destructor too (so you can use heap-allocated stuff etc. in the object, and be signalled when you go out of scope).

How is your solution better than this?

One of the original constraints I placed on this problem was that you can't use C++... because I'm not using C++.

So, re-defining the problem does not solve it. This is actually kind of irrelevant, since the memory management technique I'm describing is equally applicable to C++.

Code: Select all

IObj* someObj = createObj();
...
// Somewhere else...
class MyObj : public IObj {...};

MyObj::MyObj()
{
    // init fields of this, the pointer is called "this" here...
}

// Somewhere else again...
IObj* createObj()
{
    // Maybe it creates a MyObj, maybe it creates some other
    // implementation of IObj. Either way, it must use new...
    return new MyObj();    
}

The point is, even your C++ example explicitly hard-codes the type that you want to instantiate. What if I wanted to use a factory (like above) to dynamically choose what class to instantiate? Then you need to use new. My original idea was a constrained but fast implementation of new. Does this make sense?

The design consideration I'm describing is not from kernel programming -- it's from other domains mainly. So, I'm not sure how useful this level of flexibility is in a kernel implementation. What I'm trying to get feedback on is whether this kind of design technique is useful, and whether the corresponding "fast malloc" idea is workable.

You're making me feel like either an escapee from an insane asylum or a broken record.

OSDev.org

Cheap kernel memory allocation

Cheap kernel memory allocation

Re:Cheap kernel memory allocation

Re:Cheap kernel memory allocation

Re:Cheap kernel memory allocation

Re:Cheap kernel memory allocation

Re:Cheap kernel memory allocation

Re:Cheap kernel memory allocation

Re:Cheap kernel memory allocation

Re:Cheap kernel memory allocation

Re:Cheap kernel memory allocation

Re:Cheap kernel memory allocation