Malloc query
Guys, I was just writing a simulation program for malloc(), and someone pointed out to me that it must have some elements related to paging. What I know for sure is that malloc() is primarily meant for dynamic allocation at runtime, but I fail to see what paging has to do with it.
What I think (though I'm not sure): paging is meant for giving memory space to a process when it begins its execution, while malloc() is primarily meant for providing some bytes of heap memory during the execution of the process.
So can anyone tell me in what way malloc() is used in implementing the paging concept?
Paging is used to allow the addresses used by software to map to different addresses in the hardware, and to stop programs from being able to access each other's (or the kernel's) memory.
Your malloc implementation doesn't need to worry about paging, other than knowing that the memory it is allocating is mapped in the page tables of the app/kernel that the memory is being allocated for. In your kernel, you just need to make sure that malloc only uses memory that you've mapped in page tables for the kernel to access. In apps (I'm guessing that's a while away), the malloc implementation will be using memory it gets from the kernel (in the form of pages/frames), so there'll be no direct worry about paging there either.
Disclaimer: I'm not an expert - in fact I'm pretty new to this myself - so if I'm wrong, I apologize.
The only thing I can think of is that your malloc will generally make calls to the system to allocate pages for it to use.
Normally, the first time malloc is called it will make a kernel syscall to allocate pages; it will then sub-allocate those pages until it runs out, at which point it will call the kernel again.
So it must interface with your paging code, but nothing beyond that (AFAIK - I'm certainly not an expert, as I haven't ever written a malloc).
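The page-batching pattern described above can be sketched as a toy bump allocator. Everything here is invented for illustration: kernel_alloc_page() stands in for a real page-allocation syscall (simulated with a static arena), and toy_malloc() is nothing like a production malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

/* Stand-in for the kernel's page-allocation syscall: hands out
 * PAGE_SIZE chunks from a static arena. In a real OS this would be
 * a system call mapping fresh frames into the process. */
static _Alignas(16) uint8_t arena[16 * PAGE_SIZE];
static size_t arena_used = 0;

static void *kernel_alloc_page(void)
{
    if (arena_used + PAGE_SIZE > sizeof arena)
        return NULL;                      /* "kernel" is out of pages */
    void *page = &arena[arena_used];
    arena_used += PAGE_SIZE;
    return page;
}

/* Toy malloc: bump-allocate within the current page, and only go back
 * to the "kernel" when the page runs out. */
static uint8_t *cur_page = NULL;
static size_t cur_off = PAGE_SIZE;        /* forces a fetch on first call */

void *toy_malloc(size_t n)
{
    n = (n + 7) & ~(size_t)7;             /* round up to 8-byte alignment */
    if (n > PAGE_SIZE)
        return NULL;                      /* toy limit: one page max */
    if (cur_off + n > PAGE_SIZE) {        /* current page exhausted */
        cur_page = kernel_alloc_page();
        if (cur_page == NULL)
            return NULL;
        cur_off = 0;
    }
    void *p = cur_page + cur_off;
    cur_off += n;
    return p;
}
```

The first toy_malloc() call fetches a page from the "kernel"; subsequent small requests are carved out of the same page without any further calls.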
Malloc allocates a block of contiguous space from a heap. It deals with two concepts:
- The heap structure itself, that is, the list of blocks and holes over a contiguous space that it maintains.
- The contiguous space itself, the size of it, etc.
The first point is an issue purely to do with the internals of the malloc/free mechanism. Totally implementation dependent.
The second point requires some kernel interaction. If you want more space for your heap, you must ask the kernel for some. This is done by calling brk/sbrk, which alters the 'breakpoint' (the program break) between the heap and the no-man's-land separating the heap from the stack. Below is a piece of ASCII art illustrating the memory layout of a process.
Code: Select all
---------- TOP OF MEMORY
stack
----------
no-mans-land: write/read from here and you'll get SIGSEGV
---------- BREAKPOINT
heap
---------- .end
.data, .bss
----------
.text (executable code)
----------
So sbrk is called with one parameter: how much to alter that breakpoint by. A positive number gives you more available space; a negative number decreases the amount of space you have (free would use this). The internals of the sbrk system call allocate/free pages as required, and possibly page frames out/in as well. You don't need to worry about that at all: just call sbrk with the increase in the amount of space you need, and you're sorted.
(And I have written a malloc()/free() set of functions.)
JamesM
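JamesM's sbrk-based scheme can be sketched as a minimal first-fit allocator. This is a toy under stated assumptions: my_sbrk() simulates the real sbrk() with a static array, and the header layout, names, and first-fit policy are illustrative rather than any particular libc's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated program break: a real implementation would call sbrk(). */
static _Alignas(16) uint8_t heap[65536];
static size_t brk_off = 0;

static void *my_sbrk(ptrdiff_t inc)
{
    if (inc < 0 || brk_off + (size_t)inc > sizeof heap)
        return (void *)-1;               /* sbrk's traditional failure value */
    void *old_brk = &heap[brk_off];
    brk_off += (size_t)inc;
    return old_brk;
}

/* Every block carries a header recording its size and a free flag. */
struct hdr { size_t size; int free; };

void *mini_malloc(size_t n)
{
    n = (n + 7) & ~(size_t)7;            /* 8-byte granularity */
    /* First-fit scan over the blocks already below the break. */
    size_t off = 0;
    while (off < brk_off) {
        struct hdr *h = (struct hdr *)&heap[off];
        if (h->free && h->size >= n) {
            h->free = 0;
            return h + 1;                /* usable memory follows the header */
        }
        off += sizeof *h + h->size;
    }
    /* No hole big enough: push the break outward. */
    struct hdr *h = my_sbrk((ptrdiff_t)(sizeof *h + n));
    if (h == (void *)-1)
        return NULL;
    h->size = n;
    h->free = 0;
    return h + 1;
}

void mini_free(void *p)
{
    if (p != NULL)
        ((struct hdr *)p - 1)->free = 1; /* mark the hole; no coalescing */
}
```

mini_free() just marks the block's header, so the next mini_malloc() scan can hand the same block out again before resorting to my_sbrk(). A real implementation would also coalesce adjacent holes.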
@JamesM:
One problem: your explanation is implementation specific - it doesn't hold for all OSes (going by your explanation, my OS wouldn't be capable of using malloc at all). Your explanation makes many assumptions about the underlying design of the OS. Some OSes don't use a single value for the end of memory; in my OS, malloc makes a syscall requesting memory and the OS returns a pointer to it - and this memory may not be contiguous with previously allocated memory at all.
To free memory, free calls a different syscall, passing the pointer it received from the allocating call.
Internally, the OS uses a map (not a pointer to the end of the allocated heap) to determine which areas are available and which are in use (and virtual memory can be fragmented).
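For contrast with the sbrk approach, the map-based scheme might look roughly like the sketch below: the kernel tracks the address space as an ordered map of regions, each free or in use, instead of a single break pointer. The fixed-size array, the base address, and all names here are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel-side view: virtual address space tracked as an
 * ordered map of regions, each free or in use, instead of one break
 * pointer. Addresses are plain integers in this sketch. */
struct region { uintptr_t base; size_t len; int used; };

#define MAX_REGIONS 32
static struct region map[MAX_REGIONS] = {
    { 0x40000000, 1 << 20, 0 },          /* one big free region to start */
};
static int nregions = 1;

/* Allocate: first fit over the map, splitting a larger free region. */
uintptr_t region_alloc(size_t len)
{
    for (int i = 0; i < nregions; i++) {
        if (!map[i].used && map[i].len >= len) {
            if (map[i].len > len && nregions < MAX_REGIONS) {
                /* split: shift later entries to keep the map ordered */
                for (int j = nregions; j > i + 1; j--)
                    map[j] = map[j - 1];
                map[i + 1] = (struct region){ map[i].base + len,
                                              map[i].len - len, 0 };
                nregions++;
                map[i].len = len;
            }
            map[i].used = 1;
            return map[i].base;
        }
    }
    return 0;                            /* no region large enough */
}

/* Free: mark the region with this base address as available again. */
void region_free(uintptr_t base)
{
    for (int i = 0; i < nregions; i++) {
        if (map[i].base == base) {
            map[i].used = 0;
            return;
        }
    }
}
```

Because free regions can appear anywhere in the map, two successive allocations need not be contiguous, which is exactly the property described above. (A real kernel would also merge adjacent free regions on free.)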
JAAMan: Apologies. My kernel uses the same method as that outlined above, as does the linux kernel, I believe (don't quote me on that!)
Yes, the newlib and glibc implementations of malloc have a #define that tell it whether to use mmap or sbrk to obtain memory.
Apologies for the oversight, but that explanation is valid for many kernel implementations.
Probably the most confusing thing about malloc() explanations is that people tend to lump everything and the kitchen sink into that function.
As far as memory management is concerned, you need three distinct functionalities.
One, you need a memory manager, which is highly dependent on your environment. It might work on physical addresses without using the MMU at all, it might use the MMU to create virtual address spaces, it might do paging to disk. In the end, it provides the caller with some piece of memory that the caller can use. Examples for this level of functionality are sbrk() or mmap() as mentioned earlier. Keep this functionality separate, and don't call it "malloc()", because it isn't.
Two, you (probably) want to wrap that environment-dependent stuff into the syntax any C programmer is familiar with - call a function with the required size as parameter, receive a pointer to that memory in return (or a NULL pointer, source of much interesting debugging). Unfortunately, many people call this (kernel-space) function "malloc()" because that's what they are used to from doing user-level coding. It still isn't, so it's probably wise to call it kmalloc() or somesuch.
Three, your user-space API will most likely want to support C coding, so you need the genuine article, malloc(). You could simply use the kmalloc() you already have, but there's a problem: Your kmalloc() has direct access to the memory manager and its internals, while user-space code has to make a system call (unless you don't do memory protection). System calls are expensive, and calls to malloc() are frequent, often for only a couple of bytes at a time. So your user-space malloc() has additional bookkeeping to do: It will make a system call - to kmalloc() or sbrk() or mmap() or some_other_function() - requesting a larger hunk of memory, splice off smaller parts as requested, probably keeping returned pieces of memory in buckets so they can be passed out quicker next time around, and decide when it's time to return a large chunk to the system... look at Doug Lea's dlmalloc() explained for details.
Part one is elementary for your OS. Part two is comparatively simple. Part three can have a major impact on every application running on your system. The difference between a good and a bad user-space malloc() can easily outweigh any other "smart" optimization you made in your kernel to make it "really fast".
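Solar's third layer, the user-space malloc() that avoids system calls by caching, can be sketched with size-class buckets. All of this is invented for the example: the class sizes, the bucket_malloc()/bucket_free() names, and the arena standing in for kernel requests. (Real allocators also record the chunk size in a header rather than making the caller pass it to free.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space side sketch: small size classes, each with a bucket of
 * freed chunks, so most calls never reach the (simulated) kernel. */
#define NCLASSES 4
static const size_t class_size[NCLASSES] = { 16, 32, 64, 128 };

struct chunk { struct chunk *next; };    /* freed chunks form a list */
static struct chunk *bucket[NCLASSES];

static _Alignas(16) uint8_t sys_mem[32768];  /* stands in for kernel memory */
static size_t sys_off = 0;
int syscalls = 0;                        /* counts trips to the "kernel" */

static void *get_from_system(size_t n)
{
    syscalls++;
    if (sys_off + n > sizeof sys_mem)
        return NULL;
    void *p = &sys_mem[sys_off];
    sys_off += n;
    return p;
}

void *bucket_malloc(size_t n)
{
    for (int c = 0; c < NCLASSES; c++) {
        if (n <= class_size[c]) {
            if (bucket[c] != NULL) {     /* fast path: reuse a cached chunk */
                struct chunk *p = bucket[c];
                bucket[c] = p->next;
                return p;
            }
            return get_from_system(class_size[c]);
        }
    }
    return get_from_system(n);           /* large request: pass straight through */
}

/* Simplified: the caller passes the size back. A real allocator would
 * store it in a header next to the chunk instead. */
void bucket_free(void *p, size_t n)
{
    for (int c = 0; c < NCLASSES; c++) {
        if (n <= class_size[c]) {        /* cache the chunk for next time */
            struct chunk *ch = p;
            ch->next = bucket[c];
            bucket[c] = ch;
            return;
        }
    }
    /* large chunks would be handed back to the system here */
}
```

After the first trip to the "system", an allocate/free/allocate cycle in the same size class is served entirely from the bucket.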
Every good solution is obvious once you've found it.
Hi,
I don't know if you think this is a good or bad idea, but my kernel uses the same malloc as my user-space apps (I previously had a kmalloc() defined in my kernel).
Malloc has two variables: heapstart and heapend. If KERNEL is #defined, my sbrk-alike function calls a statically linked function to extend the heap; if not, a system call is used. You could do this with a variable rather than a define, but that would mean additional runtime checks, which would slow things down a bit.
I don't page in at allocate time; instead, my page fault exception handler checks the boundaries of the current heap and pages in with the appropriate permissions if necessary. The disadvantage of this method is that if you do the following:
Code: Select all
void *myvar = malloc(0x80000000);
fillwithdata(myvar);
it would be quite slow, because you have to page fault on every 4 KB boundary. The main advantage is that you are not paging in physical RAM unless it is really needed.
I know my rambling is quite random, but I hope it adds some thoughts!
Cheers,
Adam
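AJ's lazy page-in scheme can be sketched as below. This is a simulation under stated assumptions: the page table is a plain bitmap, heap_grow() stands in for the sbrk-alike, and page_fault_handler() models the exception handler's bounds check; none of these names come from a real kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096
#define MAX_PAGES 1024

/* Sketch of lazy page-in: the allocator only moves heap_end; physical
 * frames are attached by the page-fault handler on first touch. The
 * page table is simulated as a bitmap of "mapped" heap pages. */
uintptr_t heap_start = 0x10000000;
uintptr_t heap_end   = 0x10000000;       /* grown by the allocator, unmapped */

static bool mapped[MAX_PAGES];           /* index = page number within heap */
int frames_used = 0;                     /* frames actually committed */

/* The sbrk-alike: just extend the heap, mapping nothing yet. */
void heap_grow(size_t bytes)
{
    heap_end += bytes;
}

/* Called on a page fault at `addr`. Returns true if the fault was a
 * legal lazy-heap access and a frame was "mapped"; false means a
 * genuine SIGSEGV. */
bool page_fault_handler(uintptr_t addr)
{
    if (addr < heap_start || addr >= heap_end)
        return false;                    /* outside the heap: real fault */
    size_t page = (size_t)(addr - heap_start) / PAGE_SIZE;
    if (!mapped[page]) {
        mapped[page] = true;             /* attach a fresh frame here */
        frames_used++;
    }
    return true;
}
```

Growing the heap costs nothing up front; a frame is only "mapped" when its page is actually touched, and accesses beyond the heap still fault for real.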
AJ wrote: Malloc has 2 variables - heapstart and heapend. If #KERNEL is defined, my sbrk-a-like function calls a statically linked function to extend the heap. If not, a system call is used.
Ask yourself what would happen if someone did this:
Code: Select all
void *p1, *p2;
for ( size_t i1 = 0; i1 < SIZE_MAX; ++i1 )
{
    p1 = malloc( PAGESIZE * 2 );
    for ( size_t i2 = 0; i2 < SIZE_MAX; ++i2 )
    {
        p2 = malloc( 16 );
    }
    free( p1 );
    free( p2 );
}
AJ wrote: I don't page-in at allocate-time - my page fault exception handler checks the boundaries of the current heap and pages in with appropriate permission handlers if necessary.
Does this result in a proper NULL value returned by malloc() if not enough paging space is available to satisfy the request? Is the paging space flagged as reserved, so it doesn't get taken into account when another process does a large malloc()?
AFAIK even contemporary operating systems happen to return valid values from malloc() and then fail later on when page space is exhausted; it's just something you should be aware of rather than be surprised by.
Every good solution is obvious once you've found it.
Solar wrote: ...a good user-space malloc() would do a maximum of two system calls even for much more complex allocate / release patterns...
Yes - I need to think about this. I wonder if it would be possible to build the intelligence for reducing the number of system calls into sbrk() at the kernel end rather than the user-space end, i.e. return a sensible amount of heap space beyond what the app requested. That way, you could use whatever lib / language you liked for user-space apps and still have efficient (fast, not space-saving!) virtual memory management.
Solar wrote: Does this result in a proper NULL value returned by malloc() if not enough paging space is available to satisfy the request?
At present, my OS does the same as most others in that it returns a valid pointer. Thanks for the thought, though - this will definitely be something to look at in the future.
Sorry - I hope the above is not too OT - I feel like I've hijacked the thread!
Cheers,
Adam
AJ wrote: I wonder if it would be possible to build the intelligence for reducing number of system calls in to sbrk() at the kernel end rather than the user space end?
Bad idea. There is always an application designer with requirements too specific to be satisfied by what you did, who will want to replace your caching with some of his own, or with some other ready-made allocator (which is surprisingly common - think Boost Pool or Hoard). He can easily replace a user-space malloc() with stuff of his own, but if you integrate the caching at the sbrk() level, everyone is stuck with it.
(In fact, the C++ standard library actually employs the concept of allocators, i.e. objects doing the memory handling instead of using the default operator new(). Stacking two caching mechanisms on top of each other usually doesn't result in what you want.)
Every good solution is obvious once you've found it.