
General memory manager question

Posted: Sat Jun 10, 2006 7:08 am
by mr0b
Once I got the basic paging done, a bunch of memory-related questions started to pop up...

What if the kernel (or a userland app, for that matter) tries to access a region of memory that isn't currently mapped?
Should the GPF handler map it automagically and return to the code that caused the exception, or should the kernel map it with a call to map_page_blah() prior to accessing it?

Or perhaps a mixture of the two methods?

Cheers, mr0b

Re: General memory manager question

Posted: Sat Jun 10, 2006 8:03 am
by Ryu
In my not-so-humble opinion, I wouldn't map pages automatically, because that raises some problems. Take, for example, a misbehaving application that accesses regions outside those it was assigned and keeps going further out: the kernel would continuously map new pages for it. There is really no way to determine whether it is behaving correctly unless you provide certain APIs to do so. However, when a user allocates some region, not all of its pages have to be mapped up front.

Re: General memory manager question

Posted: Sat Jun 10, 2006 1:12 pm
by Brendan
Hi,
mr0b wrote:Should the GPF handler map it automagically, and return to the code that caused the exception, or should the kernel map it with a call to map_page_blah() prior to accessing it?

Or perhaps a mixture of the two methods?
I normally use a mixture of both methods. I have a flag in the executable file header that determines the default behaviour, and then one kernel API function to allocate real RAM and another function to allocate "pretend" RAM (or to set an area as "allocate on demand").

Allocate on demand is good for when you've got things like sparse arrays. For an example, imagine something like this:

Code: Select all

u16 find_least_frequent(u16 *source, u32 size) {
    u32 *counts;
    u32 i;
    u16 result = 0;
    u32 lowest_count;

    counts = AOD_calloc(65536 * sizeof(u32));
    for(i = 0; i < size; i++) {
        counts[ source[i] ]++;
    }

    lowest_count = 0xFFFFFFFF;
    for(i = 0; i < 65536; i++) {
        if(lowest_count > counts[i]) {
            lowest_count = counts[i];
            result = (u16)i;
        }
    }
    free(counts);
    return result;
}
This looks like a silly example, but I'm using something similar for some compression code...

Without allocation on demand you end up allocating 256 KB of RAM, but with AOD you only use the RAM if you have to - if the source data is small (or contains a lot of the same values) you might only need a few pages.

A more common example would be something like:

Code: Select all

#define MAX_THINGS   2048

typedef struct {
    unsigned int flags;
    char *name;
    int foo;
    float bar;
} MY_ENTRY;

MY_ENTRY *my_list_of_things[MAX_THINGS];
int total_things = 0;
I've seen this sort of thing done often enough, where "things" are created dynamically in pre-allocated space. How much of the ".bss" section is never touched? It depends on how many "things" are actually used at run time...

Now imagine trying to allocate a 256 MB buffer for something on a computer with 32 MB of RAM installed. In this case you can't actually allocate this much RAM without sending most of it to swap space immediately. Allocation on demand is an easy way to make this work (without having a huge amount of disk I/O while the swap space is filled with empty pages and then more disk I/O when they are accessed the first time).

Ryu is right - it's easy for buggy programs to allocate everything, but then it's easy enough for a buggy program to allocate everything anyway (try doing "malloc()" in an endless loop on a system with plenty of swap space). IMHO some form of limit is a good idea (regardless of whether allocation on demand is used or not)....

IMHO it's just easier to let the applications programmers decide which method is best for their code. ;)


Cheers,

Brendan

Re: General memory manager question

Posted: Sun Jun 11, 2006 10:35 am
by JAAman
@brendan

Without allocation on demand you end up allocating 256 KB of RAM, but with AOD you only use the RAM if you have to - if the source data is small (or contains a lot of the same values) you might only need a few pages.
this isn't necessarily true...

if you require the app to allocate memory ahead of time, the OS doesn't have to actually allocate any physical memory -- only virtual memory. physical memory is allocated on demand, but virtual memory only on request. this lets you trap wild pointers while still conserving memory in situations where you may not know how much you need (in theory, you could allocate the entire 2GB user space but only use a few KB -- and it would not use any more memory than allocating only a few KB would)

unless you are afraid of running out of virtual memory? I guess that could be a concern for some applications

Re: General memory manager question

Posted: Sun Jun 11, 2006 10:59 am
by Ryu
Brendan wrote: Ryu is right - it's easy for buggy programs to allocate everything, but then it's easy enough for a buggy program to allocate everything anyway (try doing "malloc()" in an endless loop on a system with plenty of swap space). IMHO some form of limit is a good idea (regardless of whether allocation on demand is used or not)....
Yes, once it's buggy, it's buggy; there's nothing more to say. I was looking at this from a different angle: if the application misbehaves by trying to read/write an address that is not yet mapped (where it shouldn't be doing r/w), then the kernel would be forced to physically allocate a page for that address. Whereas with "malloc" (which is partly what I mean by just using APIs), it does not have to actually allocate all the pages until a page boundary is crossed.

edit: Whoops didn't hit preview..

Anyways, the idea is that handing out the entire address space is vulnerable not only to buggy programs; it could also be exploited intentionally to attack kernel security. Allocating pages every time an application touches an unmapped region seems to me like a wildcard for bugs, and "malloc" is a bottleneck.