memory allocator ?

Sam111 · Post by **Sam111** » Sun Sep 19, 2010 1:40 pm

Hi , I am getting into writting memory allocators for my OS.

I am trying to understand how malloc , free , ...etc work.

From looking at some source code it seems that it is just allocating memory and keeping track of what chunks are already allocated and being used.

My problem is when you free a chunk of memory and try to allocate space for something that needs more space then the deallocated memory would eventually cause alot of fragmentation in memory.

Is their an equivalent to defragmenting a harddrive but for memory?
The problem I am running into is if I move sections of memory around I will lose the ptr references in my code. Where as harddrives defragmentation does not run into these issues.

I am not even sure if memory defragmentation is even possible if their are pointers pointing to
that space allocated.

For instance
if I do
char * ptr1 = malloc( ... ) ;
ptr2 = ptr1 ;
then start freeing things and defraging memory I would be screwed.
Unless I am missing an algorithm or method of how to do it without destroying references.

Also is their any best known memory allocator out their???
In my opinion I would think just doing linear memory allocation would be no more inferior to other types of allocation. Since all of them can eventually have fragmentation build up.

Any thoughts on somebody with experience with this would be great

Hangin10 · Post by **Hangin10** » Sun Sep 19, 2010 2:55 pm

What I do is upon freeing a chunk, concatenate it with however many free chunks come after it. My malloc also concatenates free chunks as it traverses the list, that way if it can't find a big enough free piece, it knows right away it needs more pages.

Sam111 · Post by **Sam111** » Sun Sep 19, 2010 3:26 pm

My main question is
Is it possible to write code to defragment memory and preserve all the references in code?

Is their a most efficient way of memory allocation? Faster in some case ,...etc etc
What does linux, windows , for memory allocation algorithms?

Is it just a simple linklist of allocated memory that malloc holds and free just deletes the chunk from the list?

Also regular c using function like malloc/free but in c++ they are built in new/delete couldn't I just get around writting memory allocators by using the new/delete keywords or will something go wrong with this?

(i.e using c++ instead of c for coding an OS)

Thanks.

Hangin10 · Post by **Hangin10** » Sun Sep 19, 2010 3:35 pm

Sam111 wrote:My many question is
Is it possible to write code to defragment memory and preserve all the references?

Some garbage collectors move chunks of memory around to maintain one free section and one or more in-use sections. I've never written a GC, but I imagine they would have to know about all the different types in play in order to know where pointers exist to update.

Also is their a most efficient way of memory allocation? Fast in some case ,...etc etc
What does linux, windows ,...etc use?

Most everything uses some kind of malloc, tagging free blocks and such as I described (afaik). Some OS allocate arrays of the various objects they use so that they don't need to use a potentially slower malloc again until they run out of the array. I forget what this is called though.

Is it just a simple linklist of allocated memory and free just deletes the chunk freed from the list.

Yes, just a linked list of chunks (when returning a chunk its just a pointer to the data after the linkedlist/header). However, the list can have all chunks in the list, just need a way to mark chunks used/free. And yes, that's at least 8 bytes of overhead for each chunk (pointer to next, size of chunk (bit 31 indicates free'ness), followed by chunk data).

Also regular c uses function like malloc/free but in c++ they are built in new/delete couldn't I just get around writing memory allocators by using the new/delete keywords or will something go wrong with this?

new and delete just call an equivalent of malloc/free and then call the constructor, so no, that doesn't get you anything for free.

Sam111 · Post by **Sam111** » Sun Sep 19, 2010 3:57 pm

new and delete just call an equivalent of malloc/free and then call the constructor, so no, that doesn't get you anything for free.

Ok, so what would happen if I started using c++ and had new/delete commands in my code.
And told the g++ compiler to use no built in libraries / functions. Would I not beable to use new/delete. Because new/delete are not functions I would think it would work?

Question 2

So how does a windows or linux os do memory allocation.
Does it use just a simple malloc/free algorithm or does it implement a garbage collector?
I would think if it implemented a gc it would be to hard/time consuming to move all the pointers around from various programs when defraging memory?
Unless ofcourse they set the gc to run not very often or they wasn't many malloc/free calls in the code.

But their is so many trade off's under different conditions that it would be impossible to write a best case memory allocator for all cases.

I am wondering how I can write a good/efficent memory allocator which runs fast in most cases.
Maybe the linklist linear memory allocation is the way to go?
Cann't think of any other way that would be more efficient or much more efficient to go thru the trouble of more complex coding.

NickJohnson · Post by **NickJohnson** » Sun Sep 19, 2010 4:20 pm

Sam111 wrote:Ok, so what would happen if I started using c++ and had new/delete commands in my code.
And told the g++ compiler to use no built in libraries / functions. Would I not beable to use new/delete. Because new/delete are not functions I would think it would work?

new and delete are basically functions with a special syntax, and definitely require a heap implementation. They do not work for free.

As far as memory allocator algorithms go, the simplest usable one that I use is this:

An array of "buckets" (linked lists of free pieces of memory) is created, where the size of the free blocks in each bucket is a different power of 2. When a request for allocation is made, the allocator checks if the corresponding bucket has any free blocks, returns one if it does, and if it doesn't, extends the heap enough to add one block to it, then returns that. Freeing goes into the bucket of the correct size. This scheme doesn't free memory completely, gets fragmented irreversibly with a lot of different sized allocations (wasting memory on non-power-of-2 allocations as well), but is pretty fast and should work for anything that's not too demanding and doesn't use lots of strings.

Owen · Post by **Owen** » Sun Sep 19, 2010 4:33 pm

C++ new and delete are the functions

Code: Select all

void* operator new(size_t);
void operator delete(void*, size_t);

There is also placement new (Construct object without allocation), commonly defined in header <new> and generally defined as follows:

Code: Select all

static inline void* operator new(void* p, size_t)
{
    return p;
}

You can define your own allocators, e.g.

Code: Select all

typedef void* mallocFn(size_t sz);
typedef void* freeFn(void* b);
static inline void* operator new(mallocFn* allocator, size_t sz) { void* p = allocator(sz); if(p) return p; throw std::bad_alloc(); }
static inline void  operator delete(void* block, freeFn* deallocator, size_t sz) { deallocator(block); }

C++ construction and destruction is rather complex

Note: This is by memory; I do not guarantee correctness

lemonyii · Post by **lemonyii** » Sun Sep 19, 2010 7:11 pm

new and delete just call an equivalent of malloc/free and then call the constructor.

are you sure?
i'm preparing to write a malloc for my kernel. and do you mean if i have my own malloc/free and then i can use new/delete without overloading them?

You can define your own allocators, e.g.

Code: Select all

typedef void* mallocFn(size_t sz);
typedef void* freeFn(void* b);
static inline void* operator new(mallocFn* allocator, size_t sz) { void* p = allocator(sz); if(p) return p; throw std::bad_alloc(); }
static inline void  operator delete(void* block, freeFn* deallocator, size_t sz) { deallocator(block); }

so, it is not of the same type as the normal one. is it still a overload or how could it be identified?
new int[100] , which does it call?
and i'm wondering if i can just replace the "malloc" in the normal "new" with my one? and can i then use new/delete in kernel just like in applications?
i really need the details. thank you!

gravaera · Post by **gravaera** » Sun Sep 19, 2010 8:07 pm

Hi:

Malloc() and free() are functions defined by the C stdlib. They are meant to portably provide dynamic memory across platforms. C++'s new() and delete() have the same aim. The difference is that in C, allocated objects are not constructed. C++ objects are. Consider the following:

Code: Select all

class foo
{
public:
   int bar;
   foo(){ bar = 5; };
};

int main(void)
{
   foo *f;

   f = malloc(sizeof(foo));
   if (foo != NULL) {
      // Output is undefined. Constructor was not run.
      printf("Value of foo.bar: %d.\n", foo.bar);
      free(f);
   };

   f = new foo;
   if (foo != NULL) {
      // Will output 5.
      printf("Value of foo.bar: %d.\n", foo.bar);
      delete f;
   };
}

The difference is that new() causes the constructor for the type to be called on the object after the allocation succeeds. So:

Code: Select all
new int[100]
What function call is implied here?

void *new[](size_t size).

...And I'm wondering if I can just replace the "malloc" in the normal "new" with mine? Can I then use new/delete in my kernel just like I do with applications?

As long as you have a backing heap implementation and malloc/new call on that, the answer is yes.

I'm preparing to write a malloc() implementation for my kernel. Did you mean that if I have my own malloc/free, I can use new/delete without overloading them?

You're writing a kernel. Operators new(size_t), new[](size_t), new(size_t, void *), and new[](size_t, void *) are all undefined. You aren't going to be overloading them. You'll be defining them. If you have an underlying heap implementation, you should make the standard library functions/C++ allocation operators call that heap implementation.

Is there an equivalent to defragmenting a harddrive for a heap implementation?

There is no "equivalent" which you can just copy and paste. You'll have to design and implement a heap algorithm. Depending on how you design your heap, you may or may not need to defragment.

For example, if I do the following:
Code: Select all
char * ptr1 = malloc( ... ) ;
ptr2 = ptr1 ;
Then start freeing things and defragging memory I would be screwed.

OS/X for the Macintosh is known to have a "handle" system for memory along with its normal heap, where you dereference memory via APIs that lock off the memory area. It allows the underlying implementation to make assumptions when trying to manage memory. That allows the underlying management code to do whatever it wants.

Is it possible to write code to defragment memory and preserve all the references in code?

Your heap will generally be fragmented. You want to keep unallocated blocks tight together. But there is no particular need to keep allocated blocks managed. They're best left alone so the program can use its memory without interference. What you want to do is make sure that when *freeing*, you concatenate memory blocks, much like the person above suggested.

Also, @Sam111, "They're" and "their", *there* is a difference. Learn to differentiate and know when to use each of those.

--This was interesting.
gravaera.

Sam111 · Post by **Sam111** » Sun Sep 19, 2010 9:00 pm

Ok, but now I am trying to figure out what the best place I should place the starting of the heap at in memory and how big I should make it?

I have always wondered is their an optimal size for a stack or heap.
Or what size / location should I place the start of the stack and heap at in memory for my os.

Their must be better size's/locations for different circumstances .

Or reasons to place it in a certain location.... max/min size ,...etc

Any input would be great on this

What does microsoft or linux or mac use for size's for their os's and why.

NickJohnson · Post by **NickJohnson** » Sun Sep 19, 2010 9:09 pm

Well, it doesn't matter where the heap is, as long as it's big enough (RAM is called RAM for a reason); it's size is obviously somehow proportional to how much stuff will be in it: even on the 32 bit x86, there is plenty of virtual memory, so it would fine even to greatly overestimate on the size. Stacks are a little harder to predict, but you can usually just leave space at the top of the heap for them to grow down into as the heap grows up; if this is not possible, a couple of (virtual) megabytes for the stack should be more than enough for even large applications.

Sam111 · Post by **Sam111** » Sun Sep 19, 2010 9:30 pm

ok I have 4GB of memory in my x86 32 bit intel machine.

If I place my stack at 0x1000000 and place my heap at
0x2000000.

Their is still possibility for the heap and the stack to overlay. How can I prevent this or what would be a typical difference in the |starting stack - starting heap| that would be very save not to ever overlap.

Where in memory cann't I place an Os, stack , heap ,...etc?

I know their are memory mapped input/output for vga ,...etc . Is their any particular memory other then VGA memory that must be avoided....

Because this could effect my decision in where to place my os , stack ,heap ,...etc at and the size to use.

Thanks

Hangin10 · Post by **Hangin10** » Sun Sep 19, 2010 9:50 pm

Are you not using paging? If you are using paging, then you don't have to avoid anything (or you "avoid" it by just not mapping it anywhere). If you are not using paging, then yes, you'll have to use the memory map provided by the BIOS to figure out what RAM is safe to use.

As an example of layout for a kernel using paging, my heap has 256MB of address space starting at 0xD0000000. My kernel stacks are between 0xF0000000 and 0xFFC00000.

Yes, in the case where stack area is directly after heap area (and grows down towards the heap), then if they end up colliding, one is simply not large enough. In my kernel, a kernel stack is only 8KB and if the kernel at any point uses more stack than that, something really horrible has happened (I once had a silly typo compile into an infinite recursion).

lemonyii · Post by **lemonyii** » Sun Sep 19, 2010 10:37 pm

ohhh....
to avoid the overlay, you may use paging machinism.
i.e., you put stk top at 2G, which grows down, and put you heap at 1G, which grows up.
then you may mark the page entry at 1.5G as NOT_PRESENT, and mark other fields, advising the AVL field, as some magic number which you can identfy as the "gap".
so whichever of them grow to the gap, your page fault handler can know the program overflowed, just kill it.
i'm writing a 64bit os, i put app.text at 32T, app.data at 64T, the heap follows, and app.stacktop at 128T. so i dont need this at present,

because i dont't think my os have the honor to run on a machine have more than 64T physic memory and 128T swap partition

all the address i refered is linear address.

Creature · Post by **Creature** » Mon Sep 20, 2010 3:57 am

About new and delete, you can implement both of them very easily, AFAIK there is indeed some 'specialty' involved with both operators, but I don't believe we should explictely take note of this. This is about my implementation (I use classes, structures and lots of things that use new, new[], delete and delete[] and they've all worked fine since the beginning, so I guess [but I'm not sure] they are right):

Code: Select all

void operator new(size_t size) throw() { return malloc(size); }
void operator new[](size_t size) throw() { return malloc(size); }
void operator delete(void *p) throw() { free(p); }
void operator delete[](void *p) throw() { free(p); }

The throw() statements may not really be necessary since I'm not using exceptions anyway, but it's always nice to have them there just in case. If I remember correctly, operator new not only allocates memory, but it also makes sure that the constructor of the allocated object is called. operator new[] does the same, but calls the constructor of each allocated object in the range. operator delete and delete[] then call destructors for objects and free the memory. Don't worry about 'size' in new[], doing this should work just fine and give the right size:

Code: Select all

int *p = new int[50]; // Should call operator new[] with a 'size' of sizeof(int) * 50 = usually 200.

You may wonder "why not just use malloc". The reason is the following, imagine a case like this:

Code: Select all

int *p1 = new int; // Will work fine, no problem.
int *p2 = (int *) malloc(sizeof(int)); // No problems here, either.

std::string *s1 = new std::string; // Should work fine, AFAIK, because the constructor is called.
std::string *s2 = (std::string *) malloc(sizeof(std::string)); // This may give you a lot of problems, because this is not a POD.

I believe the C++ operators are mostly important for non-PODs (Plain Old Datatypes). Typical structures containing only member variables (like in C) and PODs have no problems with malloc, free. Even memcpy will work fine on them. A grave mistake to make is to use memcpy on custom classes that require constructors/destructors like your own Vector class. It can cause all sorts of problems. Although a memcpy would be faster, you can't just memcpy non-PODs because in a traditional for-loop, their copy constructors will be called one by one and they will be initialized.

If I'm wrong on any of this, feel free to correct me.

OSDev.org

memory allocator ?

memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?

Re: memory allocator ?