Heap Management

Question about which tools to use, bugs, the best way to implement a function, etc should go here. Don't forget to see if your question is answered in the wiki first! When in doubt post here.
ChristianF
Member
Member
Posts: 79
Joined: Sun Jun 10, 2007 11:36 am

Heap Management

Post by ChristianF »

Hi there,
i am just working on my Heap Manager.
I am implementing it with a simple linked list an want to know your opinion about my code. Also I don't know if it is secure implemented and i have to test it too, but it should work. ;)

Here is the existing code of my Heap Manager:

Code: Select all

/* Create a new heap */
heap_t *create_new_heap(unsigned int start, unsigned int end, unsigned int max, unsigned int supervisor, unsigned int readonly)
{
	heap_t *new_heap	= NULL;
	heap_block_t *block	= NULL;

	/* Need to do some security checks right here */

	/* place new_heap at start */
	new_heap = (heap_t *)start;
	memset(new_heap, 0, sizeof(heap_t));

	/* place one element behind the heap */
	block = (heap_block_t *)start + sizeof(heap_t);
	memset(block, 0, sizeof(heap_block_t));

	/* Save new start position */
	start += sizeof(heap_t);
	start += sizeof(heap_block_t);

	/* set default heap management informations */
	new_heap->start_address	= start;
	new_heap->end_address	= end;
	new_heap->max_address	= max;
	new_heap->supervisor	= (supervisor)?true:false;
	new_heap->readonly		= (readonly)?true:false;

	/* Initialise Heap Management List with one big free block */
	new_heap->first_block	= block;
	new_heap->last_block	= block;

	/* Initialise the first big block */
	block->used = false;
	block->size = end - start;

	/* At last return the new heap */
	return(new_heap)
}
/*
 * Function searches the heap for a free block
 */
void *heap_alloc(unsigned int size, heap_t *heap)
{
	heap_block_t *block		= NULL;
	heap_block_t *new_block	= NULL;
	unsigned int local_size	= 0;

	/* Check wheather heap is allocated or not */
	if(heap == NULL)
		return(NULL);

	/* Check for a proper initialisation of heap */
	if(heap->first_block == NULL && heap->last_block == NULL)
		return(NULL);

	/* Don't allocate a block with 0 bytes */
	if(size == 0)
		return(NULL);

	/* loop through heap and check each element which is free if one free block
	 * is greater then the wanted size, split it */
	
	for(block = heap->first_block, local_size = block->size;
		block != NULL;
		local_size += (sizeof(heap_block_t) + block->size), block = block->next)
	{
		if(block->used)
			continue;
		if(block->size < size)
			continue;

		/* Something like this will never happen ;) */
		if(block->size == size)
		{
			/* Set block as used */
			block->used = true;
			/* return the address */
			return(heap->start_address + local_size);
		}
		else
		{
			/* Save the address of the new block */
			new_block = (heap_block_t *)(heap->start_address + local_size) + sizeof(heap_bock_t) + size;
		
			/* Attach new block to the list */
			heap->last_block->next = new_block;
			heap->last_block = new_block;

			/* Save block Informations */
			new_block->used = false;
			new_block->size = block->size - size;

			return(heap->start_address + local_size);
		}
	}

	return(NULL);
}

So now there is the Heap Management Structure and the structure for a block:

Code: Select all

typedef struct heap_block heap_block_t;
typedef struct heap_block
{
	bool used;
	unsigned int size;
	heap_block_t *next;
} heap_block_t;

typedef struct heap
{
	unsigned int start_address;
	unsigned int end_address;
	unsigned int max_address;
	bool supervisor;
	bool readonly;

	heap_block_t *first_block;
	heap_block_t *last_block;
} heap_t;

Cheers Christian
42
ru2aqare
Member
Member
Posts: 342
Joined: Fri Jul 11, 2008 5:15 am
Location: Hungary

Re: Heap Management

Post by ru2aqare »

OdinPG wrote:Hi there,
i am just working on my Heap Manager.
I am implementing it with a simple linked list an want to know your opinion about my code. Also I don't know if it is secure implemented and i have to test it too, but it should work. ;)

Here is the existing code of my Heap Manager: ...
Cheers Christian
The code seems fine. However, there are some things that are not clear to me. What do the supervisor and readonly bits do? There is no documentation for them. Also using the local_size variable to calculate the start of a new block could be done with some casting, eliminating the need for the local variable.

Performance-wise, a linked list is a bad choice. A binary tree is much faster (although somewhat more difficult to implement). When you have N blocks allocated, allocating or freeing another block requires O(N) operations for the linked list, and O(logN) operations for a binary tree. When the heap is really fragmented, the speed difference will be measurable.
ChristianF
Member
Member
Posts: 79
Joined: Sun Jun 10, 2007 11:36 am

Re: Heap Management

Post by ChristianF »

ru2aqare wrote:
OdinPG wrote: The code seems fine. However, there are some things that are not clear to me. What do the supervisor and readonly bits do?
Hmmm...
It would be needed if the heap is completely in use and have to be expanded.
So if a new page is needed, it will be allocated with this settings. It is in my opinion needed when i create an extra heap-pool for my System Calls, e.g.! There i would set supervisor to false and readonly to true, so that the "System Call"-area is readable from Userspace.
ru2aqare wrote:
OdinPG wrote: Performance-wise, a linked list is a bad choice. A binary tree is much faster (although somewhat more difficult to implement). When you have N blocks allocated, allocating or freeing another block requires O(N) operations for the linked list, and O(logN) operations for a binary tree. When the heap is really fragmented, the speed difference will be measurable.
I will have a look at Binary Trees. Do you have an example for this?
I will search the internet, but maybe you have one right there :wink:

Cheers Christian


*EDIT*
Ok...
I just found a little example. I will look, how binary trees work and try to implement it :D
Here is what I found with google:
http://www.macs.hw.ac.uk/~rjp/Coursewww/Cwww/tree.html
42
ru2aqare
Member
Member
Posts: 342
Joined: Fri Jul 11, 2008 5:15 am
Location: Hungary

Re: Heap Management

Post by ru2aqare »

OdinPG wrote: Hmmm...
It would be needed if the heap is completely in use and have to be expanded.
So if a new page is needed, it will be allocated with this settings. It is in my opinion needed when i create an extra heap-pool for my System Calls, e.g.! There i would set supervisor to false and readonly to true, so that the "System Call"-area is readable from Userspace.
The paging mechanism already provides a way for making memory read-only to user code, but read-write to kernel code. I don't see why you would need an extra bit for this, which can easily be ignored by malicious user code. This is assuming you execute user code in ring 3 - if you execute user code in ring 0, it already can do whatever it wants to the CPU and memory.
OdinPG wrote: I will have a look at Binary Trees. Do you have an example for this?
I will search the internet, but maybe you have one right there :wink:

Cheers Christian
For a fairly complete and bug-free AVL tree implementation, take a peek at the attached source. It is actively used in my kernel. Of course your could use red-black trees (which have are faster for lookups, but perform worse when doing insertion or deletion), or any other binary tree (even B-trees could be used). The code is based on some tutorial I saw floating around on the net a few years ago. I implemented a test version of the code in C#, and after having completely understood what is going on, I re-implemented the code in C. (If I remember correctly, I found only one or two bugs after that - another reason why managed languages like C# and Java can be useful.)
In my kernel there are actually two types of heap managers: a very basic, best-fit allocator that uses linked lists (like your code) and an AVL-tree based heap manager. The basic allocator is used only by the boot loader, and by the kernel when there is a kernel panic. The other manager is used otherwise.
The binary trees are used to group blocks (both allocated and free blocks) of similar sizes together. This is done so that a request for example 30 bytes will search the tree containing blocks larger than 16, but smaller than 32 bytes. A request for 34 bytes will search the tree containing blocks larger than 32, but smaller than 64 bytes, and so on. If a request for example 30 bytes cannot be satisfied by allocating a block of 32 bytes, then the tree containing blocks up to 64 bytes is searched, then the next tree, and so on. Also on releasing blocks of memory, adjacent free blocks are united into a larger one. This is done so that the heap doesnt become too fragmented.
Attachments
avl_tree.rar
AVL tree implementation.
(4.98 KiB) Downloaded 227 times
ChristianF
Member
Member
Posts: 79
Joined: Sun Jun 10, 2007 11:36 am

Re: Heap Management

Post by ChristianF »

ru2aqare wrote:
OdinPG wrote: Hmmm...
It would be needed if the heap is completely in use and have to be expanded.
So if a new page is needed, it will be allocated with this settings. It is in my opinion needed when i create an extra heap-pool for my System Calls, e.g.! There i would set supervisor to false and readonly to true, so that the "System Call"-area is readable from Userspace.
The paging mechanism already provides a way for making memory read-only to user code, but read-write to kernel code. I don't see why you would need an extra bit for this, which can easily be ignored by malicious user code. This is assuming you execute user code in ring 3 - if you execute user code in ring 0, it already can do whatever it wants to the CPU and memory.

It isn't for additional protection. It is like this:
I initialise the heap with say 1 MB. Now these Memory is completely in use and it is needed to expand it. So we need some new pages.
The supervisor and readonly flags are for new pages allocated for this heap, when the Heap is completely in use.

Cheers Christian

*EDIT 1*
I took a look at your code and have a little Question.
If I would use this code for my heap, where did I know, which address is with which node?
Are you sorting it with an id or address or how do you handle this?

I don't want to use your code, because I haven't used trees until now, so it is better if i do it myself. :wink:

*EDIT 2*
Okay.
I understand now how these binary trees are working (wikipedia helps a lot). The only difficult thing is to rebalance the Tree, but i think i will get this.

*EDIT 3*
So now i have made an structure which looks like this:

Code: Select all

typedef struct tree_node tree_node_t;
typedef struct tree_node
{
    /*
     * I actually don't know where the tree will be used
     * so i use a void pointer, because then it can be
     * used everywhere.
     */
    void *pv_data;

    /*
     * The balance-factor should be 1, 0 or -1 if it contains
     * something different the tree needs to get rebalanced.
     */
    unsigned short balance;

    /* two pointer to the sub trees */
    tree_node_t *left, *right;
} tree_node_t;
In my new Heap-Structure i added a pointer which is the root-node:

Code: Select all

typedef struct heap
{
	unsigned int start_address;
	unsigned int end_address;
	unsigned int max_address;
	bool supervisor;
	bool readonly;

	/* May Be this should be put into an extra structure */
	unsigned int tree_height;
	tree_node_t *root_tree;
} heap_t;
So when I create a new heap I place the heap-structure at the startaddress like it have been done above.
Further more I have an structure for a memory block, which looks like this:

Code: Select all

typedef struct heap_block
{
	bool used;
	unsigned int size;
} heap_block_t;
Now when I want to allocate for example 12 Bytes, which is saved in size, I have to add the size of the tree_node_t and heap_block_t, so that size gets calculated like this:

Code: Select all

size += (sizeof(heap_block_t) + sizeof(tree_node_t));
Is this right?
The Structure seems to be the same which is used in other trees, because all have a left and a right element. The difference is only the Field "balanced".
42
ChristianF
Member
Member
Posts: 79
Joined: Sun Jun 10, 2007 11:36 am

Re: Heap Management

Post by ChristianF »

Hi,
I have a little Question.
Is it good to order the vfl-tree with the address stored in "void *pv_data;"?
pv_data contains the start-Address of a memory Block, so I think it is not wrong sorting with this criteria.

I am asking because if this is not a very good way, so I have to look for another one.
The Problem is, that I want to use the tree in some other parts of my kernel too.

Cheers Christian
42
ru2aqare
Member
Member
Posts: 342
Joined: Fri Jul 11, 2008 5:15 am
Location: Hungary

Re: Heap Management

Post by ru2aqare »

OdinPG wrote:Hi,
I have a little Question.
Is it good to order the vfl-tree with the address stored in "void *pv_data;"?
pv_data contains the start-Address of a memory Block, so I think it is not wrong sorting with this criteria.

I am asking because if this is not a very good way, so I have to look for another one.
The Problem is, that I want to use the tree in some other parts of my kernel too.

Cheers Christian
Generally, yes. You can order the tree nodes on anything, as long as the ordering is consistent. Usually the "void* data" field is used somehow. (and it's AVL-tree, not vfl-tree).

My implementation, however, does this another way. There is no "data" field in the tree node structure (like "void* this_field_points_to_data"). The tree node structure itself is meant to be included in the structure that you want to store in the AVL-tree. Like

Code: Select all

typedef struct SomeDataToStoreInATreeNode
{
  // The tree node structure is included here.
  AvlNode Node;
  
  // Other fields come here...
  int MyData;
} ...;
This way, you don't allocate two blocks of memory for storing one data item in the tree (the tree node structure and the data). This reduces heap fragmentation and improves speed.
ChristianF
Member
Member
Posts: 79
Joined: Sun Jun 10, 2007 11:36 am

Re: Heap Management

Post by ChristianF »

ru2aqare wrote:(and it's AVL-tree, not vfl-tree).
I meant that, but wrote the wrong. :oops:

When I would implement it like your way, when do I know which node have to be changed?
This means that I have to write every time I use the Tree in an different part of my kernel the function for Balancing the tree too, because of the different Structures.
Is this right?

Cheers Christian
42
ru2aqare
Member
Member
Posts: 342
Joined: Fri Jul 11, 2008 5:15 am
Location: Hungary

Re: Heap Management

Post by ru2aqare »

OdinPG wrote:
When I would implement it like your way, when do I know which node have to be changed?
This means that I have to write every time I use the Tree in an different part of my kernel the function for Balancing the tree too, because of the different Structures.
Is this right?

Cheers Christian
You don't need to. Take for example the following two structures:

Code: Select all

typedef struct StructureA
{
  // Allow structure to be put in a tree.
  AvlNode Node;

  // Data.
  uint SomeData;
} ...;

typedef struct StructureB
{
  // Allow structure to be put in a tree.
  AvlNode Node;

  // Data.
  void* SomeData;
} ...;
Both of them have the AVL tree node structure as a field. So you can do

Code: Select all

StructureA* a = ...;
StructureB* b = ...;

AvlInsert(&Tree1, &a->Node);
AvlInsert(&Tree2, &b->Node);
The AVL tree code only knows about the AVL node structure. It doesn't matter if there are fields after or before the node structure. You could even do

Code: Select all

typedef StructureC
{
  // Data.
  void* Data;

  // First node.
  AvlNode Node1;

  // Second node.
  AvlNode Node2;
} ...;
and insert the same structure into two trees at the same time. For example you could sort the same data according to two
different criteria. Though I don't know if anybody actually does this, but it is possible.

EDIT: if you are referring to the function that compares two tree nodes to decide the ordering of the nodes in the AVL tree, then yes, you need to write a different function for each structure. In my experience, however, most of these functions will be as simple as

Code: Select all

int CompareStructureA(StructureA* itemA, StructureA* itemB)
{
  if (a->Data < b->Data)
    return -1;
  if (a->Data > b->Data)
    return +1;
  else
    return 0;
}
jal
Member
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Re: Heap Management

Post by jal »

ru2aqare wrote:a very basic, best-fit allocator
I remember reading a while ago that first-fit is better than best-fit. Cannot remember the reasoning though...


JAL
ru2aqare
Member
Member
Posts: 342
Joined: Fri Jul 11, 2008 5:15 am
Location: Hungary

Re: Heap Management

Post by ru2aqare »

jal wrote:
ru2aqare wrote:a very basic, best-fit allocator
I remember reading a while ago that first-fit is better than best-fit. Cannot remember the reasoning though...


JAL
I can only assume that first-fit would be faster. Best-fit has to examine all the blocks in the heap on each allocation, first-fit does not have to do so. However, this allocator is only used by my boot loader, and when the kernel experiences a panic (or BSOD). In both cases, performance doesn't matter that much.
User avatar
AJ
Member
Member
Posts: 2646
Joined: Sun Oct 22, 2006 7:01 am
Location: Devon, UK
Contact:

Re: Heap Management

Post by AJ »

ru2aqare wrote:Best-fit has to examine all the blocks in the heap on each allocation...
One way to reduce the time this takes is to order the blocks (in a linked list) by size rather than base address. Of course, you then have to insert blocks in the correct place but that should still result in less overhead than searching the entire heap for a best fit on each allocation.

Cheers,
Adam
ChristianF
Member
Member
Posts: 79
Joined: Sun Jun 10, 2007 11:36 am

Re: Heap Management

Post by ChristianF »

Humm....
I write programs in C for several years now, but there is one thing i found in that code, which I've never done before.
This is the line of Code:

Code: Select all

typedef ComparisonResult (*pAvlCompareCallback)(pcAvlNode ItemA, pcAvlNode ItemB, pVoid Param);
It defines a new type that is clear to me. The line could also looks like this:

Code: Select all

typedef ComparisonResult (*pAvlCompareCallback)(AvlNode *ItemA, const AvlNode *ItemB, void *Param);
It seems to be a pointer, with the Name pAvlCompareCallback, to a function, which must have those three Parameters and a pointer as returnvalue.
Is that right?

Cheers Christian


EDIT:
Sorry for my stupid Question. I understood it right now.
It is a pointer to a Function which must have those tree parameters. The Return-Value is the enum ComparisonResult.
42
ru2aqare
Member
Member
Posts: 342
Joined: Fri Jul 11, 2008 5:15 am
Location: Hungary

Re: Heap Management

Post by ru2aqare »

OdinPG wrote:Humm....
I write programs in C for several years now, but there is one thing i found in that code, which I've never done before.
This is the line of Code:

Code: Select all

typedef ComparisonResult (*pAvlCompareCallback)(pcAvlNode ItemA, pcAvlNode ItemB, pVoid Param);
It defines a new type that is clear to me. The line could also looks like this:

Code: Select all

typedef ComparisonResult (*pAvlCompareCallback)(AvlNode *ItemA, const AvlNode *ItemB, void *Param);
It seems to be a pointer, with the Name pAvlCompareCallback, to a function, which must have those three Parameters and a pointer as returnvalue.
Is that right?

Cheers Christian
Indeed it is a pointer to a function. Don't let the lots of 'p' prefixes get to you. I have programmed in Pascal for many years before moving to C, and picked up the habit of prefixing any pointer type with a p (instead of typing out the *).
ChristianF
Member
Member
Posts: 79
Joined: Sun Jun 10, 2007 11:36 am

Re: Heap Management

Post by ChristianF »

Okay.
I had no time until now, so I worked only a little bit.
My new implementation looks like this:

Code: Select all

typedef struct avl_node avl_node_t;
typedef struct avl_node
{
    avl_node_t *parent;

    avl_node_t *left;
    avl_node_t *right;

    unsigned int balance;

    unsigned int is_root;
} avl_node_t;
/*
 * Function inserts a new Node
 */
void insert_avl_node(avl_node_t *root, avl_node_t *node, unsigned int num_parameters_compare);
/*
 * Function compares to Integer Values.
 * It returns 0 if param1 is greater then param2 and 1 if param2 is above param1
 */
unsigned int compare_avl_node(unsigned int *param1, unsigned int *param2);
/*
 * Function rebalance the tree if needed
 */
void rebalance_avl_tree(avl_node_t *root);
 
 
 
 
 
 
void internal_insert_avl_node(avl_node_t *root, avl_node_t *node, unsigned int direction, unsigned int num_parameters_compare)
{
    unsigned int new_direction = -1;

    if(root == NULL)
        panic("Error: root is NULL!");
    if(node == NULL)
        panic("Error: node is NULL!");

    switch(direction)
    {
        /* add node on the left side */
        case 0:
        {
            if(root->left == NULL)
                root->left = node;
            else
            {
                new_direction = compare_avl_node(   node - num_parameters_compare,
                                                    root->left - num_parameters_compare);
                insert_avl_node(root->left, node, new_direction);
            }
        } break;
        /* add node on the left side */
        case 1:
        {
            if(root->right == NULL)
                root->right = node;
            else
            {
                new_direction = compare_avl_node(   node - num_parameters_compare,
                                                    root->right - num_parameters_compare);
                insert_avl_node(root->right, node, new_direction);
            }
        } break;
        default:
            panic("Error: direction have to be 0 or 1!");
    }
}

void insert_avl_node(avl_node_t *root, avl_node_t *node, unsigned int num_parameters_compare)
{
    if(root == NULL)
        panic("Error: root is NULL!");
    if(node == NULL)
        panic("Error: node is NULL!");

    /* Get the direction (0 or 1) */
    unsigned int direction = compare_avl_node(  root - num_parameters_compare,
                                                node - num_parameters_compare);

    /* Insert the new node */
    internal_insert_avl_node(root, node, direction, num_parameters_compare);

    /* Rebalance the changed tree */
    rebalance_avl_tree(root);
}

unsigned int compare_avl_node(unsigned int *param1, unsigned int *param2)
{
    unsigned int retval = ((*param1)>(*param2))?1:0;
    return( retval );
}

void rebalance_avl_tree(avl_node_t *root)
{
    /* ToDo */
}

I do it like this, because i only want to compare integers, for example adresses, process ids, ...
There is may be one thing i have to explain:
num_parameters_compare says how much parameters are until the comparison value.
A Little Example:

Code: Select all

typedef struct mem_block
{
    unsigned int address;
    unsigned int size;
    avl_node_t *node;
} mem_block_t;
So when i want to compare the adresses I have to write the following:

Code: Select all

unsigned int address = *(node - 2);

What is your opinion about this implementation? Is it good or bad?

Cheers Christian
42
Post Reply