A Memory Manager to help developers

RaffoPazzo · Post by **RaffoPazzo** » Thu Apr 14, 2011 6:21 am

Although I'm still far from that point, I'm thinking about the Memory Manager and the heap allocation policy my OS should provide.

Requirement:

I would like the Memory Manager to detect faulty handling of an allocated resource (i.e. underflow, overflow, or a missing explicit deallocation) and, once a fault is identified, deny all subsequent requests from the client responsible for it (even if the fault was caused by an external library the client uses, since the client is responsible for every resource it requests anyway). If needed (it should not be), there could still be a policy, reserved for a few really trusted situations, that satisfies requests from a client already identified as faulty.

A kind of first implementation:

When the client asks the Memory Manager for a resource (memoryManager->alloc()), the manager decorates the memory region with a header/footer pair containing a magic number (if the magic number is all zeros, the header/footer simply acts as padding). When explicitly requested by the client (memoryManager->check()), or implicitly during deallocation (memoryManager->free()), which in turn is called automatically at the end of the client's life, the Memory Manager checks this header/footer (or padding) for underflows and overflows; if one is detected, any subsequent allocation by the same client returns NULL.
Something similar should apply during the client's shutdown procedure, to check whether every allocated resource has been explicitly released and, if not, to report it. In this case, however, it could be quite difficult to deny subsequent allocations, since they would occur in a new life cycle of that client.
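The scheme described above can be sketched in C roughly as follows. This is only a minimal illustration under stated assumptions, not the poster's implementation: the names `mm_client`, `mm_alloc`, `mm_check` and `mm_free`, the guard byte `0x5A` and the use of `malloc()` as the backing store are all hypothetical, and the guard is a short byte pattern rather than a single magic number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MM_GUARD_BYTE 0x5A            /* hypothetical non-zero guard byte */
#define MM_GUARD_LEN  sizeof(size_t)  /* guard length, keeps the header unpadded */

/* Hypothetical per-client record kept by the Memory Manager. */
typedef struct {
    bool faulty;                      /* set once a fault has been detected */
} mm_client;

/* Block layout: [size][guard] user bytes ... [guard] */
typedef struct {
    size_t        size;                   /* number of user bytes */
    unsigned char guard[sizeof(size_t)];  /* sits directly before the user bytes */
} mm_header;

static bool guard_ok(const unsigned char *g)
{
    for (size_t i = 0; i < MM_GUARD_LEN; i++)
        if (g[i] != MM_GUARD_BYTE)
            return false;
    return true;
}

void *mm_alloc(mm_client *client, size_t size)
{
    assert(client != NULL);
    if (client->faulty)
        return NULL;                  /* deny clients that already misbehaved */
    if (size > SIZE_MAX - sizeof(mm_header) - MM_GUARD_LEN)
        return NULL;                  /* total size would overflow */

    mm_header *h = malloc(sizeof *h + size + MM_GUARD_LEN);
    if (h == NULL)
        return NULL;
    h->size = size;
    memset(h->guard, MM_GUARD_BYTE, MM_GUARD_LEN);
    memset((unsigned char *)(h + 1) + size, MM_GUARD_BYTE, MM_GUARD_LEN);
    return h + 1;                     /* user bytes start right after the header */
}

/* Returns true if both guards are intact; otherwise marks the client faulty. */
bool mm_check(mm_client *client, const void *ptr)
{
    assert(client != NULL && ptr != NULL);
    const mm_header *h = (const mm_header *)ptr - 1;
    /* Header guard first: if it is damaged, h->size cannot be trusted. */
    bool ok = guard_ok(h->guard)
           && guard_ok((const unsigned char *)ptr + h->size);
    if (!ok)
        client->faulty = true;
    return ok;
}

void mm_free(mm_client *client, void *ptr)
{
    if (ptr == NULL)
        return;
    (void)mm_check(client, ptr);      /* implicit check on deallocation */
    free((mm_header *)ptr - 1);
}
```

A client that writes one byte past a 4-byte block would fail the next `mm_check()` or `mm_free()`, after which `mm_alloc()` returns NULL for that client. Leak detection at shutdown is not covered by this sketch.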

The expected final result is that the OS itself will help developers to discover and prevent bugs and leaks.

What do you guys think about that? Have you already implemented something similar or even smarter?

Solar · Post by **Solar** » Thu Apr 14, 2011 7:16 am

The first part of what you describe is called "memory protection". The denying of any further accesses by that client is called "segmentation violation", "segfault" or similar, which usually results in the termination of the client process. As such, all major operating systems already implement something like it.

The other thing you describe is called a "canary". (Named so after the canary birds that miners took with them into the mines to detect "bad weather", i.e. noxious gas. If the bird dies, it was time to get the hell out of there.) EFence is one framework using this technique.

RaffoPazzo · Post by **RaffoPazzo** » Thu Apr 14, 2011 12:01 pm

Thanks for your opinion, but I have probably not been clear enough. Let me try to clarify.

Solar wrote:The first part of what you describe is called "memory protection".

Memory protection is a technical solution that prevents a process from accessing data that does not belong to it, for example through paging or segmentation. Moreover, it is strongly hardware dependent. Thus, a process can still overwrite its own heap, and if some freaky architecture doesn't support such a method, some work-around (more probably a pork-around) has to be thought out.

Solar wrote:The denying of any further accesses by that client is called "segmentation violation", "segfault" or similar, which usually results in the termination of the client process.

A segmentation violation is just the actual fault raised on an access to reserved memory, so it is the "signal" emitted when the memory protection above recognizes such a fault. Killing the process is the OS's responsibility (more properly, its strategy), and it is a stronger measure than simply denying further resources.

Solar wrote:As such, all major operating systems already implement something like it.

All that OSes implement is the two mechanisms above; they don't protect a process from overwriting its own heap, at least not by a "few" bytes. They can only point out that the process tried to access memory "too far from home". Your OS will probably run the following code successfully, printing 2048 'A's (my Ubuntu does). Or maybe not, in which case it is just a matter of reducing 2048 to a few bytes over 4, let's say 2, for a total of 6 'A's printed and some other data corrupted. As far as I know, there is no "mallopt()" option that can help either.

Code:

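#include <stdio.h>
#include <stdlib.h>

int main(void) {
        /* Only 4 bytes are requested from the heap... */
        char* buf = (char*)malloc(4*sizeof(char));
        int i;
        if (buf == NULL)
                return 1;
        /* ...but 2048 bytes plus a terminator are written: a deliberate overflow. */
        for(i=0; i<2048; i++)
                buf[i]='A';
        buf[i]='\0';
        printf("%s\n", buf);
        return 0;
}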

Solar wrote:The other thing you describe is called a "canary". (Named so after the canary birds that miners took with them into the mines to detect "bad weather", i.e. noxious gas. If the bird dies, it was time to get the hell out of there.) EFence is one framework using this technique.

I have never used efence, which is surely a great tool, but it isn't provided by the OS while the process is running. You have to instrument the code and remove the instrumentation when deploying, which sometimes means that you have deployed exploits.

The idea is to have a platform-independent, OS-provided Memory Validation for the process's heap, in addition to the "standard" memory protection/segmentation fault Memory Access Control.

Solar · Post by **Solar** » Fri Apr 15, 2011 12:06 am

Well, agreed, MP isn't quite what you suggested, but you brushed over the meat of my post:

RaffoPazzo wrote:
Solar wrote:The other thing you describe is called a "canary". (Named so after the canary birds that miners took with them into the mines to detect "bad weather", i.e. noxious gas. If the bird dies, it was time to get the hell out of there.) EFence is one framework using this technique.
I have never used efence, which is surely a great tool, but it isn't provided by the OS while the process is running. You have to instrument the code and remove the instrumentation when deploying, which sometimes means that you have deployed exploits.

EFence is merely one implementation using the canary technique. (Amiga Enforcer is another.) I wasn't pointing it out for you to copy & paste the implementation, but to study it, its advantages and disadvantages (like, increased memory usage, performance penalty).

RaffoPazzo · Post by **RaffoPazzo** » Fri Apr 15, 2011 1:09 pm

Thanks for the advice. I hope to have time to take a look when I start writing the memory manager. Today is definitely too early; I'm still trying to get the IDT working.

Ready4Dis · Post by **Ready4Dis** » Thu Jun 16, 2011 2:29 pm

I have done something very similar in my OS. The only difference is that it merely checks for underflow/overflow and non-released blocks; it does not prevent the application from making further calls. It can also be turned on and off (i.e. debug/release), so I can trade performance for checking when I'm debugging. I primarily wrote it for use in my own programs (not the OS), but have since integrated it into my OS libraries.

The biggest issue I see is that most programs don't use the OS for allocating memory, so how could the OS be responsible for a buffer underflow or overflow? My memory tracker/manager/allocator is part of my alloc implementation, which lives in user space. While the program is still running, there is no way to tell that memory wasn't deallocated properly, because the allocator doesn't know whether the memory is still needed. Once the program exits, there is no way to restrict it from allocating memory, because after exit it won't be allocating anyway. It *could* restrict the program if it detects a buffer underflow/overflow during a free/delete call, which wouldn't take much to add, honestly: just a flag and a simple check during allocation in my debug version.

It's not much extra to add if you already have a memory allocator: simply pad the buffer front and back by a certain number of bytes (mine has an adjustable size, just in case something is really far off). Fill the padding with a specific byte sequence (don't use all zeros, because if something just clears a buffer you'd never catch it). Then, on deallocation, check the padding to see if it's what you expect. It's really not much extra work, and like I said, I have mine set up so I can compile with or without memory debugging, which lets me debug when I want to but get rid of the memory waste and slowness of debugging when I want to test a release build. Implement it, and if you have any questions, feel free to ask.
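A padding scheme along these lines might look like the sketch below. It is not Ready4Dis's code: `dbg_malloc`, `dbg_free`, `MEM_DEBUG`, `PAD_SIZE`, `PAD_BYTE` and `HDR_SIZE` are hypothetical names, and the block size is stored in a small prefix so the rear padding can be located at deallocation time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifndef MEM_DEBUG
#define MEM_DEBUG 1     /* build with -DMEM_DEBUG=0 for a "release" build */
#endif

#if MEM_DEBUG

#define HDR_SIZE 16     /* room for the stored size; keeps 16-byte alignment */
#define PAD_SIZE 16     /* adjustable; a multiple of 16 preserves alignment */
#define PAD_BYTE 0xA5   /* non-zero, so a buffer-clearing overrun is caught */

/* Layout: [size in HDR_SIZE bytes][front pad][user bytes][rear pad] */
void *dbg_malloc(size_t size)
{
    if (size > (size_t)-1 - HDR_SIZE - 2 * PAD_SIZE)
        return NULL;                                  /* size would overflow */
    unsigned char *raw = malloc(HDR_SIZE + PAD_SIZE + size + PAD_SIZE);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof size);                  /* remember user size */
    memset(raw + HDR_SIZE, PAD_BYTE, PAD_SIZE);       /* front padding */
    memset(raw + HDR_SIZE + PAD_SIZE + size, PAD_BYTE, PAD_SIZE); /* rear */
    return raw + HDR_SIZE + PAD_SIZE;
}

static int pad_intact(const unsigned char *pad)
{
    assert(pad != NULL);
    for (size_t i = 0; i < PAD_SIZE; i++)
        if (pad[i] != PAD_BYTE)
            return 0;
    return 1;
}

/* Returns 0 if both pads were intact, -1 if an under-/overflow was found. */
int dbg_free(void *ptr)
{
    if (ptr == NULL)
        return 0;
    unsigned char *user = ptr;
    unsigned char *raw  = user - PAD_SIZE - HDR_SIZE;
    size_t size;
    memcpy(&size, raw, sizeof size);
    /* Front pad is checked first; the rear pad is only read if it passes. */
    int ok = pad_intact(user - PAD_SIZE) && pad_intact(user + size);
    free(raw);
    return ok ? 0 : -1;
}

#else  /* release build: no padding, no checks */

void *dbg_malloc(size_t size) { return malloc(size); }
int   dbg_free(void *ptr)     { free(ptr); return 0; }

#endif
```

With all-zero padding, a `memset(buf, 0, n)` that runs one byte too far would go unnoticed; with `0xA5` it shows up at the next `dbg_free()`.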

Brendan · Post by **Brendan** » Thu Jun 16, 2011 11:42 pm

Hi,

RaffoPazzo wrote:Have you already implemented something similar or even smarter?

In C, I tend to hack together wrappers around malloc(), realloc() and free() (I never use "calloc()"). The wrappers insert canaries and check that the canaries are still present; they set a "heap got trashed" flag when a canary is damaged and, while that flag is set, return NULL for any allocation (and ignore any "free()"). They also track the number of allocated bytes, the number of allocated blocks, the maximum number of allocated bytes and the maximum number of allocated blocks. When the program exits I check for "heap got trashed" and "current blocks allocated != 0", print error messages, and (optionally) display statistics (e.g. "Allocated a maximum of 1234567 bytes, used a maximum of 1234 blocks").
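Wrappers of this kind could be sketched as follows. This is an illustration only, not Brendan's code: the names `chk_malloc`, `chk_free`, `blk_hdr` and `CANARY` are hypothetical, and a `realloc()` wrapper is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical canary value: the byte 0xA5 repeated across a size_t. */
#define CANARY ((size_t)-1 / 0xFF * 0xA5)

/* Block layout: [size][canary] user bytes ... [canary] */
typedef struct {
    size_t size;
    size_t canary;
} blk_hdr;

static bool   heap_trashed;                 /* the "heap got trashed" flag */
static size_t cur_bytes, max_bytes;         /* allocated bytes: now / peak */
static size_t cur_blocks, max_blocks;       /* allocated blocks: now / peak */

void *chk_malloc(size_t size)
{
    if (heap_trashed)
        return NULL;                        /* refuse once corruption is seen */
    if (size > (size_t)-1 - sizeof(blk_hdr) - sizeof(size_t))
        return NULL;
    blk_hdr *h = malloc(sizeof *h + size + sizeof(size_t));
    if (h == NULL)
        return NULL;
    h->size   = size;
    h->canary = CANARY;
    size_t tail = CANARY;                   /* rear canary may be unaligned */
    memcpy((unsigned char *)(h + 1) + size, &tail, sizeof tail);

    cur_bytes += size;
    cur_blocks++;
    if (cur_bytes > max_bytes)   max_bytes = cur_bytes;
    if (cur_blocks > max_blocks) max_blocks = cur_blocks;
    return h + 1;
}

void chk_free(void *ptr)
{
    if (ptr == NULL || heap_trashed)
        return;                             /* frees are ignored once trashed */
    blk_hdr *h = (blk_hdr *)ptr - 1;
    size_t tail = 0;
    if (h->canary == CANARY)                /* only trust size if header is ok */
        memcpy(&tail, (unsigned char *)ptr + h->size, sizeof tail);
    if (h->canary != CANARY || tail != CANARY) {
        heap_trashed = true;
        return;
    }
    assert(cur_blocks > 0 && cur_bytes >= h->size);
    cur_bytes -= h->size;
    cur_blocks--;
    free(h);
}
```

The counters give the "maximum bytes / maximum blocks" statistics for free, and a non-zero `cur_blocks` at exit indicates a leak.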

On top of that I use conditional code - "#define FAST_EXIT" to disable the "current blocks allocated != 0" check at exit and to skip half the deallocations when the program exits (much faster to let the kernel's virtual memory manager free any/all left over pages rather than messing about freeing each little thing yourself when you don't have to). I probably should also use something like "#define NO_HEAP_CHECKING" to disable the whole canary hack (for better performance), but to be honest I'm too lazy and haven't actually done that (yet).
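The exit-time check with the "FAST_EXIT" switch might be wired up as in the sketch below. The function names (`heap_exit_report`, `heap_install_exit_check`) and the counters are hypothetical stand-ins for whatever the allocation wrappers maintain; only the "FAST_EXIT" macro name and the statistics message come from the post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical state that the allocation wrappers would keep up to date. */
static bool   heap_trashed;
static size_t cur_blocks, max_blocks, max_bytes;

/* Prints diagnostics to 'out' and returns the number of problems found. */
int heap_exit_report(FILE *out)
{
    assert(out != NULL);
    int problems = 0;

    if (heap_trashed) {
        fprintf(out, "error: heap got trashed\n");
        problems++;
    }
#ifndef FAST_EXIT
    /* Leak check; skipped when the kernel is left to reclaim the pages. */
    if (cur_blocks != 0) {
        fprintf(out, "error: %zu blocks still allocated at exit\n", cur_blocks);
        problems++;
    }
#endif
    fprintf(out, "Allocated a maximum of %zu bytes, used a maximum of %zu blocks\n",
            max_bytes, max_blocks);
    return problems;
}

static void heap_at_exit(void)
{
    (void)heap_exit_report(stderr);
}

/* Call once early in main(); returns 0 on success, like atexit(). */
int heap_install_exit_check(void)
{
    return atexit(heap_at_exit);
}
```

Compiling with `-DFAST_EXIT` drops the leak check, which matches the idea of skipping deallocations at exit and letting the virtual memory manager free the leftover pages.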

RaffoPazzo wrote:What do you guys think about that?

Typically there's 3 layers of "memory management":

- whatever the process (and/or libraries the process uses) feel like using for convenience in user space (whether it be malloc/free or garbage collection or whatever). The kernel shouldn't know anything about this and doesn't care what the process does.
- the kernel's virtual memory management (which is used by "whatever the process uses for convenience" to allocate and free virtual pages). This might be "sbrk()" or "mmap()" or something else. The kernel knows/cares about this. The "whatever the process uses for convenience" part of the process thinks it knows about this (but it's called "virtual" for a reason - e.g. just because the process' heap manager thinks there's RAM somewhere doesn't mean there actually is).
- the kernel's physical memory management (which is used by the kernel's virtual memory management to allocate and free actual pages of RAM). The process doesn't know or care about this.

Basically, in my opinion, the kernel shouldn't know or care what the process uses for its heap; and the process should be free to use whatever it likes (without being forced to use something inappropriate because of the kernel), including many different "fast as possible with no checks" implementations of malloc/free that are tuned for specific applications and/or usage patterns, a "robust as possible" malloc/free (for developers), a "more robust than possible" system like Valgrind, about 10 different variations on the "garbage collector" theme, explicit virtual memory management (for things like databases and virtual machines, where a "heap" abstraction just gets in the way), and whatever else a process might feel like using (objstacks? transactional memory emulation? Too hard to predict really).

Cheers,

Brendan

OSDev.org
