Understanding memory allocation for a process manually ?

manoj9372 · Post by **manoj9372** » Mon Feb 15, 2016 9:26 pm

As the title says, how can I understand how memory is allocated to a process?

So far I have read "Operating System Concepts" by Abraham Silberschatz. I understood what a process is, what its states are, its representation (i.e. the process control block), the scheduling algorithms, etc.

But I am trying to understand it in a practical way. For example, if I click notepad.exe in Windows, the system will start a new process, but how exactly is the memory (i.e. base and limit registers etc.) allocated? On what basis is memory allocated to a process? Does it remain static, or is it dynamic throughout the process's execution?

I want to understand this process manually, for my own knowledge. Can someone explain with a small example?

Brendan · Post by **Brendan** » Tue Feb 16, 2016 6:51 am

Hi,

manoj9372 wrote:I want to understand this process manually, for my own knowledge. Can someone explain with a small example?

Think of it as 3 layers.

The lowest layer is managing physical memory. This can be done in multiple ways (possibly including using multiple ways at the same time, for different areas of the physical address space). This is typically built into the kernel.

The middle layer is managing virtual memory. This includes creating/destroying virtual address spaces; keeping track of what each area in a virtual address space is being used for, and mapping/unmapping things into virtual address spaces (physical pages, swap space, memory mapped files, shared memory areas, etc). This is typically built into the kernel.

The highest layer is things like heap (malloc/free, new/delete, garbage collectors, object pools, etc). This is typically built into a language's libraries (e.g. C standard library) or a language's run-time environment (e.g. Java virtual machine) or done by the process itself.

For example, a program might call "malloc()" to allocate 12345 bytes, which might be implemented in a shared library. The library's "malloc()" might search a linked list for a large enough block of free memory to allocate but find out there isn't one, so it asks the kernel's virtual memory manager to increase the size of its pool, which might be like "Dear virtual memory manager, for all virtual pages from 0x12300000 up to (but not including) 0x12310000 in my virtual address space, change their type to the allocate on demand type". The virtual memory manager might respond by changing the requested virtual pages from whatever type they were before (probably the "unused" type) to the new type (the "allocate on demand" type); which might not involve allocating any physical RAM at all (but might involve allocating a physical page to use for a page table). After that; the library's "malloc()" might split the new (65536 byte) area into a 12345 byte area and a 53191 byte area, add the 53191 byte area to its linked list of free blocks for later, and return the virtual address of the 12345 byte area to the program.
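The split described above can be sketched in C. This is only an illustration under stated assumptions: `toy_malloc`, `grow_pool`, `GROW_BYTES` and the fixed-size `free_list` array are hypothetical names, the pool is tracked as offsets rather than real addresses, and `grow_pool` merely stands in for the request to the virtual memory manager.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 16
#define GROW_BYTES 65536u          /* assumed growth step, matching the example above */
#define TOY_FAIL   ((size_t)-1)

/* A free region of the pool, described by offset and length. */
struct free_block {
    size_t offset;
    size_t size;
};

static struct free_block free_list[MAX_BLOCKS];
static size_t free_count;          /* number of entries in free_list */
static size_t pool_size;           /* bytes of address space reserved so far */

/* Stand-in for "ask the virtual memory manager for more pages". */
static int grow_pool(void)
{
    if (free_count == MAX_BLOCKS)
        return 0;
    free_list[free_count].offset = pool_size;
    free_list[free_count].size = GROW_BYTES;
    free_count++;
    pool_size += GROW_BYTES;
    return 1;
}

/* First-fit allocation; returns an offset into the pool, or TOY_FAIL. */
size_t toy_malloc(size_t size)
{
    assert(size > 0);
    for (int attempt = 0; attempt < 2; attempt++) {
        for (size_t i = 0; i < free_count; i++) {
            if (free_list[i].size >= size) {
                size_t offset = free_list[i].offset;
                /* Split: the caller gets the front, the rest stays free. */
                free_list[i].offset += size;
                free_list[i].size -= size;
                return offset;
            }
        }
        /* Nothing fits: grow the pool once, then search again. */
        if (attempt == 1 || !grow_pool())
            break;
    }
    return TOY_FAIL;
}

/* Bytes left in the first free block (for inspection only). */
size_t toy_first_free_size(void)
{
    return free_count ? free_list[0].size : 0;
}
```

Calling `toy_malloc(12345)` on an empty pool grows it by 65536 bytes, hands out the first 12345 and leaves a 53191 byte free block. A real allocator would also merge neighbouring free blocks and keep its bookkeeping inside the pool itself.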

When the program writes to a virtual page that has the "allocate on demand" type, it causes a page fault, and the kernel's page fault handler (virtual memory manager) allocates a physical page (from the physical memory manager), maps the physical page into the virtual address space and changes the virtual page's type to "usable RAM", then returns to the program that caused the page fault. The program doesn't know or care that this happened - as far as it knows, the page was always usable RAM. When "malloc()" added the remaining 53191 byte area to its memory pool it might have modified something in one of the virtual pages and caused a physical page to be allocated, but the majority of that 53191 byte area wouldn't be wasting any actual RAM (until it's used later). When the program starts storing data in the 12345 byte area it got from "malloc()" it might cause more physical pages to be allocated.
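A toy simulation of that page fault path, as a sketch only. Nothing here is a real kernel interface: `vmm_reserve`, `touch`, `page_fault`, the 4096 byte page size and the 64 page address space are all assumptions chosen for illustration, and "allocating a physical page" is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u            /* assumed page size */
#define NUM_PAGES 64u              /* size of the toy address space, in pages */

enum page_type { PAGE_UNUSED, PAGE_ALLOC_ON_DEMAND, PAGE_USABLE_RAM };

static enum page_type page_table[NUM_PAGES];
static size_t physical_pages_used;

/* Mark [start, end) as "allocate on demand". No physical page is consumed. */
int vmm_reserve(size_t start, size_t end)
{
    if (start % PAGE_SIZE || end % PAGE_SIZE || start > end ||
        end > (size_t)NUM_PAGES * PAGE_SIZE)
        return -1;
    for (size_t p = start / PAGE_SIZE; p < end / PAGE_SIZE; p++)
        page_table[p] = PAGE_ALLOC_ON_DEMAND;
    return 0;
}

/* What the page fault handler does for a faulting page. */
static int page_fault(size_t page)
{
    if (page_table[page] != PAGE_ALLOC_ON_DEMAND)
        return -1;                 /* genuine fault: the page was never reserved */
    physical_pages_used++;         /* stand-in for the physical memory manager */
    page_table[page] = PAGE_USABLE_RAM;
    return 0;
}

/* Simulated write to a virtual address; returns 0 if the write succeeds. */
int touch(size_t address)
{
    size_t page = address / PAGE_SIZE;
    if (page >= NUM_PAGES)
        return -1;
    if (page_table[page] == PAGE_USABLE_RAM)
        return 0;                  /* already backed by RAM, no fault */
    return page_fault(page);
}

size_t physical_pages_in_use(void)
{
    return physical_pages_used;
}
```

Reserving 65536 bytes leaves `physical_pages_in_use()` at zero; it only rises as individual pages are touched for the first time.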

Of course maybe a program asked "malloc()" to allocate 987 GiB and only used 6 bytes. It doesn't matter (it's not like it'd actually use 987 GiB of RAM).

Cheers,

Brendan

Antti · Post by **Antti** » Tue Feb 16, 2016 7:58 am

What if there were 100 malicious programs that allocated huge memory areas and suddenly started actually writing to them? Could you guarantee that a little innocent program that happens to write to its own allocated area survives in the middle of the chaos those malicious programs are causing to the system (i.e. running out of physical memory, disk swap space and so on)?

Brendan · Post by **Brendan** » Tue Feb 16, 2016 8:26 am

Hi,

Antti wrote:What if there were 100 malicious programs that allocated huge memory areas and suddenly started actually writing to them? Could you guarantee that a little innocent program that happens to write to its own allocated area survives in the middle of the chaos those malicious programs are causing to the system (i.e. running out of physical memory, disk swap space and so on)?

You either:

1) decide not to support "overcommit"; and refuse to do anything that causes "amount of pretend RAM committed to" to exceed "amount of real RAM plus swap space".
2) decide to support "overcommit", hope the problem doesn't happen, and provide a way to deal with it if it does happen (e.g. "OOM killer").
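The difference between the two policies comes down to one check at reservation time. A minimal sketch, assuming a hypothetical `commit_state` structure that counts everything in pages (none of these names come from a real kernel):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for committed memory, all counted in pages. */
struct commit_state {
    size_t ram_pages;          /* real RAM */
    size_t swap_pages;         /* swap space */
    size_t committed_pages;    /* "pretend RAM" promised to processes so far */
    bool   allow_overcommit;
};

/* Returns true if the reservation is accepted. */
bool commit_reserve(struct commit_state *s, size_t pages)
{
    size_t limit = s->ram_pages + s->swap_pages;

    if (!s->allow_overcommit) {
        /* Strict policy: never promise more than RAM plus swap. */
        if (s->committed_pages > limit || pages > limit - s->committed_pages)
            return false;
    }
    /* Overcommit policy: accept and hope; an OOM killer handles the fallout. */
    s->committed_pages += pages;
    return true;
}

/* True once more has been promised than could ever be backed. */
bool commit_is_overcommitted(const struct commit_state *s)
{
    return s->committed_pages > s->ram_pages + s->swap_pages;
}
```

With the strict policy the failure shows up early, as a refused allocation that the program can handle. With overcommit it shows up late, when a page is first written and nothing is left to back it.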

Cheers,

Brendan

manoj9372 · Post by **manoj9372** » Wed Feb 17, 2016 4:44 am

Thanks for your detailed explanation, but one thing wasn't clear to me.
As you said, "a program might call "malloc()" to allocate 12345 bytes". I want to know on what basis the process knows that it needs "x" amount of memory?
That is where I need more light. Can you explain a bit there?

Nutterts · Post by **Nutterts** » Wed Feb 17, 2016 6:42 am

manoj9372 wrote:Thanks for your detailed explanation, but one thing wasn't clear to me.
As you said, "a program might call "malloc()" to allocate 12345 bytes". I want to know on what basis the process knows that it needs "x" amount of memory?
That is where I need more light. Can you explain a bit there?

Basically the person who programmed the code running in that process decided it needs "x" amount of memory.

Brendan explained (in great detail) how memory allocation itself works, like how a car is able to move. But asking on what basis the process knows that it needs "x" amount of memory is like me asking you on what basis you know where to drive your car.

Brendan · Post by **Brendan** » Wed Feb 17, 2016 11:01 am

Hi,

manoj9372 wrote:Thanks for your detailed explanation, but one thing wasn't clear to me.
As you said, "a program might call "malloc()" to allocate 12345 bytes". I want to know on what basis the process knows that it needs "x" amount of memory?
That is where I need more light. Can you explain a bit there?

When writing software, a programmer often realises they need memory to store something, and adds pieces of code like "temp_triangle = malloc( sizeof(triangle_structure) );" or "person_object = new person(name, age);" where the compiler calculates how large it is and generates code that does something like "malloc(1234);". In some cases the programmer asks for a specific number of bytes, like "char *myBuffer = malloc(1234);".

In either case, the number of bytes to allocate is determined (directly or indirectly) by what the programmer told the compiler they want.
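Both cases can be written out in C. This is a small sketch: the `point` and `triangle_structure` layouts are made up for illustration, and `new_triangle` and `new_buffer` are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout: three corners of three floats each. */
struct point { float x, y, z; };
struct triangle_structure { struct point corner[3]; };

/* Indirect: the compiler works out the byte count from the type. */
struct triangle_structure *new_triangle(void)
{
    /* sizeof is evaluated at compile time, so this becomes malloc(<constant>). */
    return malloc(sizeof(struct triangle_structure));
}

/* Direct: the programmer picks the byte count. */
char *new_buffer(void)
{
    return malloc(1234);
}
```

On a typical platform with 4 byte floats, `sizeof(struct triangle_structure)` is 36, but the exact value depends on the compiler and target, which is why the code asks for `sizeof` rather than hard-coding a number.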

Note that it's entirely possible for a programmer to do it themselves without using whatever the language provides. For example, for the code I'm working on now, it only needs to be able to free the most recently allocated memory (and most of the allocated memory is never freed); so I'm using something like this:

Code: Select all

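// Sketch only: first_unused_byte and pool_top are byte addresses (e.g. uintptr_t) that
// both start at the bottom of the pool; alloc_virtual_pages() is the kernel's call.
void free(void *address) {
    first_unused_byte = (uintptr_t)address;   // Everything from here up becomes unused again
}

void *allocate(size_t size) {
    size = (size + 7) & ~(size_t)7;           // Round size up to keep everything 8-byte aligned
    if (first_unused_byte + size > pool_top) {
        // Grow the pool by whole pages, rounding up so the request is always covered
        size_t pages_wanted = (first_unused_byte + size - pool_top + extra_for_next_time + page_size - 1) / page_size;
        if (alloc_virtual_pages(pool_top, pages_wanted) != OK) return NULL;
        pool_top += pages_wanted * page_size;
    }
    void *address = (void *)first_unused_byte;
    first_unused_byte += size;
    return address;
}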
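For experimenting outside a kernel, the same idea can be tried against a fixed static pool. This is a sketch, not the code above: the call that grows the pool is swapped for a hard limit of `POOL_BYTES`, and `bump_allocate`, `bump_free` and `bump_used` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BYTES 4096u           /* assumed fixed pool size; never grows */

/* The union forces 8-byte alignment of the byte array. */
static union {
    unsigned char bytes[POOL_BYTES];
    uint64_t      align;
} pool;

static size_t first_unused;        /* offset of the first unused byte */

void *bump_allocate(size_t size)
{
    if (size > POOL_BYTES)
        return NULL;               /* also keeps the rounding below from overflowing */
    size = (size + 7) & ~(size_t)7;            /* round up to a multiple of 8 */
    if (size > POOL_BYTES - first_unused)
        return NULL;               /* pool exhausted */
    void *address = &pool.bytes[first_unused];
    first_unused += size;
    return address;
}

/* Only meaningful for the most recent allocation: everything at or above
   `address` becomes unused again. */
void bump_free(void *address)
{
    unsigned char *p = address;
    assert(p >= pool.bytes && p <= pool.bytes + first_unused);
    first_unused = (size_t)(p - pool.bytes);
}

size_t bump_used(void)
{
    return first_unused;
}
```

Allocation and freeing are each a couple of instructions, which is the attraction; the cost is that freeing anything other than the newest block also discards everything allocated after it.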

Cheers,

Brendan

linguofreak · Post by **linguofreak** » Thu Feb 18, 2016 4:28 am

Antti wrote:What if there were 100 malicious programs that allocated huge memory areas and suddenly started actually writing to them? Could you guarantee that a little innocent program that happens to write to its own allocated area survives in the middle of the chaos those malicious programs are causing to the system (i.e. running out of physical memory, disk swap space and so on)?

The only way to absolutely guarantee it is to have an OOM killer that queries the user for which task should be killed, because only the user knows what is "innocent" and what is "malicious".

LtG · Post by **LtG** » Thu Feb 18, 2016 5:11 am

linguofreak wrote:The only way to absolutely guarantee it is to have an OOM killer that queries the user for which task should be killed, because only the user knows what is "innocent" and what is "malicious".

I'm not sure how feasible that is. Are "normal" end users supposed to know what is malicious and what is innocent? Do they in practice?

Also, asking really only works in an interactive desktop type of scenario. How would you do it in a console/shell/prompt situation? Shells are generally expected to do what you tell them to do, not to come up with their own questions all of a sudden while you're using 'vi' or something.

What about servers? The system freezes, waiting for the user to answer what to kill, but server admins don't typically look at each server every minute of the day, so it might take quite a while before someone thinks to check the server. Of course SSH into the server might no longer work because SSH too needs memory, so the moment it tries to spawn it gets frozen due to the OOM condition. So you basically get yourself into a chicken-and-egg situation.

In extreme conditions the user might not even be able to open the task manager to check what to kill, because of the same chicken-and-egg problem.

The most obvious "solution" is to not allow over-commitment at all, but that wastes resources. I tried thinking about what solutions are used in the real world, like airlines which over-commit (overbooking), but for them the solution is a lot easier because they can just postpone flights for some passengers, since each flyer is "independent" of the others and thus can be individually postponed. The same doesn't work for memory, for two reasons:
1) Figuring out which process would fit in the memory (on the flight) is more difficult because all of the memory for a process is needed at the same time; you can't divide it. So smaller programs, possibly with memory restrictions (like the ELF header stating max memory) would help here.
2) Worst case, there's a 1000t person; that person will never fit on a flight. Similarly, if you have 1GB RAM and 1GB swap, any process that needs 3GB of memory will never fit.

So at least with airlines I can't think of how real world solutions would help..

I suppose if each process had a min/max (like in the ELF header) as well as a progress indication and how the min/max changes with progress, then it might be possible to change the problem into a scheduling problem.. But given how difficult scheduling is already, I'm not sure if the extra complexity really serves anyone.

Perhaps allow over-commitment but keep track of it, and when used memory gets too close to the maximum available, start scheduling less and let individual processes complete and free their memory until usage drops below some threshold; then go back to the normal "over-commit allowed" state.

Brendan · Post by **Brendan** » Thu Feb 18, 2016 3:45 pm

Hi,

LtG wrote:Most obvious "solution" is to not allow over commitment at all but that wastes resources.

Yes, but...

From this post:
"When the kernel is running out of free physical RAM, it sends a "free some RAM" message (including how critical the situation is) out to processes that have requested it. VFS responds by freeing "less important" cached file data. A file system might do the same for its meta-data cache. Web browser might respond by dumping cached web page resources. DNS system might do the same for its domain name cache. A word processor might have enough data to allow the user to undo the last 5000 things they've done, and might respond by reducing that so the user can only undo the last 3000 things they've done."

With a system like this; if you don't allow over-commitment you don't waste resources as much - those resources are still being used for caches and other things.
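As a rough sketch of what the receiving side of such a "free some RAM" message could look like, here is the word processor case in C. Everything is hypothetical: the `pressure` levels, the `undo_history` structure and the handler name are invented, and only the 5000 to 3000 reduction comes from the quoted text (the other thresholds are assumptions).

```c
#include <assert.h>
#include <stddef.h>

/* How critical the kernel says the situation is (levels are assumed). */
enum pressure { PRESSURE_LOW, PRESSURE_HIGH, PRESSURE_CRITICAL };

/* Stand-in for a word processor's undo history. */
struct undo_history {
    size_t entries;
};

/* Handler for the "free some RAM" message; returns how many undo
   entries were released. */
size_t on_free_some_ram(struct undo_history *h, enum pressure level)
{
    size_t keep;

    switch (level) {
    case PRESSURE_LOW:  keep = 3000; break;   /* figure from the quoted text */
    case PRESSURE_HIGH: keep = 500;  break;   /* assumed */
    default:            keep = 0;    break;   /* critical: drop everything */
    }
    if (h->entries <= keep)
        return 0;                             /* nothing worth giving back */

    size_t released = h->entries - keep;
    h->entries = keep;                        /* a real handler would free the data here */
    return released;
}
```

The process decides what is cheap to give up; the kernel only says how badly it needs memory back.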

To me; this is the right way to do things - don't allow over-commit, except for resources that can be revoked.

Of course most existing OSs (and things like POSIX) don't have any way for the kernel to ask for resources back, so they're stuck with the "waste resource or over-commit" compromise.

Cheers,

Brendan

linguofreak · Post by **linguofreak** » Fri Feb 19, 2016 2:35 am

LtG wrote:
linguofreak wrote:The only way to absolutely guarantee it is to have an OOM killer that queries the user for which task should be killed, because only the user knows what is "innocent" and what is "malicious".
I'm not sure how feasible that is. Are "normal" end users supposed to know what is malicious and what is innocent? Do they in practice?

OK, so even then it's not an absolute guarantee, but assuming an experienced user who knows what programs they want running, it is. In any case, the claim is that it is the only way to guarantee that innocent processes survive and malicious ones die, not that it is feasible (at least in all cases).

LtG wrote:Also, asking really only works in an interactive desktop type of scenario. How would you do it in a console/shell/prompt situation? Shells are generally expected to do what you tell them to do, not to come up with their own questions all of a sudden while you're using 'vi' or something.

Actually, doing it on a console is the use case I'd be more confident of being able to do it in. Doing it in a GUI would be much more likely to end up allocating memory in non-obvious ways, while our theoretical user-querying OOM killer has to be able to do its job without allocating memory. Ideally, its memory usage would be O(1) in the number of processes. If it used O(f(n)) memory (for an arbitrary function f(n) and number of processes n), it would have to receive memory for its recordkeeping on a process as the first step the system took in starting that process.

As for system messages coming up while you're working in a terminal, Linux will print all sorts of errors to the system console regardless of what you may be doing on that virtual terminal. My box is currently spewing errors about a failed CD drive that I haven't had the time to open up the machine to disconnect. An OOM situation is next thing to a kernel-panic / bluescreen, both of which will happily intrude while you're doing other things, so I don't see any big problem with the OOM killer doing the same thing.

LtG wrote:What about servers? The system freezes, waiting for the user to answer what to kill, but server admins don't typically look at each server every minute of the day, so it might take quite a while before someone thinks to check the server.

The OS doesn't need to totally grind to a halt. It will have to stall on outstanding allocations until the user makes a decision or a process ends on its own or otherwise returns memory to the OS, but processes that aren't actively allocating (or are allocating from free chunks in their own heaps rather than going to the OS for memory) can continue running. Now, if a mission-critical process ends up blocking on an allocation, yes, you have a problem, and a user OOM killer might not be appropriate for situations where this is likely to cause more trouble than an OOM situation generally does in the first place.

LtG wrote:Of course SSH into the server might no longer work because SSH too needs memory, so the moment it tries to spawn it gets frozen due to the OOM condition. So you basically get yourself into a chicken-and-egg situation.

For a server, your OOM killer could actually make use of a reverse-SSH protocol where the OOM killer makes an *outbound* ssh-like connection, using pre-allocated memory, to a machine running server management software, which could then send alerts to admins' cell phones, take an inbound SSH connection from an administrator's workstation (or phone), and pass input from the administrator to the OOMed server and output from the server back to the administrator.

LtG wrote:In extreme conditions the user might not even be able to open the task manager to check what to kill, because of the same chicken-and-egg problem.

This can be solved by the OOM killer presenting its own task list kept in pre-allocated memory.

LtG wrote:I suppose if each process had a min/max (like in the ELF header) as well as a progress indication and how the min/max changes with progress, then it might be possible to change the problem into a scheduling problem.. But given how difficult scheduling is already, I'm not sure if the extra complexity really serves anyone.

Not every process has bounded memory requirements. Not every process has bounded runtime. Daemons are generally supposed to keep running indefinitely.

One useful feature would be a facility for processes to provide the kernel with a list of free pages in their heaps so that the kernel could reclaim those pages if needed.

LtG wrote:Perhaps allow over-commitment but keep track of it, and when used memory gets too close to the maximum available, start scheduling less and let individual processes complete and free their memory until usage drops below some threshold; then go back to the normal "over-commit allowed" state.

manoj9372 · Post by **manoj9372** » Fri Feb 19, 2016 3:26 am

Nutterts wrote:Basically the person who programmed the code running in that process decided it needs "x" amount of memory.

Brendan explained (in great detail) how memory allocation itself works, like how a car is able to move. But asking on what basis the process knows that it needs "x" amount of memory is like me asking you on what basis you know where to drive your car.

I didn't ask that question in a bad sense. I just wanted to understand how a programmer calculates the MEMORY REQUIREMENTS for a process.
That's what my question is; kindly don't take it in a bad manner...

FallenAvatar · Post by **FallenAvatar** » Fri Feb 19, 2016 3:51 am

manoj9372 wrote:I didn't ask that question in a bad sense. I just wanted to understand how a programmer calculates the MEMORY REQUIREMENTS for a process.
That's what my question is; kindly don't take it in a bad manner...

That is a basic programming question. One better asked in a basic C or C++ Programming forum.

- Monk

manoj9372 · Post by **manoj9372** » Fri Feb 19, 2016 3:59 am

FallenAvatar wrote:That is a basic programming question. One better asked in a basic C or C++ Programming forum.
- Monk

Let's say you have been given the task of creating a small application, like a "strong password generator" in C, that can generate a password with a maximum length of 32 characters (letters, digits and symbols).

You need to pre-calculate the memory requirements for this process and state in the code that the process needs "x" amount of memory. How would you calculate the memory requirements and specify in the code that you need "x" amount of memory?

That is my question.

And by the way, I am a beginner.
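For the password generator described above, a minimal sketch in C might look like the following. In it the only memory the programmer sizes by hand is the output buffer: 32 characters plus one byte for the terminating '\0', so 33 bytes. Everything else (code, stack, library data) is sized by the compiler, linker and OS. The names `generate_password`, `MAX_PASSWORD_LEN` and `ALPHABET` are made up, and `rand()` is used only as a placeholder; it is not suitable for generating real passwords.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PASSWORD_LEN 32

/* Characters a password may contain (an arbitrary choice for this sketch). */
static const char ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*";

/* Fills `out` with `length` characters plus a terminating '\0'.
   `out` must have room for MAX_PASSWORD_LEN + 1 bytes.
   Returns 0 on success, -1 if `length` is too large. */
int generate_password(char *out, size_t length)
{
    if (length > MAX_PASSWORD_LEN)
        return -1;

    size_t choices = sizeof ALPHABET - 1;      /* exclude the '\0' */
    for (size_t i = 0; i < length; i++)
        out[i] = ALPHABET[(size_t)rand() % choices];   /* placeholder randomness */
    out[length] = '\0';
    return 0;
}
```

A caller would declare `char password[MAX_PASSWORD_LEN + 1];` and pass it in. Because the maximum is known in advance, no call to `malloc()` is needed at all.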

FallenAvatar · Post by **FallenAvatar** » Fri Feb 19, 2016 4:20 am

manoj9372 wrote:Let's say you have been given the task of creating a small application, like a "strong password generator" in C, that can generate a password with a maximum length of 32 characters (letters, digits and symbols).

You need to pre-calculate the memory requirements for this process and state in the code that the process needs "x" amount of memory. How would you calculate the memory requirements and specify in the code that you need "x" amount of memory?

That is my question.

And by the way, I am a beginner.

1) That is still a basic programming question, and therefore does not belong here.
2) Not enough info provided to accomplish what you are asking.
3) See http://wiki.osdev.org/Required_Knowledge.

- Monk

OSDev.org