Understanding memory allocation for a process manually ?
As the title says, how can I understand how memory is allocated to a process?
So far I have read "Operating System Concepts" by Abraham Silberschatz. I understood what a process is, what its states are, its representation (i.e. the process control block), the scheduling algorithms, etc.
But I am trying to understand it in a practical way. Say, for example, I click notepad.exe in Windows and the system starts a new process: how exactly is the memory (i.e. base and limit registers, etc.) allocated? On what basis is memory allocated to a process? Does it remain static or is it dynamic throughout the process's execution?
I want to work through this manually for my own knowledge. Can someone explain with a small example?
Re: Understanding memory allocation for a process manually ?
Hi,
manoj9372 wrote: i want to know this process manually for my knowledge,can some mind explain with a small example ?
Think of it as 3 layers.
The lowest layer is managing physical memory. This can be done in multiple ways (possibly including using multiple ways at the same time, for different areas of the physical address space). This is typically built into the kernel.
The middle layer is managing virtual memory. This includes creating/destroying virtual address spaces; keeping track of what each area in a virtual address space is being used for; and mapping/unmapping things into virtual address spaces (physical pages, swap space, memory mapped files, shared memory areas, etc). This is typically built into the kernel.
The highest layer is things like heap (malloc/free, new/delete, garbage collectors, object pools, etc). This is typically built into a language's libraries (e.g. C standard library) or a language's run-time environment (e.g. Java virtual machine) or done by the process itself.
For an example; a program might call "malloc()" to allocate 12345 bytes, which might be implemented in a shared library. The library's "malloc()" might search a linked list for a large enough block of free memory to allocate but find out there isn't one, so it asks the kernel's virtual memory manager to increase the size of its pool, which might be like "Dear virtual memory manager, for all virtual pages from 0x12300000 to 0x12310000 in my virtual address space, change their type to the allocate on demand type". The virtual memory manager might respond by changing the requested virtual pages from whatever type they were before (probably the "unused" type) to the new type (the "allocate on demand" type); which might not involve allocating any physical RAM at all (but might involve allocating a physical page to use for a page table). After that; the library's "malloc()" might split the new (65536 byte) area into a 12345 byte area and a 53191 byte area, add the 53191 byte area to its linked list of free blocks for later, and return the virtual address of the 12345 byte area to the program.
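The splitting step can be sketched in C roughly as follows. This is only an illustration under stated assumptions: it tracks free blocks in a small descriptor array instead of a real in-band linked list, it ignores block headers and alignment so that the numbers match the example, and the names `free_block`, `pool_grow`, `pool_alloc` and `pool_largest_free` are hypothetical rather than taken from any real "malloc()" implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FREE_BLOCKS 16   /* arbitrary limit for this sketch */

/* One entry per free block: where it starts and how long it is. */
struct free_block {
    uintptr_t start;
    size_t    size;
};

static struct free_block free_list[MAX_FREE_BLOCKS];
static size_t free_count = 0;

/* Pretend the kernel just marked [start, start + size) as allocate on demand;
 * all the library does is remember the range as one free block. */
void pool_grow(uintptr_t start, size_t size) {
    assert(free_count < MAX_FREE_BLOCKS);
    free_list[free_count].start = start;
    free_list[free_count].size  = size;
    free_count++;
}

/* First fit: take the front of the first block that is big enough and leave
 * the remainder on the free list. Returns 0 if nothing fits. */
uintptr_t pool_alloc(size_t size) {
    for (size_t i = 0; i < free_count; i++) {
        if (free_list[i].size >= size) {
            uintptr_t address = free_list[i].start;
            free_list[i].start += size;   /* remainder starts after our block */
            free_list[i].size  -= size;   /* e.g. 65536 - 12345 = 53191 */
            if (free_list[i].size == 0) { /* exact fit: drop the empty entry */
                free_list[i] = free_list[free_count - 1];
                free_count--;
            }
            return address;
        }
    }
    return 0; /* the caller would now ask the virtual memory manager for more */
}

/* Size of the largest free block, for inspection. */
size_t pool_largest_free(void) {
    size_t best = 0;
    for (size_t i = 0; i < free_count; i++)
        if (free_list[i].size > best) best = free_list[i].size;
    return best;
}
```

With these definitions, `pool_grow(0x12300000, 65536)` followed by `pool_alloc(12345)` hands back 0x12300000 and leaves one 53191 byte free block behind it.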
When the program writes to a virtual page that has the "allocate on demand" type, it causes a page fault, and the kernel's page fault handler (virtual memory manager) allocates a physical page (from the physical memory manager), maps the physical page into the virtual address space and changes the virtual page's type to "usable RAM", then returns to the program that caused the page fault. The program doesn't know or care that this happened - as far as it knows, the page was always usable RAM. When "malloc()" added the remaining 53191 byte area to its memory pool it might have modified something in one of the virtual pages and caused a physical page to be allocated, but the majority of that 53191 byte area wouldn't be wasting any actual RAM (until it's used later). When the program starts storing data in the 12345 byte area it got from "malloc()" it might cause more physical pages to be allocated.
Of course maybe a program asked "malloc()" to allocate 987 GiB and only used 6 bytes. It doesn't matter (it's not like it'd actually use 987 GiB of RAM).
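The "allocate on demand" behaviour can be imitated with a toy simulation in C. Nothing here talks to a real kernel: the page table is a plain array, the 4096 byte page size and the 64 page address space are assumptions, and `vmm_reserve`, `write_byte`, `write_range` and `vmm_physical_pages` are hypothetical names made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u   /* assumed page size */
#define NUM_PAGES 64u     /* size of the toy virtual address space */

enum page_type { PAGE_UNUSED, PAGE_ALLOC_ON_DEMAND, PAGE_USABLE_RAM };

static enum page_type page_table[NUM_PAGES];   /* all PAGE_UNUSED initially */
static size_t physical_pages_in_use = 0;

/* "Dear virtual memory manager, change these pages to allocate on demand."
 * No physical memory is handed out here. */
void vmm_reserve(size_t first_page, size_t count) {
    assert(first_page + count <= NUM_PAGES);
    for (size_t i = 0; i < count; i++)
        page_table[first_page + i] = PAGE_ALLOC_ON_DEMAND;
}

/* Simulated write to one byte. Returns 0 on success, or -1 for an access to
 * an unused page (which a real kernel would treat as an invalid access). */
int write_byte(size_t virtual_address) {
    size_t page = virtual_address / PAGE_SIZE;
    if (page >= NUM_PAGES || page_table[page] == PAGE_UNUSED)
        return -1;
    if (page_table[page] == PAGE_ALLOC_ON_DEMAND) {
        /* The "page fault handler": back the page with RAM on first touch. */
        physical_pages_in_use++;
        page_table[page] = PAGE_USABLE_RAM;
    }
    return 0;
}

/* Write to every byte in [start, start + length). */
int write_range(size_t start, size_t length) {
    for (size_t i = 0; i < length; i++)
        if (write_byte(start + i) != 0) return -1;
    return 0;
}

size_t vmm_physical_pages(void) { return physical_pages_in_use; }
```

Reserving 16 pages (65536 bytes) with `vmm_reserve(0, 16)` leaves the physical page count at zero; writing 6 bytes raises it to one page, and writing all 12345 bytes from the start of the area raises it to four.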
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Understanding memory allocation for a process manually ?
What if there were 100 malicious programs that allocated huge memory areas and suddenly started actually writing to them? Could you guarantee that a little innocent program that happens to write to its own allocated area survives in the middle of the chaos those malicious programs are causing to the system (running out of physical memory, disk swap space and so on)?
Re: Understanding memory allocation for a process manually ?
Hi,
Antti wrote: What if there were 100 malicious programs that allocated huge memory areas, and suddenly start actually writing to them? Could you guarantee that a little innocent program that happens to write to its own allocated area survives in the middle of chaos those malicious programs are causing to the system, i.e. running out of physical memory and disk swap space and so on.
You either:
- decide not to support "overcommit"; and refuse to do anything that causes "amount of pretend RAM committed to" to exceed "amount of real RAM plus swap space".
- decide to support "overcommit", hope the problem doesn't happen, and provide a way to deal with it if it does happen (e.g. "OOM killer").
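The two policies differ only in whether a request is checked against "real RAM plus swap space" before it is accepted. A minimal sketch in C, assuming a hypothetical `commit_state` record and `commit_request` function (neither comes from any real kernel), and leaving the OOM handling itself out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct commit_state {
    uint64_t ram_bytes;        /* real RAM */
    uint64_t swap_bytes;       /* swap space */
    uint64_t committed_bytes;  /* "pretend RAM" promised so far */
    bool     allow_overcommit; /* which of the two policies is in force */
};

/* Returns true if the request is accepted and recorded. */
bool commit_request(struct commit_state *s, uint64_t bytes) {
    assert(s != NULL);
    uint64_t limit = s->ram_bytes + s->swap_bytes;
    if (!s->allow_overcommit) {
        /* Refuse anything that would push the total past RAM plus swap.
         * Written this way to avoid overflow in committed_bytes + bytes. */
        if (s->committed_bytes > limit || bytes > limit - s->committed_bytes)
            return false;
    }
    s->committed_bytes += bytes;
    return true;
}

/* With overcommit enabled: has more been promised than could ever be backed?
 * If so, something like an OOM killer is needed when the memory is touched. */
bool is_overcommitted(const struct commit_state *s) {
    assert(s != NULL);
    return s->committed_bytes > s->ram_bytes + s->swap_bytes;
}
```

With 1 GiB of RAM and 1 GiB of swap, the strict policy accepts two 1 GiB requests and refuses the next byte, while the overcommit policy accepts a 3 GiB request and simply ends up overcommitted.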
Cheers,
Brendan
Re: Understanding memory allocation for a process manually ?
Thanks for your detailed explanation, but one thing wasn't clear to me.
As you said, "a program might call "malloc()" to allocate 12345 bytes". I want to know on what basis the process knows that it needs "x" amount of memory.
That is where I need more light. Can you explain a bit there?
- Nutterts
Re: Understanding memory allocation for a process manually ?
manoj9372 wrote: Thanks for your detailed explanation,but one thing wasn't clear to me
As you said "a program might call "malloc()" to allocate 12345 bytes",i want to know on what basis the process know that it needs "x" amount of memory ?
that is where i need more light ,can you explain a bit there ?
Basically, the person who programmed the code running in that process decided it needs "x" amount of memory.
Brendan explained (in great detail) how memory allocation itself works, like how a car is able to move. But asking on what basis the process knows that it needs "x" amount of memory is like me asking you on what basis you know where to drive your car.
"Always code as if the guy who ends up maintaining it will be a violent psychopath who knows where you live." - John F. Woods
Failed project: GoOS - https://github.com/nutterts/GoOS
Re: Understanding memory allocation for a process manually ?
Hi,
manoj9372 wrote: Thanks for your detailed explanation,but one thing wasn't clear to me
As you said "a program might call "malloc()" to allocate 12345 bytes",i want to know on what basis the process know that it needs "x" amount of memory ?
that is where i need more light ,can you explain a bit there ?
When writing software, a programmer often realises they need memory to store something, and adds pieces of code like "temp_triangle = malloc( sizeof(triangle_structure) );" or "person_object = new person(name, age);" where the compiler calculates how large it is and generates code that does something like "malloc(1234);". In some cases the programmer asks for a specific number of bytes, like "char *myBuffer = malloc(1234);".
In either case, the number of bytes to allocate is determined (directly or indirectly) by what the programmer told the compiler they want.
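As a small illustration of "directly or indirectly", here is a sketch in C. The `point` and `triangle` structures and the helper names are made up for the example; the only real library calls are `malloc()` and `calloc()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical data the programmer decided the program needs. */
struct point    { float x, y; };
struct triangle { struct point corner[3]; };

/* Indirect: the programmer names the type and the compiler works out
 * sizeof(struct triangle) at compile time. */
struct triangle *new_triangle(void) {
    return malloc(sizeof(struct triangle));
}

/* The count is only known at run time, so the byte total is computed then. */
size_t triangle_bytes(size_t count) {
    return count * sizeof(struct triangle);
}

/* calloc() takes the element count and element size separately and
 * returns zeroed memory (or NULL on failure). */
struct triangle *new_triangle_array(size_t count) {
    assert(count > 0);
    return calloc(count, sizeof(struct triangle));
}
```

Either way the process itself never "works out" its needs in advance; each request is just a number that falls out of what the source code asked for at that point.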
Note that it's entirely possible for a programmer to do it themselves without using whatever the language provides. For example, for the code I'm working on now, it only needs to be able to free the most recently allocated memory (and most of the allocated memory is never freed); so I'm using something like this:
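```c
#include <stddef.h>
#include <stdint.h>

#define OK 0
/* Assumed kernel call: mark 'pages' pages starting at 'start' as allocate on demand. */
extern int alloc_virtual_pages(uintptr_t start, size_t pages);
extern const size_t page_size, extra_for_next_time;  /* assumed to be defined elsewhere */
static uintptr_t first_unused_byte, pool_top;        /* assumed to start at the page-aligned pool base */

/* Only the most recently allocated block can be freed. */
void free(void *address) {
    first_unused_byte = (uintptr_t)address;
}

void *allocate(size_t size) {
    size = (size + 7) & ~(size_t)7;  // Round size up to keep everything aligned
    if (first_unused_byte + size > pool_top) {
        // Round the shortfall (plus some slack for next time) up to whole pages
        size_t pages_wanted = (first_unused_byte + size - pool_top + extra_for_next_time + page_size - 1) / page_size;
        if (alloc_virtual_pages(pool_top, pages_wanted) != OK) return NULL;
        pool_top += pages_wanted * page_size;
    }
    void *address = (void *)first_unused_byte;
    first_unused_byte += size;
    return address;
}
```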
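To try the same idea in an ordinary user-space program, the kernel call can be swapped for a fixed static buffer. The sketch below does exactly that, so it is a stand-in rather than the code described above: the 64 KiB pool size and the names `bump_allocate` and `bump_free_last` are assumptions chosen for the demonstration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BYTES 65536u

/* Stand-in for pages obtained from the kernel (assumption for this demo).
 * Declared as uint64_t so the pool itself is 8-byte aligned. */
static uint64_t pool_storage[POOL_BYTES / 8];
static unsigned char *const pool = (unsigned char *)pool_storage;
static size_t first_unused = 0;   /* offset of the next free byte in pool */

void *bump_allocate(size_t size) {
    if (size > POOL_BYTES) return NULL;      /* also guards the rounding below */
    size = (size + 7) & ~(size_t)7;          /* keep every block 8-byte aligned */
    if (size > POOL_BYTES - first_unused) return NULL;
    void *address = &pool[first_unused];
    first_unused += size;
    return address;
}

/* Only valid for the most recently allocated block: everything from
 * 'address' upwards becomes free again. */
void bump_free_last(void *address) {
    unsigned char *p = address;
    assert(p >= pool && p <= pool + first_unused);
    first_unused = (size_t)(p - pool);
}
```

Allocating 10 bytes and then 1 byte places the second block 16 bytes after the first; freeing the second block and allocating again returns the same address, which is the "free only the most recent allocation" behaviour.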
Cheers,
Brendan
Re: Understanding memory allocation for a process manually ?
Antti wrote: What if there were 100 malicious programs that allocated huge memory areas, and suddenly start actually writing to them? Could you guarantee that a little innocent program that happens to write to its own allocated area survives in the middle of chaos those malicious programs are causing to the system, i.e. running out of physical memory and disk swap space and so on.
The only way to absolutely guarantee it is to have an OOM killer that queries the user for which task should be killed, because only the user knows what is "innocent" and what is "malicious".
Re: Understanding memory allocation for a process manually ?
linguofreak wrote: The only way to absolutely guarantee it is to have a OOM killer that queries the user for which task should be killed, because only the user knows what is "innocent" and what is "malicious".
I'm not sure how feasible that is. Are "normal" end users supposed to know what is malicious and what is innocent? Do they in practice?
Also, asking really only works in an interactive desktop type of scenario; how would you do it in a console/shell/prompt situation? Shells are generally expected to do what you tell them to do, not to come up with their own questions all of a sudden while you're using 'vi' or something.
What about servers? The system freezes, waiting for the user to answer what to kill, but server admins don't typically look at each server every minute of the day, so it might take quite a while before someone realizes they need to check the server. Of course SSH into the server might no longer work because SSH needs memory too, so the moment it tries to spawn it gets frozen due to the OOM condition. So you basically get yourself into a chicken and egg situation.
In extreme conditions the user might not even be able to open the task manager to check what to kill, because of the same chicken and egg problem.
The most obvious "solution" is to not allow over-commitment at all, but that wastes resources. I tried thinking about what solutions are used in the real world, like airlines which over-commit (overbooking), but for them the solution is a lot easier because they can just postpone flights for some passengers, since each flyer is independent of the others and thus can be individually postponed. The same doesn't work for memory, for two reasons:
1) Figuring out which process would fit in the memory (on the flight) is more difficult because all of the memory for a process is needed at the same time; you can't divide it. So smaller programs, possibly with memory restrictions (like the ELF header stating max memory), would help here.
2) Worst case, there's a 1000 t person; that person will never fit on a flight. Similarly, if you have 1GB RAM and 1GB swap, any process that needs 3GB of memory will never fit.
So at least with airlines I can't think of how real world solutions would help.
I suppose if each process had a min/max (like in the ELF header) as well as a progress indication and how the min/max changes with progress, then it might be possible to change the problem into a scheduling problem. But given how difficult scheduling is already, I'm not sure if the extra complexity really serves anyone.
Perhaps allow over-commitment, but keep track of it, and when used memory gets too close to the maximum available, start to reduce scheduling and let individual processes complete to free their memory until you get below some threshold, then go back to the normal "over-commit allowed" status.
Re: Understanding memory allocation for a process manually ?
Hi,
LtG wrote: Most obvious "solution" is to not allow over commitment at all but that wastes resources.
Yes, but...
From this post:
"When the kernel is running out of free physical RAM, it sends a "free some RAM" message (including how critical the situation is) out to processes that have requested it. VFS responds by freeing "less important" cached file data. A file system might do the same for its meta-data cache. Web browser might respond by dumping cached web page resources. DNS system might do the same for its domain name cache. A word processor might have enough data to allow the user to undo the last 5000 things they've done, and might respond by reducing that so the user can only undo the last 3000 things they've done."
With a system like this; if you don't allow over-commitment you don't waste resources as much - those resources are still being used for caches and other things.
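A rough sketch of what such a "free some RAM" message could look like from a process's point of view, in C. Everything here is hypothetical: the `pressure_level` values, the registration function and the broadcast are invented for illustration (they do not correspond to any existing OS API), and the 100 entry floor for the critical level is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

enum pressure_level { PRESSURE_LOW, PRESSURE_HIGH, PRESSURE_CRITICAL };

/* A handler returns how many cache entries it gave up. */
typedef size_t (*pressure_handler)(enum pressure_level level);

#define MAX_HANDLERS 8
static pressure_handler handlers[MAX_HANDLERS];
static size_t handler_count = 0;

/* A process (or the VFS, a DNS cache, ...) asks to be told about memory pressure. */
int register_pressure_handler(pressure_handler h) {
    if (h == NULL || handler_count == MAX_HANDLERS) return -1;
    handlers[handler_count++] = h;
    return 0;
}

/* Stand-in for the kernel broadcasting its "free some RAM" message. */
size_t broadcast_pressure(enum pressure_level level) {
    size_t total_freed = 0;
    for (size_t i = 0; i < handler_count; i++) {
        assert(handlers[i] != NULL);
        total_freed += handlers[i](level);
    }
    return total_freed;
}

/* Example subscriber: a word processor trimming its undo history. */
static size_t undo_entries = 5000;

size_t word_processor_on_pressure(enum pressure_level level) {
    size_t keep = (level == PRESSURE_CRITICAL) ? 100
                : (level == PRESSURE_HIGH)     ? 3000
                                               : undo_entries;
    if (undo_entries <= keep) return 0;
    size_t dropped = undo_entries - keep;
    undo_entries = keep;
    return dropped;
}

size_t undo_entries_kept(void) { return undo_entries; }
```

After registering `word_processor_on_pressure`, a `PRESSURE_HIGH` broadcast trims the undo history from 5000 to 3000 entries, and a second identical broadcast frees nothing more.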
To me; this is the right way to do things - don't allow over-commit, except for resources that can be revoked.
Of course most existing OSs (and things like POSIX) don't have any way for the kernel to ask for resources back, so they're stuck with the "waste resource or over-commit" compromise.
Cheers,
Brendan
Re: Understanding memory allocation for a process manually ?
linguofreak wrote: The only way to absolutely guarantee it is to have a OOM killer that queries the user for which task should be killed, because only the user knows what is "innocent" and what is "malicious".
LtG wrote: I'm not sure how feasible that is, are "normal" end users supposed to know what is malicious and what is innocent? Do they in practice?
OK, so even then it's not an absolute guarantee, but assuming an experienced user who knows what programs they want running, it is. In any case, the claim is that it is the only way to guarantee that innocent processes survive and malicious ones die, not that it is feasible (at least in all cases).
LtG wrote: Also, asking really only works on a interactive desktop type of scenario, or how would you do it on a console/shell/prompt situation? Shells are generally expected to do what you tell them to do, not to come up with their own questions all of a sudden while you're using 'vi' or something.
Actually, doing it on a console is the use case I'd be more confident of being able to do it in. Doing it in a GUI would be much more likely to end up allocating memory in non-obvious ways, while our theoretical user-querying OOM killer has to be able to do its job without allocating memory. Ideally, its memory usage would be O(1) in the number of processes. If it used O(f(n)) memory (for an arbitrary function f(n) and number of processes n), it would have to receive memory for its recordkeeping on a process as the first step the system took in starting that process.
As for system messages coming up while you're working in a terminal, Linux will print all sorts of errors to the system console regardless of what you may be doing on that virtual terminal. My box is currently spewing errors about a failed CD drive that I haven't had the time to open up the machine to disconnect. An OOM situation is next thing to a kernel-panic / bluescreen, both of which will happily intrude while you're doing other things, so I don't see any big problem with the OOM killer doing the same thing.
LtG wrote: What about servers? The system freezes, waiting for user to answer what to kill, but server admins don't typically look at each server every minute of the day, so it might take quite a while before someone realizes to check the server.
The OS doesn't need to totally grind to a halt. It will have to stall on outstanding allocations until the user makes a decision or a process ends on its own or otherwise returns memory to the OS, but processes that aren't actively allocating (or are allocating from free chunks in their own heaps rather than going to the OS for memory) can continue running. Now, if a mission-critical process ends up blocking on an allocation, yes, you have a problem, and a user OOM killer might not be appropriate for situations where this is likely to cause more trouble than an OOM situation generally does in the first place.
LtG wrote: Of course SSH into the server might no longer work because SSH too needs memory, so the moment it tries to spawn it gets frozen due to OOM condition. So you basically get yourself in a chicken and egg situation.
For a server, your OOM killer could actually make use of a reverse-SSH protocol where the OOM killer makes an *outbound* ssh-like connection, using pre-allocated memory, to a machine running server management software, which could then send alerts to admin's cell phones, take an inbound SSH connection from an administrator's workstation (or phone), and pass input from the administrator to the OOMed server and output from the server back to the administrator.
LtG wrote: In extreme conditions the user might not even be able to open the task manager to check what to kill because of the chicken-egg.
This can be solved by the OOM killer presenting its own task list kept in pre-allocated memory.
LtG wrote: I suppose if each process had min/max (like in the ELF header) as well as progress indication and how to min/max changes due to progress, then it might be possible to change the problem into a scheduling problem.. But given how difficult scheduling is already, I'm not sure if the extra complexity really serves anyone.
Not every process has bounded memory requirements. Not every process has bounded runtime. Daemons are generally supposed to keep running indefinitely.
LtG wrote: Perhaps allowing over commitments, but keeping a track of it and when used memory gets too close to max available you need to start to lessen scheduling and allowing individual processes to complete to free their memory until you get below some threshold and then go back into normal over commit allowed status.
One useful feature would be a facility for processes to provide the kernel with a list of free pages in their heaps so that the kernel could reclaim those pages if needed.
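The idea of an OOM killer that keeps its own task list in pre-allocated memory could be sketched in C along these lines. This is a simplified variant with a fixed-size table reserved up front, so the number of trackable tasks is capped; the table size, the record layout and the `oom_*` function names are all assumptions made for the illustration, not part of any existing kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TASKS 256   /* fixed when the system is built (assumption) */
#define NAME_LEN  32

struct task_record {
    int    in_use;
    int    pid;
    size_t pages;            /* pages currently charged to the task */
    char   name[NAME_LEN];
};

/* Reserved up front, so listing tasks during OOM needs no allocation. */
static struct task_record task_table[MAX_TASKS];

/* Called as the first step of starting a process. If there is no slot for
 * the record, the caller should refuse to start the process at all. */
int oom_track_task(int pid, const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < MAX_TASKS; i++) {
        if (!task_table[i].in_use) {
            task_table[i].in_use = 1;
            task_table[i].pid = pid;
            task_table[i].pages = 0;
            strncpy(task_table[i].name, name, NAME_LEN - 1);
            task_table[i].name[NAME_LEN - 1] = '\0';
            return 0;
        }
    }
    return -1;
}

/* Record that 'pages' more pages now belong to the task. */
void oom_charge_pages(int pid, size_t pages) {
    for (size_t i = 0; i < MAX_TASKS; i++) {
        if (task_table[i].in_use && task_table[i].pid == pid) {
            task_table[i].pages += pages;
            return;
        }
    }
}

/* The first candidate to show the user: the task holding the most pages.
 * Returns its pid, or -1 if no tasks are tracked. */
int oom_largest_task(void) {
    int best_pid = -1;
    size_t best_pages = 0;
    for (size_t i = 0; i < MAX_TASKS; i++) {
        if (task_table[i].in_use &&
            (best_pid == -1 || task_table[i].pages > best_pages)) {
            best_pid = task_table[i].pid;
            best_pages = task_table[i].pages;
        }
    }
    return best_pid;
}
```

Because the table never grows, presenting the list to the user (or sending it over a pre-allocated connection) cannot itself fail for lack of memory; the cost is a hard limit on the number of tasks.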
Re: Understanding memory allocation for a process manually ?
Nutterts wrote: Basicly the person who programmed the code running in that process decided it needs "x" amount of memory.
Brendan explained (very detailed) how memory allocation itself works. Like how a car is able to move. But asking on what basis the process knows that it needs "x" amount of memory is like me asking you on what basis you know where to drive your car.
I didn't ask that question in a bad sense; I just wanted to understand how a programmer calculates the memory requirements for a process.
That's what my question is; kindly don't take it in a bad manner...
Re: Understanding memory allocation for a process manually ?
manoj9372 wrote: I didn't asked that question in a bad sense,i just wanted understand how a programmer calculates the MEMORY REQUIREMENTS for a process ?
that's what my question is,kindly don't take it in bad manner...
That is a basic programming question. One better asked in a basic C or C++ Programming forum.
- Monk
Re: Understanding memory allocation for a process manually ?
Monk wrote: That is a basic programming question. One better asked in a basic C or C++ Programming forum.
Let's say you have been given the task of creating a small application like a "strong password generator" in C that can generate a password with a maximum length of 32 characters (letters, digits and symbols).
You need to pre-calculate the memory requirements for this process and state in the code that the process needs "x" amount of memory. How would you calculate the memory requirements and specify in the code that you need "x" amount of memory?
That is my question.
And by the way, I am a beginner.
Re: Understanding memory allocation for a process manually ?
manoj9372 wrote: let's say you have been given a task that you need to create a small application like a "strong password generator" in "C" that can generate a password with a maximum length of up to 32 characters/no's/symbols/digits .
you need to pre-calculate the memory requirements for this process and write in the code that this process needs "x" amount of memory,how would you calculate the memory requirements for this process and specify in the code that you need "x" amount of memory ?
that is my question..
and by the by i am a beginner
1) That is still a basic programming question, and therefore does not belong here.
2) Not enough info provided to accomplish what you are asking.
3) See http://wiki.osdev.org/Required_Knowledge.
- Monk