Hi:
I am trying to understand how the OS catches every illegal memory access on a system that uses paging (32-bit x86, paging enabled).
To be more specific, let's suppose I have a tiny app which is just 1 page in size. Considering that an MS OS takes the upper half of the 'virtual memory address space' (VMAS) and that my tiny EXE occupies just 4 KB of the lower half of the VMAS, then:
1) How does the OS realize that there is an 'illegal memory reference/access' going on when my code tries to write to a memory location outside my own EXE's 4 KB? (Obviously, that pointer wasn't obtained from a 'malloc' or similar call.)
2) How are the Page Tables managed for that tiny EXE? Does the OS have to define all 1 M Page Entries (minus 1 Page Entry) with a 'Non-Present' attribute set and 'System' owned when that process is created?
Any advice or comment is welcome.
How OS catches illegal memory references at paging scheme?
- sleephacker
- Member
- Posts: 97
- Joined: Thu Aug 06, 2015 6:41 am
- Location: Netherlands
Re: How OS catches illegal memory references at paging schem
fante wrote: 1) How does the OS realize that there is an 'illegal memory reference/access' going on when my code tries to write to a memory location outside my own EXE's 4 KB? (Obviously, that pointer wasn't obtained from a 'malloc' or similar call.)
By setting the right flags in the page table and page directory entries, the OS can define which regions the app isn't allowed to access. If the app does try to access any kernel or non-present memory, the CPU raises a Page Fault exception, which the OS can then handle. From the error code on the stack (and scheduler information if you're multitasking) the OS knows the Page Fault was generated by an application and it also knows which application, so you can then choose how to handle it (probably just shut it down).
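To make the "error code on the stack" part more concrete, here is a minimal sketch in C of how a handler might decode it. The low three bits of the x86 page fault error code are architectural; the struct and function names are just made up for illustration, and reading the faulting address from CR2 is left out because it needs privileged code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the error code the CPU pushes for a page fault (vector 14). */
#define PF_ERR_PRESENT 0x1u /* 0 = page was not present, 1 = protection violation */
#define PF_ERR_WRITE   0x2u /* 0 = the access was a read, 1 = it was a write */
#define PF_ERR_USER    0x4u /* 0 = fault happened in supervisor mode, 1 = in user mode */

/* Hypothetical helper type, not taken from any particular kernel. */
struct pf_info {
    bool present; /* the page was mapped, but the access was not allowed */
    bool write;   /* the faulting access was a write */
    bool user;    /* the faulting code was running in user mode */
};

/* Split the raw error code into named fields. */
static struct pf_info decode_pf_error(uint32_t error_code)
{
    struct pf_info info;
    info.present = (error_code & PF_ERR_PRESENT) != 0;
    info.write   = (error_code & PF_ERR_WRITE) != 0;
    info.user    = (error_code & PF_ERR_USER) != 0;
    return info;
}
```

For example, an error code of 0x6 would mean a user-mode write to a non-present page, and 0x7 a user-mode write that hit a present page it wasn't allowed to touch (such as a kernel page).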
fante wrote: 2) How are the Page Tables managed for that tiny EXE? Does the OS have to define all 1 M Page Entries (minus 1 Page Entry) with a 'Non-Present' attribute set and 'System' owned when that process is created?
This depends on how you write your OS, but the most memory-efficient way is to have one Page Directory per process (or per virtual address space), and set any directory entry that is not used by your kernel or your application to non-present, thus avoiding the need to fill in all 1 M table entries. In this case you would need 1 Page Directory for every process, plus one or more (in the case of tinyApp just one) Page Table per process, plus some page tables for the kernel (these can be the same in every address space/process).
Re: How OS catches illegal memory references at paging schem
Thanx sleephacker.
Some comments about your answers (assuming 32bit, x86):
1) So, in order to catch an illegal reference for 'unallocated' memory, the VMAS for the 'App' should be marked as 'User' & 'Non-Present' and the rest of the VMAS should be marked as 'Kernel' & 'Non-Present'. That makes sense.
2) So, it is enough to define the unique 4KB of Page Directory Entries and some Page Table Entries too with the proper attributes as mentioned in 1). That makes sense too.
Again thanx...
- sleephacker
Re: How OS catches illegal memory references at paging schem
fante wrote: 1) So, in order to catch an illegal reference for 'unallocated' memory, the VMAS for the 'App' should be marked as 'User' & 'Non-Present' and the rest of the VMAS should be marked as 'Kernel' & 'Non-Present'
Yes, but not all of it should be marked as non-present, for example the pages used by the kernel should be marked as present, and the pages used by the application too.
fante wrote: 2) So, it is enough to define the unique 4KB of Page Directory Entries and some Page Table Entries too with the proper attributes as mentioned in 1)
Yes.
So, for example, if you have a kernel loaded at physical address 0x00100000 that takes up 16KB of memory and you want it mapped to virtual address 0x80000000, and you also have an application loaded at 0x00200000 that takes up 4KB and you want to map it to 0x00000000, this is what your paging structures will look like:
Page Directory
- Entry #0: User, Present, pointer to application's Page Table
- Entries #1 ... #511: User/Kernel (doesn't matter because they aren't present), Non-Present
- Entry #512: Kernel, Present, pointer to kernel Page Table
- Entries #513 ... #4095: User/Kernel (again doesn't matter), Non-Present
Application Page Table
- Entry #0: User, Present, pointer to 0x00200000
- Entries #1 ... 4095: Non-Present
Kernel Page Table
- Entry #0: Kernel, Present, pointer to 0x00100000
- Entry #1: Kernel, Present, pointer to 0x00101000
- Entry #2: Kernel, Present, pointer to 0x00102000
- Entry #3: Kernel, Present, pointer to 0x00103000
- Entries #4 ... 4095: Non-Present
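As a rough sketch of the layout above in C, assuming plain 32-bit x86 paging with 4 KiB pages and 1024 entries per table, and assuming the kernel sits at virtual 0x80000000: a virtual address splits into a directory index (top 10 bits) and a table index (next 10 bits), so under these assumptions the kernel's directory entry works out to index 512. The function names, the writable flag, and the physical addresses passed in for the page tables are all hypothetical.

```c
#include <stdint.h>

#define ENTRIES_PER_TABLE 1024u   /* 32-bit x86, 4 KiB pages, no PAE */
#define PAGE_SIZE         0x1000u

#define PG_PRESENT  0x001u  /* bit 0: entry is valid */
#define PG_WRITABLE 0x002u  /* bit 1: writes allowed (an assumption here) */
#define PG_USER     0x004u  /* bit 2: accessible from user mode */

/* Top 10 bits select the page directory entry. */
static uint32_t pd_index(uint32_t vaddr) { return vaddr >> 22; }

/* Next 10 bits select the page table entry. */
static uint32_t pt_index(uint32_t vaddr) { return (vaddr >> 12) & 0x3FFu; }

/* Combine a 4 KiB-aligned physical address with flag bits. */
static uint32_t make_entry(uint32_t phys, uint32_t flags)
{
    return (phys & 0xFFFFF000u) | flags;
}

/* Fill the three structures from the example. An all-zero entry has the
 * present bit clear, so everything not set explicitly is non-present. */
static void build_example(uint32_t pd[ENTRIES_PER_TABLE],
                          uint32_t app_pt[ENTRIES_PER_TABLE],
                          uint32_t kernel_pt[ENTRIES_PER_TABLE],
                          uint32_t app_pt_phys, uint32_t kernel_pt_phys)
{
    for (uint32_t i = 0; i < ENTRIES_PER_TABLE; i++)
        pd[i] = app_pt[i] = kernel_pt[i] = 0;

    /* Application: one 4 KiB page at virtual 0x00000000 -> physical 0x00200000. */
    app_pt[pt_index(0x00000000u)] =
        make_entry(0x00200000u, PG_PRESENT | PG_WRITABLE | PG_USER);

    /* Kernel: four pages at virtual 0x80000000 -> physical 0x00100000. */
    for (uint32_t i = 0; i < 4; i++) {
        uint32_t off = i * PAGE_SIZE;
        kernel_pt[pt_index(0x80000000u + off)] =
            make_entry(0x00100000u + off, PG_PRESENT | PG_WRITABLE);
    }

    /* Directory entries point at the physical addresses of the two tables. */
    pd[pd_index(0x00000000u)] =
        make_entry(app_pt_phys, PG_PRESENT | PG_WRITABLE | PG_USER);
    pd[pd_index(0x80000000u)] =
        make_entry(kernel_pt_phys, PG_PRESENT | PG_WRITABLE);
}
```

The point of the sketch is that only two directory entries and five table entries are ever filled in; every other slot stays zero and therefore faults on access.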
Re: How OS catches illegal memory references at paging schem
Hi, sleephacker:
That is a pretty helpful 'dirty hands' example! All fits superb except the 'Entries' going from 1 to 4096 instead of 1 to 1024. But that is a minor error. Thanx again.
Now thinking about it, a little doubt came along...
Suppose that during the App's execution a random pointer tries to access a foreign memory cell. As you explained before, the Page Fault handler will get the chance to execute due to the Non-Present attribute of the related memory page.
How does the Page Fault handler get to know that Non-Present, User page is due to an unallocated page or a swapped page?
I am assuming the Non-Present pages were marked as 'User'. Also I am assuming that my App is bigger than 4KB with only 1 Page Present in memory due to RAM memory constraints for the moment (demand paging).
Is it because there should be some other attribute at the Non-Present Page Entry (like Allocated/Unallocated)? If you could tell how this issue is done with popular OS's or one you know best it will be helpful.
- sleephacker
Re: How OS catches illegal memory references at paging schem
fante wrote: All fits superb except the 'Entries' going from 1 to 4096 instead of 1 to 1024
Whoops. Actually I prefer counting from 0 to 1023.
fante wrote: How does the Page Fault handler get to know that Non-Present, User page is due to an unallocated page or a swapped page?
Depends on the OS. One way to do it is to store this information in the process's structure, or you could set a flag in the page table entry (AFAIK the CPU doesn't care about any bit other than the present bit if the entry is set to non-present).
Re: How OS catches illegal memory references at paging schem
Hi,
fante wrote: How does the Page Fault handler get to know that Non-Present, User page is due to an unallocated page or a swapped page? I am assuming the Non-Present pages were marked as 'User'. Also I am assuming that my App is bigger than 4KB with only 1 Page Present in memory due to RAM memory constraints for the moment (demand paging). Is it because there should be some other attribute at the Non-Present Page Entry (like Allocated/Unallocated)? If you could tell how this issue is done with popular OS's or one you know best it will be helpful.
Everyone does this differently. I'll describe the way I do it.
For each "present" page I figure out (enumerate) all the different types that the page could be. Example:
- present normal RAM
- present and "pinned" so it can't be sent to swap space (e.g. part of disk driver needed to access swap space)
- present and part of a memory mapped device (where the page should never be "freed" in the normal way)
- present but part of allocate on write area
- present but part of a copy on write area
- present but part of a memory mapped file
- present and a copy of the page already exists in swap space
For each "not present" page it's similar - figure out (enumerate) all the different types that the page could be. However, in this case there are many (31 or more depending on what sort of paging you're using) "available for OS use" flags (every bit except the "present" flag itself is an "available" flag). This means that you can split the (31-bit or larger) value into ranges, where it might end up being vaguely like:
- 0x00000000 = not present, not allocated page
- 0x00000001 to 0x0000000F = reserved and/or other stuff
- 0x00000010 to 0x0FFFFFFF = not present, part of memory mapped file, entry number in list of "memory mapped file area" structures is "value - 0x00000010"
- 0x10000000 to 0x7FFFFFFF = not present read/write normal page stored in swap space, location in swap space is "value - 0x10000000"
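The ranges above could be decoded along these lines. This is only a sketch in C: it assumes 32-bit entries where bit 0 is the "present" flag and the 31-bit value lives in bits 1 to 31, which is one possible reading of the scheme; the type and function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical categories matching the ranges listed above. */
enum np_kind {
    NP_UNALLOCATED, /* value 0x00000000 */
    NP_RESERVED,    /* values 0x00000001 to 0x0000000F */
    NP_MAPPED_FILE, /* values 0x00000010 to 0x0FFFFFFF */
    NP_SWAPPED      /* values 0x10000000 to 0x7FFFFFFF */
};

struct np_info {
    enum np_kind kind;
    uint32_t index; /* file-area entry number or swap location, else the raw value */
};

/* Decode a not-present entry. The caller is expected to have checked
 * that bit 0 (present) is clear before calling this. */
static struct np_info decode_not_present(uint32_t entry)
{
    struct np_info info;
    uint32_t value = entry >> 1; /* drop the present bit, keep the 31-bit value */

    if (value == 0) {
        info.kind = NP_UNALLOCATED;
        info.index = 0;
    } else if (value <= 0x0000000Fu) {
        info.kind = NP_RESERVED;
        info.index = value;
    } else if (value <= 0x0FFFFFFFu) {
        info.kind = NP_MAPPED_FILE;
        info.index = value - 0x00000010u;
    } else {
        info.kind = NP_SWAPPED;
        info.index = value - 0x10000000u;
    }
    return info;
}

/* Build the not-present entry for a page stored at the given swap location.
 * Valid for swap_slot up to 0x6FFFFFFF so the value stays within 31 bits. */
static uint32_t encode_swapped(uint32_t swap_slot)
{
    return (swap_slot + 0x10000000u) << 1; /* bit 0 stays clear = not present */
}
```

With something like this, the page fault handler can tell "never allocated" (kill the process) apart from "swapped out" (fetch the page and retry) just by looking at the entry itself.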
Also note that the same "enumerate types and encode type in available bits" approach can be used for page directory entries (and page directory pointer table entries, and so on). This means that you can have (e.g.) a 4 MiB area marked as "not present, part of memory mapped file" without needing to allocate/use all the page tables.
Mostly (with a little cleverness); the page fault handler can figure out exactly what it needs to do in every possible case; by using the paging structures (page tables, page directories, etc) themselves (which are mostly unavoidable), plus some kind of list of "memory mapped file area" structures, plus something that manages swap space that remembers which pages in swap space are used for what (for the "present page where copy of the page already exists in swap space" case only).
Note that for my OS I support "area is part of a memory mapped file where all pages are in VFS cache"; but don't support memory mapped files where the pages are on disk (it's bad for fault tolerance in the "page fault handler gets read error from disk drive" case), don't support shared memory between processes (it's bad for performance for distributed systems) and don't support "fork-style copy on write" (also bad for performance for distributed systems). This tends to make virtual memory management less complicated; partly because reference counting (tracking how many processes are using a page) is only needed for the "area is part of a memory mapped file where all pages are in VFS cache" case and can therefore be delegated (done by VFS and not done by virtual memory management).
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: How OS catches illegal memory references at paging schem
Thanx sleephacker and Brendan for those clear and sound insights.