Page structures

Luis · Post by **Luis** » Sun Apr 28, 2013 4:51 pm

Hi,

I am rewriting my virtual memory manager and weighing the advantages and disadvantages of keeping the page structures of all processes together in a global address space versus keeping each process's page structures in its own address space. I am using a 2 MiB page size, and each table is, as you all know, 4 KiB, so I think I should use a global address space for all processes' page structures: if I keep each structure in its own address space, a lot of memory would be wasted on small processes with only a few tables. Despite all that, I am afraid I might face problems later in development concerning performance and security. What sort of problems might I run into?

Your thoughts much appreciated,
Luís

Brendan · Post by **Brendan** » Sun Apr 28, 2013 7:31 pm

Hi,

Luis wrote:I am rewriting my virtual memory manager and weighing the advantages and disadvantages of keeping the page structures of all processes together in a global address space versus keeping each process's page structures in its own address space. I am using a 2 MiB page size, and each table is, as you all know, 4 KiB, so I think I should use a global address space for all processes' page structures: if I keep each structure in its own address space, a lot of memory would be wasted on small processes with only a few tables. Despite all that, I am afraid I might face problems later in development concerning performance and security. What sort of problems might I run into?

None of this costs you (much) memory, because you have to allocate the RAM for the paging structures regardless of where, or how many times, that RAM is mapped. What it does cost you is virtual address space.

Now; I'm not too sure what sort of paging you're talking about - for long mode there's so much space that it's hard to worry about it, but with "32-bit with PAE" (which I suspect you're using due to your previous posts) it could matter a lot. For example, imagine if you're running 128 processes on a 32-bit kernel (where user space is 3 GiB and kernel space is 1 GiB) and you need to reserve space for each process' paging structures. For PAE you'd need to reserve 6 MiB for each process, so 128 processes adds up to 768 MiB, which means it'd consume 75% of kernel space. Under the same conditions but with paging structures mapped into the process's space instead, it'd cost 6 MiB of each process - each process still has 3066 MiB of space to use and the kernel still has 1 GiB of space to use.
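The figures in that example can be checked with a short sketch. It assumes 4 KiB pages and 8-byte PAE entries (512 per table), and it counts only page tables, since the few page directories and the PDPT add just a handful of KiB. The function and constant names are illustrative and not taken from any kernel.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

PAGE_SIZE = 4 * KIB        # size of a page and of one paging structure
ENTRIES_PER_TABLE = 512    # PAE entries are 8 bytes, so 4096 / 8

def page_table_bytes(user_space_bytes):
    """Bytes of page tables needed to map all of user space with 4 KiB pages."""
    bytes_per_table = PAGE_SIZE * ENTRIES_PER_TABLE   # one page table maps 2 MiB
    tables = user_space_bytes // bytes_per_table
    return tables * PAGE_SIZE

per_process = page_table_bytes(3 * GIB)   # space to reserve for one process
reserved = 128 * per_process              # space to reserve for 128 processes

print(per_process // MIB)                 # MiB reserved per process
print(reserved // MIB)                    # MiB reserved in total
print(100 * reserved // GIB)              # percentage of a 1 GiB kernel space
print((3 * GIB - per_process) // MIB)     # MiB left per process if mapped in user space
```

Note that these are reservations of address space; the RAM behind them is only whatever page tables each process has actually allocated.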

So the question is; under which scenarios might the kernel need to be able to access the paging structures for a process that isn't currently running? If you create a list of them (most likely including things like "copy on write" and swap space management), you can create a work-around for everything on that list.

Cheers,

Brendan

Luis · Post by **Luis** » Sun Apr 28, 2013 8:15 pm

Hi,

Sorry, I did not mention that I am using long mode. To make myself clearer, I will draw the issue I am talking about. Kernel/user space balance is not an issue because it grows and shrinks dynamically, I have not implemented swap yet, and the kernel can access the paging structures of processes with all threads blocked either way (to clean up or close the process, for example). The main reason I am considering the global pages for structures approach is RAM usage.

----------------
| process 1 |
|2 MiB page| -> process 1 PML4, PDPT, etc
----------------
| process 2 |
|2 MiB page| -> process 2 PML4, PDPT, etc
----------------
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
----------------
Total RAM used: 4 MiB

----------------
| . global . |
|2 MiB page| -> process 1 + process 2 PML4, PDPT, etc
----------------
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
| . . . . . . . |
----------------
Total RAM used: 2 MiB

Now, I am not sure if it will work in the long run or what other problems it might cause me. To my eyes this approach looks like a hack.

Mikemk · Post by **Mikemk** » Sun Apr 28, 2013 8:27 pm

Luis wrote:To my eyes this approach looks like a hack

All good things start out as a hack by somebody who thinks their way is better, and then companies implement those hacks into the official feature list.
Don't believe me? Look at multitasking operating systems, software themes, difficulty levels and saving in video games, scoring in video games, Lamborghinis, etc, etc.

In my opinion, a well designed hack is far better than a badly designed feature.

Brendan · Post by **Brendan** » Sun Apr 28, 2013 8:53 pm

Hi,

Luis wrote:The main reason I am considering the global pages for structures approach is RAM usage.

Um?

You wouldn't allocate a 2 MiB page of RAM and map that into the virtual address space regardless of whether you need it or not, then use the "pre-allocated for no reason" RAM for page tables, etc. Instead, you'd allocate RAM for page tables, etc when the paging structures are actually needed, and map what you had to allocate for paging structures anyway into your virtual address space.

It shouldn't cost RAM or affect RAM usage. It should only cost space.

For example; for long mode, if a process can use the full 131072 GiB of user space (like it should), you would need to reserve 256 GiB of space to map the process' paging structures, but it will cost 0 bytes of RAM to do this.

Also note that if you only reserve 2 MiB of space for a process' paging structures, then that process will only be able to use 1 GiB of user space. In this case there isn't much point bothering with long mode (most 32-bit OSs let processes use more space than that); but it still costs you 0 bytes of RAM to create the mapping.
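As a rough check of those two numbers, the sketch below assumes 4 KiB pages and 512-entry tables and counts only the lowest-level page tables (the higher levels add comparatively little). The helper names are made up for illustration.

```python
KIB, MIB, GIB = 2**10, 2**20, 2**30

TABLE_SIZE = 4 * KIB                  # every paging structure is one 4 KiB frame
ENTRIES = 512                         # 8-byte entries per table
MAPPED_PER_TABLE = ENTRIES * 4 * KIB  # one page table maps 2 MiB of user space

def window_for_page_tables(user_space_bytes):
    """Virtual space needed to map every page table covering user_space_bytes."""
    return user_space_bytes // MAPPED_PER_TABLE * TABLE_SIZE

def user_space_covered(window_bytes):
    """User space reachable when only window_bytes are reserved for page tables."""
    return window_bytes // TABLE_SIZE * MAPPED_PER_TABLE

print(window_for_page_tables(131072 * GIB) // GIB)   # GiB of space to reserve
print(user_space_covered(2 * MIB) // GIB)            # GiB usable with a 2 MiB window
```

Both results describe virtual address space only; nothing here allocates RAM.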

Cheers,

Brendan

linguofreak · Post by **linguofreak** » Mon Apr 29, 2013 1:32 am

Brendan wrote: Also note that if you only reserve 2 MiB of space for a process' paging structures, then that process will only be able to use 1 GiB of user space. In this case there isn't much point bothering with long mode (most 32-bit OSs let processes use more space than that); but it still costs you 0 bytes of RAM to create the mapping.

He's using 2 MiB pages exclusively, so he actually gets 512 GiB. This is also why he's so worried about memory usage with multiple processes running: Just to set up a PML4 table, he has to reserve 2 MiB of physical memory for that process instead of just 4 KiB. (Of course, after that, the process can use 512 GiB before it needs to allocate anything more for paging structures).

Then as soon as any userspace code or data is loaded the process eats another 2 MiB.

@Luis:
Since you're operating in long mode, you won't even have segmentation to offer protection, so you'll get all the stability and security problems that DOS had (well, not all. While userspace programs won't be protected from each other, the kernel will be protected from userspace).

The real solution here is not to use 2 MiB pages exclusively.

FallenAvatar · Post by **FallenAvatar** » Mon Apr 29, 2013 1:42 am

My take on this, given my lack of complete understanding, would be to use 2 MiB pages only where possible, i.e. during program load and initialization, then use smaller pages thereafter.

- Monk

Luis · Post by **Luis** » Mon Apr 29, 2013 5:42 am

Hi,

linguofreak wrote:He's using 2 MiB pages exclusively, so he actually gets 512 GiB. This is also why he's so worried about memory usage with multiple processes running: Just to set up a PML4 table, he has to reserve 2 MiB of physical memory for that process instead of just 4 KiB. (Of course, after that, the process can use 512 GiB before it needs to allocate anything more for paging structures).

Then as soon as any userspace code or data is loaded the process eats another 2 MiB.

Exactly.

linguofreak wrote:@Luis:
Since you're operating in long mode, you won't even have segmentation to offer protection, so you'll get all the stability and security problems that DOS had (well, not all. While userspace programs won't be protected from each other, the kernel will be protected from userspace).

Since the paging structures will have privilege level 0 and will only be touched by the memory manager, I believe the memory manager could manage multiple processes' tables together. I guess I need to be extremely careful, because a single bug could break all processes. If there is no other issue, then I guess it is the solution that will fit my kernel best, although it is unconventional.

Brendan · Post by **Brendan** » Mon Apr 29, 2013 5:51 am

Hi,

linguofreak wrote:
Brendan wrote: Also note that if you only reserve 2 MiB of space for a process' paging structures, then that process will only be able to use 1 GiB of user space. In this case there isn't much point bothering with long mode (most 32-bit OSs let processes use more space than that); but it still costs you 0 bytes of RAM to create the mapping.
He's using 2 MiB pages exclusively, so he actually gets 512 GiB. This is also why he's so worried about memory usage with multiple processes running: Just to set up a PML4 table, he has to reserve 2 MiB of physical memory for that process instead of just 4 KiB. (Of course, after that, the process can use 512 GiB before it needs to allocate anything more for paging structures).

Ah, that changes things.

In this case, you can't have a 2 MiB PML4, PDPT or page directory and they have to be 4 KiB. A process would have up to 256 PDPTs; and by pretending that these PDPTs are page tables you can map the paging structures into kernel space (which will cost zero extra bytes because the PDPTs that were allocated had to be allocated anyway). For this mapping, the CPU will think the (up to 256) PDPTs are page tables and also think that the (up to 131072) PDs are normal pages; and the mapping itself will cost a total of 512 MiB of space and cost 0 bytes of RAM.

If a process is limited to 512 GiB of space (no more than one PDPT per process); then you'd still pretend that this PDPT is a page table to create the mapping; the CPU will think the PDPT is a page table and also think that the (up to 512) PDs are normal 4 KiB pages; and the mapping itself will cost a total of 2 MiB of space and still cost 0 bytes of RAM.

If a process is limited to 512 GiB of space and you pre-allocate all page directories that it could possibly want even though it's unlikely to want all of them; then the paging structures for each process will cost 2 MiB of RAM, but this has nothing to do with mapping the paging structures into the virtual address space. Basically, you'd consume 2 MiB of RAM for the process for no reason; and then the mapping will still cost 0 bytes of RAM.

In all of these cases the mapping costs 0 bytes of RAM; regardless of whether all the mappings (for all processes) are in kernel space or if each mapping is in its own process' user space.
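The sizes quoted for this "PDPT posing as a page table" window follow from simple counting. The sketch below assumes 512-entry tables and that user space is the lower half of the PML4 (256 slots); the names are illustrative only.

```python
KIB, MIB = 2**10, 2**20

ENTRIES = 512           # entries per paging structure
USER_PML4_SLOTS = 256   # lower half of the PML4, one PDPT per slot

def alias_window(pdpt_count):
    """Return (max page directories, window bytes) when PDPTs are used as page tables."""
    page_directories = pdpt_count * ENTRIES       # each PDPT references up to 512 PDs
    window_bytes = pdpt_count * ENTRIES * 4 * KIB # each PD then appears as one 4 KiB page
    return page_directories, window_bytes

full_pds, full_window = alias_window(USER_PML4_SLOTS)
one_pds, one_window = alias_window(1)

print(full_pds, full_window // MIB)   # PDs and MiB of window for a full user space
print(one_pds, one_window // MIB)     # PDs and MiB of window for a 512 GiB process
```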

Cheers,

Brendan

Luis · Post by **Luis** » Mon Apr 29, 2013 6:57 am

Luis wrote:Hi,

linguofreak wrote:He's using 2 MiB pages exclusively, so he actually gets 512 GiB. This is also why he's so worried about memory usage with multiple processes running: Just to set up a PML4 table, he has to reserve 2 MiB of physical memory for that process instead of just 4 KiB. (Of course, after that, the process can use 512 GiB before it needs to allocate anything more for paging structures).

Then as soon as any userspace code or data is loaded the process eats another 2 MiB.
Exactly.

Oops, sorry I misunderstood. After 512 4 KiB tables are stored in that 2 MiB another 2 MiB page would be allocated so the limit is the architecture limit.

@Brendan
Sorry, my brain must be fried today. I still cannot see how I can store a PML4 or a PDPT or a PD without it costing me 4 KiB somewhere.

Brendan · Post by **Brendan** » Mon Apr 29, 2013 7:22 am

Hi,

Luis wrote:Sorry, my brain must be fried today. I still cannot see how I can store a PML4 or a PDPT or a PD without it costing me 4 KiB somewhere.

Imagine if you've got one physical page full of zeros. You could use that physical page full of zeros as:

a virtual page; creating a 4 KiB area of the virtual address space full of zeros
a page table; creating a 2 MiB "not present" area of the virtual address space
a page directory; creating a 1 GiB "not present" area of the virtual address space
a page directory pointer table; creating a 512 GiB "not present" area of the virtual address space

..and, you could use that one physical page for all of these things at the same time.

Now, what if that physical page wasn't full of zeros, but was full of valid page directory pointer table entries instead? You could use that physical page full of page directory pointer table entries as:

a virtual page; creating a 4 KiB area full of page directory pointer table entries
a page table; creating a 2 MiB area full of page directory entries
a page directory; creating a 1 GiB area of page table entries
a page directory pointer table; creating a 512 GiB area of normal data

..and, you could use that one physical page for all of these things at the same time.
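A toy page walk can make the "same physical page at several levels" idea concrete. The sketch below is a simplified model and not real MMU code: it only handles 4 KiB translations, ignores every flag except "present", and uses made-up physical frame addresses. A process PDPT is referenced once in its normal role and once from a kernel page directory as if it were a page table, so the process's page directory shows up as an ordinary page in kernel space.

```python
ENTRIES = 512
PRESENT = 0x1
ADDR_MASK = 0x000F_FFFF_FFFF_F000   # physical frame bits of an entry

def new_table():
    return [0] * ENTRIES

def translate(mem, pml4_phys, vaddr):
    """Walk PML4 -> PDPT -> PD -> PT as the CPU would for a 4 KiB page.

    mem maps the physical address of each table frame to its 512 entries.
    Returns the physical address, or None if an entry is not present.
    """
    frame = pml4_phys
    for shift in (39, 30, 21, 12):
        entry = mem[frame][(vaddr >> shift) & 0x1FF]
        if not entry & PRESENT:
            return None
        frame = entry & ADDR_MASK
    return frame + (vaddr & 0xFFF)

# Hypothetical physical frames; each of them has to exist anyway.
PML4, USER_PDPT, USER_PD, KERN_PDPT, KERN_PD = 0x1000, 0x2000, 0x3000, 0x4000, 0x5000
mem = {f: new_table() for f in (PML4, USER_PDPT, USER_PD, KERN_PDPT, KERN_PD)}

mem[PML4][0] = USER_PDPT | PRESENT       # normal use: PDPT for the first 512 GiB
mem[USER_PDPT][0] = USER_PD | PRESENT    # normal use: PD for the first 1 GiB
mem[PML4][256] = KERN_PDPT | PRESENT     # first slot of kernel space
mem[KERN_PDPT][0] = KERN_PD | PRESENT
mem[KERN_PD][0] = USER_PDPT | PRESENT    # same PDPT frame, now posing as a page table

WINDOW = 0xFFFF_8000_0000_0000           # virtual address selected by the slots above
print(hex(translate(mem, PML4, WINDOW))) # physical frame visible at the window start
```

In this model the window's first 4 KiB resolves to the frame holding the process's page directory, and no frame was allocated purely for the mapping; the one kernel page directory entry is the only thing that had to be written.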

Cheers,

Brendan

Luis · Post by **Luis** » Mon Apr 29, 2013 10:05 am

Hi,

Now I see what you mean: pointing a table to itself. After I realized that, I did some digging and found an article (Page Tables) in the wiki about recursive paging. I have read all the wiki articles, the Intel manual and dozens of other documents, but for some reason I missed that article, which is very relevant to me. Anyway, thank you very much, Brendan. If you hadn't brought recursive paging up, I might never have found it until I was way too deep in trouble. Besides, your posts help a lot in understanding the wiki article by showing the big picture of the mechanics of recursive paging.

Thank you all,
Luís

Luis · Post by **Luis** » Tue Apr 30, 2013 12:26 pm

Hi,

I've been thinking about all that but, despite saving me some 4 KiB blocks of memory, I still have the same problem using recursive paging. OK, like Brendan said, I don't need to use more than 1 table for a complete structure, but I still need the 4 KiB for that table. For example, I have that multi-purpose table plus 2 more page tables because the process is using 3 GiB of memory. Those 3 tables must exist, and there is a virtual memory use of 512 GiB (which doesn't really matter much to me) and 16 KiB of physical memory. As I use 2 MiB pages exclusively, keeping those 3 pages in physical memory means 2032 KiB of padding. In conclusion, whether I use recursive paging or not, the problem remains. Brendan, I think I didn't explain myself clearly, so you misunderstood the problem. I believe it's better that I forget about recursive paging and keep the tables from all processes together in physical memory to avoid padding.

Luís

Brendan · Post by **Brendan** » Tue Apr 30, 2013 1:40 pm

Hi,

Luis wrote:For example, I have that multi-purpose table plus 2 more page tables because the process is using 3 GiB of memory. Those 3 tables must exist, and there is a virtual memory use of 512 GiB (which doesn't really matter much to me) and 16 KiB of physical memory. As I use 2 MiB pages exclusively, keeping those 3 pages in physical memory means 2032 KiB of padding.

Um. You'd allocate 2048 KiB for the process' PML4 and waste 2044 KiB, then allocate 2048 KiB for one PDPT and waste 2044 KiB, then allocate 2048 KiB for one PD and waste 2044 KiB.

The only way to avoid this would be to allow a 2 MiB page to be split up into multiple 4 KiB pages. Then you could (e.g.) allocate a 2 MiB page, then split it up and use 4 KiB for the process' PML4, another 4 KiB for PDPT, etc. Later on the process might need another page directory, and you'd search for a free 4 KiB in one of those already allocated 2 MiB pages; or you might free a page directory and give the 4 KiB back.

However; if you're going to split 2 MiB pages into 4 KiB pages anyway (which will force you to keep track of which 4 KiB pages are being used and which aren't); then there's no sane reason not to make it part of your physical memory manager. That way one process might start and use 3 pages from a 2 MiB page, and another process might start and use another 3 pages from the same 2 MiB page that's already been split up.
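One possible shape for such a physical memory manager is sketched below as a toy model. It assumes a hypothetical `alloc_large` callback that returns the base address of a fresh 2 MiB page, and it keeps free 4 KiB frames on a simple list; a real kernel would more likely use bitmaps or free lists stored in the frames themselves, and would also merge fully free 2 MiB pages back together.

```python
import itertools

class SplitFrameAllocator:
    """Hands out 4 KiB frames carved from 2 MiB pages (illustrative model only)."""
    LARGE = 2 * 1024 * 1024
    SMALL = 4 * 1024
    PER_LARGE = LARGE // SMALL   # 512 small frames per large page

    def __init__(self, alloc_large):
        self._alloc_large = alloc_large   # callable returning a 2 MiB page base
        self._free = []                   # addresses of free 4 KiB frames
        self.large_pages = 0              # how many 2 MiB pages have been split

    def alloc(self):
        if not self._free:
            base = self._alloc_large()
            self.large_pages += 1
            # Push in reverse so frames are handed out lowest address first.
            self._free.extend(base + i * self.SMALL
                              for i in reversed(range(self.PER_LARGE)))
        return self._free.pop()

    def free(self, frame):
        self._free.append(frame)

# Two processes each take three frames (e.g. PML4, PDPT, PD) from the same 2 MiB page.
bases = itertools.count(start=0x200000, step=SplitFrameAllocator.LARGE)
pmm = SplitFrameAllocator(lambda: next(bases))
proc1 = [pmm.alloc() for _ in range(3)]
proc2 = [pmm.alloc() for _ in range(3)]
print(pmm.large_pages)                    # 2 MiB pages consumed so far
```

Because the allocator is global rather than per process, the sharing of one 2 MiB page between processes falls out naturally, without any process needing to know about the others.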

Of course once your physical memory manager is able to allocate/free 4 KiB pages and you're using it for paging structures and nothing else; it would be natural to also use it for other things. For example, to avoid having a massive number of TLB misses (caused by trying to use a tiny number of "large page TLB entries" for everything), you might decide to improve performance a lot by using 4 KiB pages for kernel data too.

Cheers,

Brendan

Luis · Post by **Luis** » Tue Apr 30, 2013 2:12 pm

Hi,

Brendan wrote:Um. You'd allocate 2048 KiB for the process' PML4 and waste 2044 KiB, then allocate 2048 KiB for one PDPT and waste 2044 KiB, then allocate 2048 KiB for one PD and waste 2044 KiB.

No, all those would be in the same 2 MiB page wasting 2032 KiB.

Brendan wrote:The only way to avoid this would be to allow a 2 MiB page to be split up into multiple 4 KiB pages. Then you could (e.g.) allocate a 2 MiB page, then split it up and use 4 KiB for the process' PML4, another 4 KiB for PDPT, etc. Later on the process might need another page directory, and you'd search for a free 4 KiB in one of those already allocated 2 MiB pages; or you might free a page directory and give the 4 KiB back.

However; if you're going to split 2 MiB pages into 4 KiB pages anyway (which will force you to keep track of which 4 KiB pages are being used and which aren't); then there's no sane reason not to make it part of your physical memory manager. That way one process might start and use 3 pages from a 2 MiB page, and another process might start and use another 3 pages from the same 2 MiB page that's already been split up.

Yes, my idea was to use some sort of 4 KiB pages inside the 2 MiB page, but without using pages. I would just create the tables as needed in the allocated 2 MiB, like creating arrays in a heap. I take a 2 MiB page and I can add PML4, PDPTs and PDs freely as needed. So I came up with the idea: why not take the other processes' tables and put them inside this same 2 MiB as well? That way I get rid of the need for insane padding, but I'm afraid that sharing the space to store tables from different processes might be incompatible with something or cause some sort of trouble.

Luís
