OSDev.org

Posted: **Fri Dec 03, 2010 10:55 am**

I know this has been mentioned many times before, but I've never seen any real conclusion to it. There is plenty of information and many books on operating system design, but I've never seen any of them really touch this subject, probably because it's more implementation-oriented than theoretical and very architecture dependent.

You're going to have to access the page table of another process sooner or later. Let's say you're low on pages and have to evict some to disk, and you get a few candidate pages that might belong to several processes. You have to mark these pages as not present in the page table, meaning you have to modify it even if that process isn't running.

Usually you want to map the user page table in the user address space since it is quite large. Having the page tables in kernel virtual space would quickly eat up that space and also greatly limit the number of processes you can have.

I've seen and thought of a few options.

1. You have a "physically addressed kernel", meaning all kernel pages are also mapped at a virtual address equal to their physical address plus a fixed offset. So if some page table is allocated at 0x10000, it's mapped at 0xc0010000 if you have a high-memory kernel. This has the restriction that kernel pages have to be allocated from the lower end of physical memory. Now you can parse the page table by "reading physical memory": you just add the offset to the physical address given in the page table.
2. Copy the page table into a temporary area in the kernel. This means that you have to copy the page table entries to the kernel page table. The number of entries that need to be copied is quite large: with 32-bit there are 512 of them if you have a 2GB/2GB split, with 64-bit many more. On a 32-bit system, if your page table is 4MB aligned, you can perhaps get away with copying only the page directory, which is one or two entries. You also have to get hold of those entries, and the question is how to do that. Temporarily map the page table that is mapping the page table?
3. Parse the page table in physical mode. Switch off the MMU and have a portion of the kernel that runs in physical memory. Now you can parse the page table quite easily. The problem with this is that some architectures also switch off the cache, which will greatly slow down the code.
4. Temporarily switch the entire address space. You basically install the other process's page table (write to CR3 for example, or copy part of the global directory) and do the operations from there. As soon as you do this you will flush the TLB, or confuse it because your ASID will be wrong. This will work since you're in kernel space, but what happens if you get an interrupt or the process is preempted during that time? You don't really want to lock out interrupts for that long.
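
To make option 1 concrete, here is a minimal sketch of marking a page not present in another address space by walking its page table through an offset mapping. It assumes plain 32-bit x86 paging with 4 KB pages. The names `phys_to_virt`, `mark_not_present` and the `sim_ram` array are hypothetical; `sim_ram` only stands in for physical memory so the code can run in user space, whereas a real kernel would return `paddr + 0xC0000000` instead.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE   4096u
#define PTE_PRESENT 0x1u
#define ADDR_MASK   0xFFFFF000u

/* Stand-in for RAM so the sketch runs in user space (assumption).
 * In a kernel, phys_to_virt() would be (void *)(paddr + 0xC0000000). */
#define SIM_PAGES 16u
static uint32_t sim_ram[SIM_PAGES * PAGE_SIZE / sizeof(uint32_t)];

static uint32_t *phys_to_virt(uint32_t paddr)
{
    assert(paddr < SIM_PAGES * PAGE_SIZE);
    return &sim_ram[paddr / sizeof(uint32_t)];
}

/* Clear the present bit for vaddr in the address space whose page
 * directory lives at physical address pd_phys.
 * Returns 0 on success, -1 if the page was not mapped. */
static int mark_not_present(uint32_t pd_phys, uint32_t vaddr)
{
    uint32_t *pd = phys_to_virt(pd_phys);
    uint32_t pde = pd[vaddr >> 22];            /* top 10 bits: directory index */
    if (!(pde & PTE_PRESENT))
        return -1;

    uint32_t *pt = phys_to_virt(pde & ADDR_MASK);
    uint32_t *pte = &pt[(vaddr >> 12) & 0x3FFu]; /* next 10 bits: table index */
    if (!(*pte & PTE_PRESENT))
        return -1;

    *pte &= ~PTE_PRESENT;
    /* On real hardware any stale TLB entry for this page would also
     * have to be invalidated; that part is not modelled here. */
    return 0;
}
```

The walk never needs the target process to be the current one; it only needs the physical address of its page directory, which is the appeal of this option.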

These are a few options, and they all seem to do something that is either awfully slow, hairy to implement, or wastes lots of virtual space. This is one of those times I would love to have restricted load and store physical instructions that bypass the MMU. That would solve this altogether.

This is really basic stuff but I can't really get my head around this.

Posted: **Fri Dec 03, 2010 2:12 pm**

It is quite simple, and requires no copying. You just reserve a single page directory entry. Then you put the CR3 value into this entry (and possibly flush the TLB). Then you can access all of the process's page tables in a 4MB memory area (32-bit). However, it is necessary to first check that the process's page directory entry is present.

This is just an extension to the usual mapping needed in order to be able to access the page tables in the current process.
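
The address arithmetic behind this can be sketched as follows, assuming plain 32-bit x86 paging. `SELF_SLOT` and `FOREIGN_SLOT` are hypothetical slot numbers, and the `pde_vaddr` formula additionally assumes that every page directory maps itself at `SELF_SLOT`, so the other process's directory shows up inside the window as well.

```c
#include <assert.h>
#include <stdint.h>

#define SELF_SLOT    1023u /* assumed: every page directory maps itself here */
#define FOREIGN_SLOT 1022u /* assumed: reserved PDE that receives another CR3 */

/* Each page directory slot covers 4 MB of virtual space. */
static uint32_t window_base(uint32_t slot)
{
    assert(slot < 1024u);
    return slot << 22;
}

/* Virtual address of the PTE that maps vaddr, seen through the window
 * at the given slot. Touching it faults if the matching PDE is absent. */
static uint32_t pte_vaddr(uint32_t slot, uint32_t vaddr)
{
    return window_base(slot) + (vaddr >> 12) * 4u;
}

/* Virtual address of the PDE that covers vaddr, seen through the same
 * window. Relies on the directory's own entry at SELF_SLOT. */
static uint32_t pde_vaddr(uint32_t slot, uint32_t vaddr)
{
    return window_base(slot) + SELF_SLOT * 4096u + (vaddr >> 22) * 4u;
}
```

With `slot == SELF_SLOT` these give the familiar addresses for the current address space; with `slot == FOREIGN_SLOT` they address the process whose CR3 value was written into the reserved entry. Reading `pde_vaddr` first and testing the present bit is what avoids a page fault on `pte_vaddr`.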

Posted: **Fri Dec 03, 2010 2:38 pm**

rdos wrote: It is quite simple, and requires no copying. You just reserve a single page directory entry. Then you put the CR3 value into this entry (and possibly flush the TLB). Then you can access all of the process's page tables in a 4MB memory area (32-bit). However, it is necessary to first check that the process's page directory entry is present.

This is just an extension to the usual mapping needed in order to be able to access the page tables in the current process.

Does this method require that the architecture supports recursive page tables (also known as the "page table points to itself" trick)? I'm currently using ARM as a development platform, which doesn't seem to have this property.

Posted: **Fri Dec 03, 2010 2:46 pm**

Yes, that method is the recursive mapping technique, so I suppose you can't use it (assuming you can't - I don't know ARM well). However, the idea still makes sense. There should be some way to have a section of virtual memory with all of the page mappings in it. All you have to do is store a normal page directory and have it mirrored in the proper format for a page table (using x86 terminology), then use that page table, and make sure they stay in sync. If you need to modify the mappings for another process, just use that page table. This is a little messier and causes a duplication of data, but it has the same benefits as recursive mapping, i.e. it takes little virtual memory and has nice addressing properties.

Posted: **Fri Dec 03, 2010 2:55 pm**

Hi,

I normally do the "self reference" trick (where the page directory/ies, page tables, etc. for a specific address space are always mapped into that address space), so it's always easy to modify the current address space.

I map the highest level of the paging structures for all address spaces into kernel space, so I can modify the second-highest level of the paging structures easily. For example (for "plain 32-bit paging") I'd have an array of page directory mappings (one page directory per entry) so I can modify page directory entries in any/all address spaces easily. Modifying a page table entry in one address space (via the "self reference" trick) will affect all address spaces that use that same page table. For user-space it's mostly the same - e.g. (for "plain 32-bit paging") all threads in the same process share the same page tables.

For anything more than that (e.g. deciding which pages to send to swap space) I'd do the work while I'm in that address space; even if this means modifying the task's EIP (in its "task control block") so it points to a special piece of code in the kernel and then doing a full task switch to make the task execute that special piece of code in the kernel (to prevent IRQ handlers from seeing inconsistent state).
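
A rough sketch of the "array of page directory mappings" idea, assuming plain 32-bit paging and a 3 GB/1 GB split. All names here (`pd_window`, `space_in_use`, `set_kernel_pde_everywhere`, `read_pde`) are hypothetical, and the static array only stands in for a kernel-space region where each slot would be one mapped 4 KB page directory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SPACES       8u    /* hypothetical limit on address spaces */
#define PD_ENTRIES       1024u /* plain 32-bit paging: 1024 PDEs per directory */
#define FIRST_KERNEL_PDE 768u  /* assumed 3 GB/1 GB user/kernel split */

/* Slot i is the page directory of address space i (simulated). */
static uint32_t pd_window[MAX_SPACES][PD_ENTRIES];
static bool space_in_use[MAX_SPACES];

/* Install the same kernel PDE in every live address space, e.g. after
 * allocating a new kernel page table. Returns the number of page
 * directories that were updated. */
static unsigned set_kernel_pde_everywhere(uint32_t index, uint32_t pde)
{
    assert(index >= FIRST_KERNEL_PDE && index < PD_ENTRIES);
    unsigned updated = 0;
    for (uint32_t i = 0; i < MAX_SPACES; i++) {
        if (!space_in_use[i])
            continue;
        pd_window[i][index] = pde;
        updated++;
    }
    return updated;
}

/* Read one PDE of any address space without switching to it. */
static uint32_t read_pde(uint32_t space, uint32_t index)
{
    assert(space < MAX_SPACES && index < PD_ENTRIES);
    return pd_window[space][index];
}
```

This costs one page of kernel virtual space per address space, which is the trade-off that makes page directory entries reachable from anywhere while page table entries still need the owning address space.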

Cheers,

Brendan

Posted: **Fri Dec 03, 2010 2:59 pm**

Hi,

OSwhatever wrote: Does this method require that the architecture supports recursive page tables (also known as the "page table points to itself" trick)? I'm currently using ARM as a development platform, which doesn't seem to have this property.

If the architecture supports recursive page tables then it makes things much easier. Otherwise you'd need to create the mappings manually (e.g. allocate extra page tables to map page tables, and do extra work to keep it up-to-date).

Cheers,

Brendan

Posted: **Fri Dec 03, 2010 3:13 pm**

NickJohnson wrote:Yes, that method is the recursive mapping technique, so I suppose you can't use it (assuming you can't - I don't know ARM well). However, the idea still makes sense. There should be some way to have a section of virtual memory with all of the page mappings in it. All you have to do is store a normal page directory and have it mirrored in the proper format for a page table (using x86 terminology), then use that page table, and make sure they stay in sync. If you need to modify the mappings for another process, just use that page table. This is a little messier and causes a duplication of data, but it has the same benefits as recursive mapping, i.e. it takes little virtual memory and has nice addressing properties.

ARM is not as nice as x86 when it comes to page table management. You have a 16 KB L1 page table and 4 MB of L2 page tables in virtual space. To do the same on this architecture you have to:

1. If you have the L2 page table 4MB aligned, you just need to copy the L1 entry to the kernel page table.
2. Then you have to copy 4 entries of the L1 page table to the kernel page table. You're probably better off mapping the physical pages directly.
3. But that's not all: you also have to map the 4 KB page table that describes the 4 MB page table into kernel virtual space in the appropriate place, so we can add and remove pages to the user page table itself.
4. Now, the problem is that this requires access to both the L1 and L2 page tables (at least those that describe the page table), which are currently not mapped at all. So you have to temporarily map those as well.

As you can see, it is much more complicated here, unless I've missed something ingenious in the ARM architecture.
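
The sizes quoted above can be checked with a little arithmetic. This sketch assumes the ARMv7 short-descriptor format with 4 KB small pages, where the L1 table is indexed by VA[31:20] and each L2 table by VA[19:12]; the macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ARM_L1_ENTRIES 4096u /* one L1 descriptor per 1 MB of virtual space */
#define ARM_L2_ENTRIES 256u  /* one L2 descriptor per 4 KB small page */
#define ARM_DESC_BYTES 4u

static_assert(ARM_L1_ENTRIES * ARM_DESC_BYTES == 16u * 1024u, "L1 table is 16 KB");
static_assert(ARM_L2_ENTRIES * ARM_DESC_BYTES == 1024u, "each L2 table is 1 KB");

static uint32_t arm_l1_index(uint32_t va) { return va >> 20; }
static uint32_t arm_l2_index(uint32_t va) { return (va >> 12) & 0xFFu; }

/* Bytes of L2 tables needed to cover the full 4 GB address space. */
static uint32_t arm_l2_total_bytes(void)
{
    return ARM_L1_ENTRIES * ARM_L2_ENTRIES * ARM_DESC_BYTES;
}

/* Number of 1 MB L1 slots needed for a window holding every L2 table. */
static uint32_t arm_l1_slots_for_l2_window(void)
{
    return arm_l2_total_bytes() >> 20;
}
```

Because an L2 table is 1 KB while a page is 4 KB, and an L1 descriptor has a different layout from an L2 descriptor, an L1 table cannot simply double as an L2 table the way an x86 page directory doubles as a page table.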

x86 is easy because the CR3 value is already stored in the process structure in kernel memory.

ARMv7, on the other hand, has two page table base registers, one for user space and one for the kernel, so the question is whether it is faster just to load another process's page table by writing to the user register. However, I think I would have to lock out preemption during that time.

Posted: **Fri Dec 03, 2010 3:22 pm**

Brendan wrote:Hi,

I normally do the "self reference" trick (where the page directory/ies, page tables, etc. for a specific address space are always mapped into that address space), so it's always easy to modify the current address space.

I map the highest level of the paging structures for all address spaces into kernel space, so I can modify the second-highest level of the paging structures easily. For example (for "plain 32-bit paging") I'd have an array of page directory mappings (one page directory per entry) so I can modify page directory entries in any/all address spaces easily. Modifying a page table entry in one address space (via the "self reference" trick) will affect all address spaces that use that same page table. For user-space it's mostly the same - e.g. (for "plain 32-bit paging") all threads in the same process share the same page tables.

For anything more than that (e.g. deciding which pages to send to swap space) I'd do the work while I'm in that address space; even if this means modifying the task's EIP (in its "task control block") so it points to a special piece of code in the kernel and then doing a full task switch to make the task execute that special piece of code in the kernel (to prevent IRQ handlers from seeing inconsistent state).

Cheers,

Brendan

I thought about this, and wasting 16KB of kernel virtual space per process in order to map the L1 page table into kernel space can be acceptable. It will still be messier than x86 with its self-reference trick, but it would be easier than having it all reside in user virtual space. In that case there is no need to temporarily map in parts of the user page table just to obtain the page entries.

Posted: **Fri Dec 03, 2010 3:26 pm**

OSwhatever wrote: The number of entries that need to be copied is quite large: with 32-bit there are 512 of them if you have a 2GB/2GB split, with 64-bit many more.

It's worth pointing out that on a 64-bit machine there is currently room to map the entire physical address space in the kernel at all times. So this problem needn't occur on, for example, x86_64.
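
A minimal sketch of such a permanent mapping of physical memory, assuming x86_64 with 48-bit canonical addresses, where the kernel half starts at 0xFFFF800000000000. The window size of 64 TiB and the names `DIRECT_MAP_BASE`, `DIRECT_MAP_SIZE`, `phys_to_virt64` and `virt_to_phys64` are illustrative choices, not taken from any particular kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DIRECT_MAP_BASE 0xFFFF800000000000ull /* start of the kernel half */
#define DIRECT_MAP_SIZE (1ull << 46)          /* hypothetical 64 TiB window */

/* True if the physical address falls inside the mapped window. */
static bool direct_map_covers(uint64_t paddr)
{
    return paddr < DIRECT_MAP_SIZE;
}

/* Kernel virtual address at which a physical address is visible. */
static uint64_t phys_to_virt64(uint64_t paddr)
{
    assert(direct_map_covers(paddr));
    return DIRECT_MAP_BASE + paddr;
}

/* Inverse translation, valid only for addresses inside the window. */
static uint64_t virt_to_phys64(uint64_t vaddr)
{
    assert(vaddr >= DIRECT_MAP_BASE &&
           vaddr - DIRECT_MAP_BASE < DIRECT_MAP_SIZE);
    return vaddr - DIRECT_MAP_BASE;
}
```

With this in place, any page table of any process can be read or written by translating the physical address found in the parent entry, with no temporary mappings at all.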

Posted: **Fri Dec 03, 2010 3:44 pm**

gerryg400 wrote:
The number of entries that need to be copied is quite large: with 32-bit there are 512 of them if you have a 2GB/2GB split, with 64-bit many more.
It's worth pointing out that on a 64-bit machine there is currently room to map the entire physical address space in the kernel at all times. So this problem needn't occur on, for example, x86_64.

A completely physically mapped kernel, meaning that code and data are addressed by their physical location, will have problems supporting hot-swappable memory and memory compaction for partial self-refresh.

The ability to support partial self-refresh was the main reason I didn't want a physically mapped kernel; otherwise I would probably have been much more open to the idea. The question is whether it will still be necessary if future memories support selectable partial self-refresh in finer steps.

Of course, on a 64-bit system you can map an extra region on the side that simply maps the entire linear physical space. That would have been nice, actually.

How to modify another user process page table
