how many pagetables?

Seven11
Member
Posts: 25
Joined: Mon Oct 30, 2006 12:48 pm

Post by Seven11 »

I have a design question... I'm not sure whether to start supporting more than one page directory. For instance: one for every user process (not threads) and one for the kernel. Up to now I have always used one page directory for the whole system. How are you handling this problem?

I've read somewhere that software task switching is faster only if CR3 isn't touched. Does this hold true even for today's CPUs like the P4, Core Duo/2, PM, AMD64 and so on? Does anybody know?
Otter
Member
Posts: 75
Joined: Sun Dec 31, 2006 11:56 am
Location: Germany

Post by Otter »

If you want to separate user space and kernel space, you need one page directory per process and one for your kernel. If all your processes use the same page directory, they can see each other in memory, and one process can modify the data of another process.

You should start with one page directory. Your memory manager should be able to reserve a new physical page and to give access to that page, so that you can easily create a new page directory at runtime.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
Seven11 wrote: I have a design question... I'm not sure whether to start supporting more than one page directory. For instance: one for every user process (not threads) and one for the kernel. Up to now I have always used one page directory for the whole system. How are you handling this problem?
I use one page directory per thread (but one page directory per process is more common), with kernel space mapped into each page directory. You don't need a special "kernel page directory" unless the kernel itself is considered a separate process (i.e. has "things" that are scheduled).
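To make "kernel space mapped into each page directory" concrete, here is a minimal sketch for 32-bit paging without PAE. It assumes the kernel occupies the top 1 GiB (from 0xC0000000, i.e. page directory entries 768 to 1023); that split, the `page_directory_t` type and the `pd_init_for_process` name are illustrative assumptions, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PD_ENTRIES      1024u  /* entries in a 32-bit (non-PAE) page directory */
#define KERNEL_PDE_BASE 768u   /* assumption: kernel space starts at 0xC0000000 */

typedef struct {
    uint32_t entry[PD_ENTRIES];
} page_directory_t;

/* Fill a fresh page directory: empty user part, shared kernel part.
   Hypothetical helper; both pointers must already be accessible
   (mapped) in the kernel. */
static void pd_init_for_process(page_directory_t *new_pd,
                                const page_directory_t *kernel_template)
{
    assert(new_pd != NULL && kernel_template != NULL);

    /* User space starts empty: no entry is marked present. */
    memset(new_pd->entry, 0, KERNEL_PDE_BASE * sizeof(uint32_t));

    /* Kernel entries are copied, so every address space points at the
       same kernel page tables. */
    memcpy(&new_pd->entry[KERNEL_PDE_BASE],
           &kernel_template->entry[KERNEL_PDE_BASE],
           (PD_ENTRIES - KERNEL_PDE_BASE) * sizeof(uint32_t));
}
```

Because only the directory entries are copied, the kernel page tables themselves are shared. A kernel mapping added later only shows up everywhere if it goes into a page table that already existed when the copies were made; one common way to guarantee that is to allocate all kernel page tables up front.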
Seven11 wrote: I've read somewhere that software task switching is faster only if CR3 isn't touched. Does this hold true even for today's CPUs like the P4, Core Duo/2, PM, AMD64 and so on? Does anybody know?
Software task switching uses small/optimized instructions, while hardware task switching uses a big lump of microcode that no-one bothers optimizing because no OS uses it, so I wouldn't be surprised if software task switching was a little faster when the same amount of CPU state is saved/restored (just like the ENTER, LEAVE and LOOP instructions are slower compared to a sequence of smaller instructions, because they use microcode too).

In general, for software task switching you save less CPU state - often some general registers (and EIP, EFLAGS, etc) are already on the stack, and for most OS designs saving and restoring the segment registers (including the slow protection checks involved) can be skipped because they are always the same when running kernel code. This is mainly what makes software task switching faster.

Changing CR3 is slow because it invalidates the TLBs. It doesn't matter if you use hardware task switching or software task switching to change CR3 - the cost of invalidating the TLBs won't change. IMHO "task switching is faster only if the CR3 isn't touched" was never true for any 80x86 CPU....
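Since the cost lies in the CR3 write itself, a software switch can simply skip that write when the next task uses the same page directory (for example, two threads of one process in a one-directory-per-process design). The sketch below simulates this in plain C so it can be run anywhere: `current_cr3` and `load_cr3` stand in for the real register and a `mov` to CR3, and `task_t` is a hypothetical structure.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t cr3;  /* physical address of the task's page directory */
    uint32_t esp;  /* saved kernel stack pointer (unused in this sketch) */
} task_t;

static uint32_t current_cr3;  /* stand-in for the real CR3 register */
static unsigned cr3_reloads;  /* counts simulated TLB invalidations */

/* Stand-in for writing CR3; a real kernel would use inline assembly. */
static void load_cr3(uint32_t phys)
{
    assert((phys & 0xFFFu) == 0);  /* page directories are 4 KiB aligned */
    current_cr3 = phys;
    cr3_reloads++;
}

/* Reload CR3 only when the address space really changes. */
static void switch_address_space(const task_t *next)
{
    if (next->cr3 != current_cr3)
        load_cr3(next->cr3);
}
```

Saving and restoring registers would happen around this call; that part is the same whether or not CR3 changes.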


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Mikae
Member
Posts: 94
Joined: Sun Jul 30, 2006 1:08 pm

Post by Mikae »

Seven11 wrote: I've read somewhere that software task switching is faster only if CR3 isn't touched. Does this hold true even for today's CPUs like the P4, Core Duo/2, PM, AMD64 and so on? Does anybody know?
The P6 family (starting with the Pentium Pro) lets you avoid a full TLB invalidation by setting the 'G' (global) flag in a page table entry, once global pages are enabled through the PGE bit in CR4. Entries for pages that have this flag set are not invalidated when CR3 is changed. You can mark your kernel's pages as global, which will increase speed a little. The Intel manual also recommends using a 4 MB page size for the kernel and a 4 KB page size for user mode, which also increases execution speed because different TLBs are used for the two page sizes.
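As a rough illustration, the flag bits involved look like this for 32-bit paging. The bit positions are the architectural ones (P = bit 0, R/W = bit 1, U/S = bit 2, PS = bit 7, G = bit 8; CR4.PSE = bit 4, CR4.PGE = bit 7), while the macro and function names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_PRESENT  0x001u
#define PAGE_WRITABLE 0x002u
#define PAGE_USER     0x004u
#define PAGE_SIZE_4M  0x080u  /* PS bit, valid in a page directory entry */
#define PAGE_GLOBAL   0x100u  /* G bit: entry survives a CR3 reload */

#define CR4_PSE 0x010u  /* enables 4 MiB pages */
#define CR4_PGE 0x080u  /* enables global pages */

/* Page directory entry for one global 4 MiB kernel page. */
static uint32_t make_kernel_pde_4m(uint32_t phys)
{
    assert((phys & 0x3FFFFFu) == 0);  /* must be 4 MiB aligned */
    return phys | PAGE_PRESENT | PAGE_WRITABLE | PAGE_SIZE_4M | PAGE_GLOBAL;
}

/* Page table entry for an ordinary, non-global 4 KiB user page. */
static uint32_t make_user_pte_4k(uint32_t phys)
{
    assert((phys & 0xFFFu) == 0);  /* must be 4 KiB aligned */
    return phys | PAGE_PRESENT | PAGE_WRITABLE | PAGE_USER;
}

/* CR4 value with both features switched on; writing it back to the
   register needs inline assembly, which is left out here. */
static uint32_t cr4_enable_paging_features(uint32_t cr4)
{
    return cr4 | CR4_PSE | CR4_PGE;
}
```

Both features should be detected through CPUID before the CR4 bits are set, and global entries only make sense for mappings that are identical in every address space, which is exactly the case for kernel space mapped into every page directory.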
Seven11
Member
Posts: 25
Joined: Mon Oct 30, 2006 12:48 pm

Post by Seven11 »

Ah, alright... well, I guess I'll implement support for many page tables then, and 4 MiB pages as well.


By the way, does your page table allocation function return a physical or a virtual page? In other words, after finding an available page, do you map it into the page directory?
So far mapping and allocating have been two functions, but since I can't think of a situation where you might need a non-virtual page, I'm thinking about merging them together to make it faster. What do you think about that?
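One case where a bare physical page is needed: CR3 and page directory entries hold physical addresses, so the frame for a new page directory or page table has to be known by its physical address. A possible compromise, sketched below with a tiny simulated frame pool, is to keep the two functions and add a convenience wrapper on top. All names, the pool size and the single simulated page table are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_SIZE  4096u
#define FRAME_COUNT 64u          /* assumption: tiny pool for illustration */
#define NO_FRAME    0xFFFFFFFFu  /* returned when the pool is exhausted */

static uint8_t  frame_used[FRAME_COUNT];
static uint32_t page_table[1024];  /* one simulated page table */

/* Step 1: hand out a free physical frame, or NO_FRAME if none is left. */
static uint32_t alloc_frame(void)
{
    for (uint32_t i = 0; i < FRAME_COUNT; i++) {
        if (!frame_used[i]) {
            frame_used[i] = 1;
            return i * FRAME_SIZE;
        }
    }
    return NO_FRAME;
}

/* Step 2: make a frame visible at a virtual address (first 4 MiB only,
   since this sketch has a single page table). */
static void map_page(uint32_t vaddr, uint32_t phys, uint32_t flags)
{
    assert((vaddr & 0xFFFu) == 0 && (phys & 0xFFFu) == 0);
    page_table[(vaddr >> 12) & 0x3FFu] = phys | flags | 0x1u;  /* bit 0 = present */
}

/* Convenience wrapper for the common "allocate and map" case. */
static int alloc_and_map(uint32_t vaddr, uint32_t flags)
{
    uint32_t phys = alloc_frame();
    if (phys == NO_FRAME)
        return -1;
    map_page(vaddr, phys, flags);
    return 0;
}
```

With this split, `alloc_frame()` alone serves page directories and page tables, while `alloc_and_map()` covers ordinary kernel and user pages.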
Otter
Member
Posts: 75
Joined: Sun Dec 31, 2006 11:56 am
Location: Germany

Post by Otter »

This depends on your system. If you want to optimize for speed, you could keep all needed page tables mapped in kernel space. That means you need more kernel space (which is mapped into every user address space as well, so less virtual memory is available to your user space processes). Alternatively, you could map them temporarily whenever changes have to be made. This is a bit slower, but you don't need as much virtual memory. It's up to you to decide what you want for your system.
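One well-known way to keep every page table of the current address space mapped is the recursive (self-referencing) page directory trick: the last page directory entry points at the page directory itself. This is not necessarily what is meant above; it is shown only as an example of the "always mapped" option, and it costs 4 MiB of virtual address space. The sketch computes where the tables then appear, assuming 32-bit paging without PAE and slot 1023 as the recursive entry; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RECURSIVE_SLOT 1023u                   /* assumption: last PDE points at the PD */
#define PT_WINDOW_BASE (RECURSIVE_SLOT << 22)  /* 0xFFC00000 */

/* Virtual address at which the page table for directory slot pd_index
   becomes visible. */
static uint32_t pt_virtual_address(uint32_t pd_index)
{
    assert(pd_index < 1024u);
    return PT_WINDOW_BASE + (pd_index << 12);
}

/* Virtual address of the page table entry that maps vaddr. */
static uint32_t pte_virtual_address(uint32_t vaddr)
{
    return PT_WINDOW_BASE + (vaddr >> 12) * 4u;
}

/* Virtual address of the page directory entry that covers vaddr. */
static uint32_t pde_virtual_address(uint32_t vaddr)
{
    return pt_virtual_address(RECURSIVE_SLOT) + (vaddr >> 22) * 4u;
}
```

The temporary-mapping alternative needs only a single reserved virtual page instead of this 4 MiB window, at the price of an extra map and TLB invalidation for every page table that is edited.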