how many pagetables?
Posted: Mon Feb 05, 2007 9:20 am
by Seven11
I have a design question: I'm not sure whether to start supporting more than one page directory, for instance one for every user process (not per thread) and one for the kernel. Up to now I have always used one page directory for the whole system. How are you handling this problem?
I've read somewhere that software task switching is faster only if CR3 isn't touched. Does this hold true even for today's CPUs like the P4, Core Duo/2, PM, AMD64 and so on? Does anybody know?
Posted: Mon Feb 05, 2007 9:43 am
by Otter
If you want to separate user and kernel space, you need one page directory per process and one for your kernel. If all your processes use the same page directory, they can see each other in memory and one process can modify the data of another.
You should start with one page directory. Your memory manager should be able to reserve a new physical page and give access to it, so you can easily create a new page directory at runtime.
Re: how many pagetables?
Posted: Mon Feb 05, 2007 11:05 am
by Brendan
Hi,
Seven11 wrote:I have a design question: I'm not sure whether to start supporting more than one page directory, for instance one for every user process (not per thread) and one for the kernel. Up to now I have always used one page directory for the whole system. How are you handling this problem?
I use one page directory per thread (but one page directory per process is more common), with kernel space mapped into each page directory. You don't need a special "kernel page directory" unless the kernel itself is considered a separate process (i.e. has "things" that are scheduled).
Seven11 wrote:I've read somewhere that software task switching is faster only if CR3 isn't touched. Does this hold true even for today's CPUs like the P4, Core Duo/2, PM, AMD64 and so on? Does anybody know?
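As a rough illustration of "kernel space mapped into each page directory", here is a minimal sketch for 32-bit non-PAE paging. The names `pd_init_from_kernel` and `KERNEL_PDE_BASE` are hypothetical, and the 3 GiB/1 GiB split (kernel at 0xC0000000 and up) is an assumption, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PD_ENTRIES      1024u  /* entries in a 32-bit (non-PAE) page directory */
#define KERNEL_PDE_BASE 768u   /* assumption: kernel lives at 0xC0000000 and up */

typedef uint32_t pde_t;

/*
 * Prepare a page directory for a new address space (hypothetical helper).
 * The user part starts out empty; the kernel part is copied from an
 * existing directory so every address space shares the same kernel
 * page tables.
 */
static void pd_init_from_kernel(pde_t *new_pd, const pde_t *kernel_pd)
{
    assert(new_pd != NULL && kernel_pd != NULL);

    /* Entries 0..767: no user mappings yet. */
    memset(new_pd, 0, KERNEL_PDE_BASE * sizeof(pde_t));

    /* Entries 768..1023: point at the shared kernel page tables. */
    memcpy(&new_pd[KERNEL_PDE_BASE], &kernel_pd[KERNEL_PDE_BASE],
           (PD_ENTRIES - KERNEL_PDE_BASE) * sizeof(pde_t));
}
```

One consequence of this layout: if the kernel later adds a new page table to its own range, that directory entry has to be propagated to every page directory, unless all kernel page tables are allocated up front.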
Software task switching uses small/optimized instructions, while hardware task switching uses a big lump of microcode that no-one bothers optimizing because no OS uses it, so I wouldn't be surprised if software task switching was a little faster when the same amount of CPU state is saved/restored (just like the ENTER, LEAVE and LOOP instructions are slower compared to a sequence of smaller instructions, because they use microcode too).
In general, for software task switching you save less CPU state - often some general registers (and EIP, EFLAGS, etc) are already on the stack, and for most OS designs saving and restoring the segment registers (including the slow protection checks involved) can be skipped because they are always the same when running kernel code. This is mainly what makes software task switching faster.
Changing CR3 is slow because it invalidates the TLBs. It doesn't matter if you use hardware task switching or software task switching to change CR3 - the cost of invalidating the TLBs won't change. IMHO "task switching is faster only if the CR3 isn't touched" was never true for any 80x86 CPU....
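Since the cost comes from the TLB invalidation rather than from the switching method, one common trick is to skip the CR3 write when the next task uses the same page directory (for example, two threads of one process in a one-directory-per-process design). The sketch below is only an illustration: `switch_address_space` is a hypothetical name, and `write_cr3` is a stand-in that updates a variable so the code runs in user space. In a real kernel it would be a `mov` to CR3.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the real register, so the sketch can run in user space. */
static uint32_t simulated_cr3;
static unsigned cr3_writes;

/* In a kernel this would be inline assembly: mov %0, %%cr3 */
static void write_cr3(uint32_t pd_phys)
{
    simulated_cr3 = pd_phys;
    cr3_writes++;
}

/*
 * Switch to the address space whose page directory is at next_pd_phys.
 * Writing CR3 flushes the non-global TLB entries even if the value is
 * unchanged, so the write is skipped when the directory is the same.
 */
static void switch_address_space(uint32_t next_pd_phys)
{
    assert((next_pd_phys & 0xFFFu) == 0);  /* must be page aligned */

    if (next_pd_phys != simulated_cr3)
        write_cr3(next_pd_phys);
}
```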
Cheers,
Brendan
Posted: Mon Feb 05, 2007 3:37 pm
by Mikae
Seven11 wrote:I've read somewhere that software task switching is faster only if CR3 isn't touched. Does this hold true even for today's CPUs like the P4, Core Duo/2, PM, AMD64 and so on? Does anybody know?
The P6 family (or starting from the PIII?) lets you avoid a full TLB invalidation by setting the 'G' (global) flag in a page table entry. Pages with this flag set are not invalidated when CR3 is changed. You can mark your kernel's pages as global, which will increase speed a little. The Intel manual also recommends using a 4MB page size for the kernel and a 4KB page size for user mode, which also increases execution speed because different TLBs are used.
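For reference, a small sketch of the bits involved in 32-bit non-PAE paging. The bit positions are the architectural ones; the helper names `kernel_pde_4m` and `user_pte_4k` are hypothetical. Note that the G flag only has an effect once CR4.PGE is set, and 4MB pages require CR4.PSE.

```c
#include <assert.h>
#include <stdint.h>

/* Flags shared by page directory and page table entries. */
#define PG_PRESENT  0x001u
#define PG_WRITABLE 0x002u
#define PG_USER     0x004u
#define PG_SIZE_4M  0x080u  /* PS: page directory entry maps a 4MB page */
#define PG_GLOBAL   0x100u  /* G: entry survives a CR3 reload */

/* CR4 bits that must be enabled before the flags above take effect. */
#define CR4_PSE 0x010u  /* bit 4: page size extensions (4MB pages) */
#define CR4_PGE 0x080u  /* bit 7: global pages */

/* Page directory entry for a global, kernel-only 4MB page. */
static uint32_t kernel_pde_4m(uint32_t phys)
{
    assert((phys & 0x3FFFFFu) == 0);  /* must be 4MB aligned */
    return phys | PG_PRESENT | PG_WRITABLE | PG_SIZE_4M | PG_GLOBAL;
}

/* Page table entry for an ordinary (non-global) 4KB user page. */
static uint32_t user_pte_4k(uint32_t phys)
{
    assert((phys & 0xFFFu) == 0);     /* must be 4KB aligned */
    return phys | PG_PRESENT | PG_WRITABLE | PG_USER;
}
```

For example, `kernel_pde_4m(0x00400000)` yields `0x00400183` and `user_pte_4k(0x00123000)` yields `0x00123007`.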
Posted: Tue Feb 06, 2007 5:46 am
by Seven11
Ah, alright... well, I guess I'll implement support for many page tables then, and 4 MiB pages as well.
By the way, does your page table allocation function return a physical or a virtual page? In other words, after finding an available page, do you map it into the page directory?
So far mapping and allocating have been two separate functions, but since I can't think of a situation where you might need a non-virtual page, I'm thinking about merging them to make it faster. What do you think about that?
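One possible middle ground, sketched here under heavy assumptions, is to keep the two primitives and add a combined wrapper on top. Everything below is hypothetical and simulated in user space: a tiny frame pool, a single page table covering the first 4 MiB, and the names `alloc_frame`, `map_page` and `alloc_page`.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE   4096u
#define NUM_FRAMES  64u      /* tiny simulated pool of physical frames */
#define PG_PRESENT  0x001u
#define PG_WRITABLE 0x002u

static uint8_t  frame_used[NUM_FRAMES];
static uint32_t page_table[1024];  /* one simulated table: covers 0..4 MiB */

/* Return the physical address of a free frame, or 0 if none is left.
 * Frame 0 is never handed out, so 0 can signal failure. */
static uint32_t alloc_frame(void)
{
    for (uint32_t i = 1; i < NUM_FRAMES; i++) {
        if (!frame_used[i]) {
            frame_used[i] = 1;
            return i * PAGE_SIZE;
        }
    }
    return 0;
}

/* Map the page-aligned virtual address virt to the frame at phys. */
static void map_page(uint32_t virt, uint32_t phys)
{
    assert((virt & (PAGE_SIZE - 1)) == 0);
    assert((phys & (PAGE_SIZE - 1)) == 0);
    assert(virt < 1024u * PAGE_SIZE);  /* only one table in this sketch */
    page_table[virt / PAGE_SIZE] = phys | PG_PRESENT | PG_WRITABLE;
}

/* Combined path: allocate a frame and map it at virt in one call.
 * Returns 0 on success, -1 if no frame is available. */
static int alloc_page(uint32_t virt)
{
    uint32_t phys = alloc_frame();
    if (phys == 0)
        return -1;
    map_page(virt, phys);
    return 0;
}
```

Keeping `alloc_frame` callable on its own still allows handing out a frame whose physical address is what matters, such as the frame that will hold a new page directory or page table.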
Posted: Tue Feb 06, 2007 8:05 am
by Otter
This depends on your system. If you want to optimize for speed, you could map all needed page tables in kernel space. That means you need more kernel space (which is mapped into user space as well, so you have less virtual memory available for your user space processes). Alternatively, you could map them temporarily when changes have to be made. This is a bit slower, but you don't need as much memory. It's up to you to decide what you want for your system.