On the high quarter philosophy

Posted: Sun Sep 23, 2007 4:35 pm
by Morpheu5
Hello everybody, I'm new to the forum but I've been studying OS dev for some years now. I've been practicing in my spare time on the IA-32 architecture with all the basic stuff such as boot loading, protected mode, interrupts and such, but these days I'm getting more and more interested in serious OS dev, as I'm probably going to work in an OS research team.

Back to the subject. I've been reading a lot about this higher half (or quarter) matter, but the thing I'm really missing is the reason for mapping the kernel at virtual 0xc0000000 or wherever you like. From all my reading, it seems that most people simply load their kernels at, say, 1M and then map 0xc0000000-0xd0000000 (say, 256M) to 0x0-0x10000000. Then they map the user memory from 0x10000000 to the end of the address space.
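For concreteness, the mapping described above boils down to a fixed offset between kernel virtual and physical addresses. Here is a minimal sketch in C, assuming the 0xc0000000 base and the 256M window from this post and plain 32-bit paging with 4 KiB pages; the names KERNEL_VIRT_BASE, KERNEL_WINDOW, phys_to_virt, virt_to_phys, pd_index and pt_index are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: virtual 0xC0000000.. maps to physical 0x0.. for 256M. */
#define KERNEL_VIRT_BASE 0xC0000000u
#define KERNEL_WINDOW    0x10000000u

/* Physical address inside the window -> kernel virtual address. */
uint32_t phys_to_virt(uint32_t phys)
{
    assert(phys < KERNEL_WINDOW);
    return phys + KERNEL_VIRT_BASE;
}

/* Kernel virtual address inside the window -> physical address. */
uint32_t virt_to_phys(uint32_t virt)
{
    assert(virt >= KERNEL_VIRT_BASE);
    assert(virt - KERNEL_VIRT_BASE < KERNEL_WINDOW);
    return virt - KERNEL_VIRT_BASE;
}

/* Page directory index: top 10 bits of the virtual address. */
uint32_t pd_index(uint32_t virt)
{
    return virt >> 22;
}

/* Page table index: the next 10 bits. */
uint32_t pt_index(uint32_t virt)
{
    return (virt >> 12) & 0x3FFu;
}
```

With this layout a kernel loaded at physical 1M is seen at virtual 0xC0100000, and the kernel window starts at page directory entry 768.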

I'm really missing the reason for this. Wouldn't it be easier to map the kernel address space 1:1 from 0x0 to whatever you'd need (wait, a kernel and data and heap that takes up to 256M?) and then map the rest of the space to whatever comes after?

I'm saying this because my toy kernel is doing some useful things with the available memory blocks below 1M (i.e. kernel-space global structures), and the kernel itself is loaded at 1M by GRUB with its data and bss sections just after it. I was thinking that I could just set aside a few megabytes for the heap, map from 0x0 to 0xTheEndOfTheHeap as kernel space and do other things with 0xTheEndOfTheHeap + 1 to 0xTheEndOfTheMemory - 1...

Thanks for any hint on this.

Posted: Sun Sep 23, 2007 4:56 pm
by frank
I believe most people map the kernel to upper addresses because it creates a nice split in the address space, i.e. applications have 0-2GB and the kernel has the rest. Also, mapping it to a higher address in a 64-bit kernel allows 32-bit software to use the whole 4GB address space.

Posted: Sun Sep 23, 2007 5:08 pm
by Morpheu5
frank wrote:Also, mapping it to a higher address in a 64-bit kernel allows 32-bit software to use the whole 4GB address space.
Yes, this is a good point I didn't take into account. Now, speaking of the heap: can I map the kernel so it hangs from the top of the address space (except for any ROM code and other reserved areas) and leave all the lower space for apps? That is, is there any real need for the kernel to have one or two GB of heap space?

Moreover, going to 64bits, I assume it's still safe to keep this layout (the kernel hanging from 4GB) having the lower space for both 32 and 64 bit apps and the higher space (above 4GB) just for 64 bit apps, am I wrong?

EDIT: though with 64 bit I can simply map the kernel above 4GB, of course...

Thanks for the quick reply.

Posted: Sun Sep 23, 2007 5:24 pm
by frank
Well, the kernel and its data will probably never take up the whole address space, but think about it: there are lots of other things you could map into kernel space. Video memory, device drivers, page directories and page tables, memory-mapped devices and so on.

Posted: Mon Sep 24, 2007 2:59 am
by AJ
The main reason I map my kernel to 0xC0000000 is for ease (and speed) of communication between kernel and user programs. The kernel is mapped in to the top of *all* process address spaces. The user app may then run between 0x00000000-0xBFFFFFFF.

If you do not use this method, you have to switch address space twice for every system call (with the TLB getting flushed each time, etc.). This is extremely expensive, especially considering that sbrk(), fopen(), fclose(), fflush() et al. are going to be calling kernel services.

If the kernel is mapped in to each process space, no memory space switching is necessary.
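As a rough illustration of "mapped in to each process space": when a new page directory is created, the kernel's directory entries can simply be copied into it. This is only a sketch, assuming 32-bit non-PAE paging and a kernel starting at 0xC0000000 (directory entry 768); init_process_page_dir, pde_t and the constants are hypothetical names, not code from any particular kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PD_ENTRIES      1024u
#define KERNEL_PD_START 768u    /* 0xC0000000 >> 22, assumed kernel base */

typedef uint32_t pde_t;

/* Hypothetical helper: prepare a fresh page directory for a new process. */
void init_process_page_dir(pde_t *new_pd, const pde_t *kernel_pd)
{
    assert(new_pd != NULL && kernel_pd != NULL);

    /* User part (entries 0..767) starts out empty, i.e. not present. */
    memset(new_pd, 0, KERNEL_PD_START * sizeof(pde_t));

    /* Kernel part (entries 768..1023) is copied, so every process points
       at the same kernel page tables and sees the same kernel mapping. */
    memcpy(&new_pd[KERNEL_PD_START], &kernel_pd[KERNEL_PD_START],
           (PD_ENTRIES - KERNEL_PD_START) * sizeof(pde_t));
}
```

Because only the directory entries are copied, the page tables behind them are shared. A change to a kernel page table is then visible in every process, provided the kernel's page tables were allocated before the first copy was made.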

Cheers,
Adam

Posted: Mon Sep 24, 2007 5:02 am
by Morpheu5
AJ wrote:The kernel is mapped in to the top of *all* process address spaces. The user app may then run between 0x00000000-0xBFFFFFFF.
If I'm getting it right, you're saying that both user apps and the kernel can see into each other's space (though without being able to write to it, of course).
AJ wrote:If the kernel is mapped in to each process space, no memory space switching is necessary.
That is, kernel has its own page dir that says "0xc0000000-0xffffffff are belong to us, 0x0-0xbfffffff are belong to user apps and I can write everything" while the user apps have their own (single?) page dir that says "0x0-0xbfffffff is ours and 0xc0000000-0xffffffff is taboo", am I right?

What about the segments in the GDT, then?
I'd have to launch the kernel with something like Tim Robinson's trick, set the kernel descriptors to span 0x0 to (0x0-1) so it can address the whole space, and then enable paging with the correct mapping. I assume that the user apps must have the exact same global descriptors except for the privilege level (PL). So the result would be:
kernel: physically loaded at 0x100000, virtually mapped from 0xc0000000<->0x0 to (say) 0xcfffffff<->0xfffffff.
user: physically loaded from (say) 0x10000000, virtually mapped from 0x0<->0x10000000 to 0xbfffffff<->(0x0-1)
Am I missing something? I have to recheck the part in which I have to do context switching between the kernel and the user apps.
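For reference, the flat descriptors mentioned above (base 0x0, limit covering the whole 4GB, identical for kernel and user except for the privilege level) could be encoded like this. It is just a sketch following the standard IA-32 segment descriptor layout; gdt_entry and the GDT_* constants are names invented for the example:

```c
#include <assert.h>
#include <stdint.h>

/* Access bytes: present, code/data segment, DPL 0 or 3. */
#define GDT_KERNEL_CODE 0x9Au   /* ring 0, execute/read */
#define GDT_KERNEL_DATA 0x92u   /* ring 0, read/write   */
#define GDT_USER_CODE   0xFAu   /* ring 3, execute/read */
#define GDT_USER_DATA   0xF2u   /* ring 3, read/write   */
#define GDT_FLAGS_4K_32 0xCu    /* 4 KiB granularity, 32-bit segment */

/* Pack base, 20-bit limit, access byte and flag nibble into a descriptor. */
uint64_t gdt_entry(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu);
    assert(flags <= 0xFu);

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);             /* limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base 23:0   */
    d |= (uint64_t)access << 40;                  /* access byte */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit 19:16 */
    d |= (uint64_t)flags << 52;                   /* G, D/B, L, AVL */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56;  /* base 31:24  */
    return d;
}
```

Calling gdt_entry(0, 0xFFFFF, GDT_KERNEL_CODE, GDT_FLAGS_4K_32) and the same with GDT_USER_CODE gives two descriptors that differ only in the two DPL bits, which matches the "same descriptors except for the PL" idea.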

I see that Linux (at least) is using an INT (0x80, I think) to do syscalls. I used to think that this is the only way to go since a PL jump is involved. Is there some other way?

Thanks a lot to everybody.

Posted: Mon Sep 24, 2007 6:10 am
by JamesM
AJ wrote:The main reason I map my kernel to 0xC0000000 is for ease (and speed) of communication between kernel and user programs. The kernel is mapped in to the top of *all* process address spaces. The user app may then run between 0x00000000-0xBFFFFFFF.
I do that, but I use a lower-half kernel. The kernel code/data/bss is loaded where GRUB puts it, and is identity mapped across all address spaces. The kernel and user heaps are up at 0xC0000000 and 0xD0000000 respectively.
Morpheu5 wrote:If I'm getting it right, you're saying that both user apps and kernel can see into each other's space (while not eventually writing, of course).
That is correct.
Morpheu5 wrote:That is, kernel has its own page dir that says "0xc0000000-0xffffffff are belong to us, 0x0-0xbfffffff are belong to user apps and I can write everything" while the user apps have their own (single?) page dir that says "0x0-0xbfffffff is ours and 0xc0000000-0xffffffff is taboo", am I right?
Not quite. Every process has its own page directory. But across ALL process page directories, certain areas are mapped to EXACTLY the same physical addresses. Those are the kernel code/data/bss and kernel heap, memory-mapped devices, etc.
Morpheu5 wrote:Am I missing something? I have to recheck the part in which I have to do context switching between the kernel and the user apps.
The only thing you're missing is what I mentioned above. There is no dedicated kernel page directory, so there is no need for a full context switch when performing a syscall.
Morpheu5 wrote:I see that Linux (at least) is using an INT (0x80, I think) to do syscalls. I used to think that this is the only way to go since a PL jump is involved. Is there some other way?
Yes. As of the Pentium II, you can use the SYSENTER/SYSEXIT instructions to change privilege level. I use these myself, but most people still tend to use INT 0x80 or similar. Also note that the x86_64 architecture has SYSCALL/SYSRET instructions, which are nicer than SYSENTER/SYSEXIT.
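Whichever entry instruction is used, the kernel side usually ends up indexing a table of handlers by syscall number. The following is only a sketch of that idea, not anyone's actual kernel code. It borrows the register convention Linux uses for INT 0x80 on ia-32 (number in EAX, first arguments in EBX, ECX, EDX); struct syscall_regs, syscall_dispatch and the two placeholder handlers are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the user registers saved on kernel entry. */
struct syscall_regs {
    uint32_t eax, ebx, ecx, edx;
};

typedef int32_t (*syscall_fn)(uint32_t, uint32_t, uint32_t);

/* Placeholder handlers, for illustration only. */
static int32_t sys_nop(uint32_t a, uint32_t b, uint32_t c)
{
    (void)a; (void)b; (void)c;
    return 0;
}

static int32_t sys_sum(uint32_t a, uint32_t b, uint32_t c)
{
    (void)c;
    return (int32_t)(a + b);
}

static const syscall_fn syscall_table[] = { sys_nop, sys_sum };
#define NUM_SYSCALLS (sizeof syscall_table / sizeof syscall_table[0])

/* Called from the INT 0x80 (or SYSENTER) entry stub with the saved registers. */
void syscall_dispatch(struct syscall_regs *r)
{
    assert(r != NULL);

    if (r->eax >= NUM_SYSCALLS) {
        r->eax = (uint32_t)-1;   /* unknown syscall number */
        return;
    }
    /* The return value goes back to user mode in EAX. */
    r->eax = (uint32_t)syscall_table[r->eax](r->ebx, r->ecx, r->edx);
}
```

Since the kernel is mapped in every address space, the dispatcher runs on the caller's page directory and can reach both its own tables and the caller's buffers without reloading CR3.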

Posted: Mon Sep 24, 2007 7:04 am
by AJ
Morpheu5 wrote: That is, kernel has its own page dir that says "0xc0000000-0xffffffff are belong to us, 0x0-0xbfffffff are belong to user apps and I can write everything" while the user apps have their own (single?) page dir that says "0x0-0xbfffffff is ours and 0xc0000000-0xffffffff is taboo", am I right?
One important point here is that the 'User/Supervisor' bit in the page directory and page table entries lets you mark kernel pages as supervisor-only, which means that a ring 3 task (user app) cannot read or write kernel space. Of course, ring 0 code (the kernel) can still read and modify user space data. There is therefore no need to switch page directory at all when handling a syscall.
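To make that concrete, here is a small sketch of the check the MMU effectively performs for a ring 3 access under 32-bit paging. The bit positions are the architectural ones (Present is bit 0, Read/Write is bit 1, User/Supervisor is bit 2); the PG_* names and the two helper functions are made up for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PG_PRESENT 0x001u   /* bit 0: entry is valid */
#define PG_WRITE   0x002u   /* bit 1: writable */
#define PG_USER    0x004u   /* bit 2: accessible from ring 3 */

static_assert(PG_USER == (1u << 2), "User/Supervisor is bit 2");

/* Ring 3 may touch a page only if U/S is set in both the PDE and the PTE. */
bool user_can_access(uint32_t pde, uint32_t pte)
{
    return (pde & PG_PRESENT) && (pte & PG_PRESENT) &&
           (pde & PG_USER) && (pte & PG_USER);
}

/* Ring 3 may write only if R/W is also set at both levels. */
bool user_can_write(uint32_t pde, uint32_t pte)
{
    return user_can_access(pde, pte) &&
           (pde & PG_WRITE) && (pte & PG_WRITE);
}
```

Kernel entries would be created with PG_PRESENT | PG_WRITE and without PG_USER, so both helpers return false for them even though the pages are present in the process's own page directory.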

Cheers,
Adam