x86 Segmentation/Paging + OS Plans

Therx · Post by **Therx** » Sat Aug 09, 2003 12:55 pm

OK, my OS needs a complete rewrite once again (the 5th time), and this time I am determined to plan it well enough that I won't have to start over again. The main reason for the rewrite is the lack of support for anything except ring 0 code/data, and no paging. I've had a glimpse at the Intel IA-32 System Programmer's Guide and feel a bit overwhelmed, so here is what I'm planning. Please forgive me if it's half rubbish.

[Deep Breath]

I plan to have these segments in the GDT:-

Ring 0, Code, 0 -> 300000h (3 MB)
Ring 0, Data, 0 -> 300000h (3 MB)
Ring 3, Code, 300000h -> (totalmem - 300000h) / 2
Ring 3, Data, (totalmem - 300000h) / 2 -> totalmem

The idea is that the app then can't execute data, which gives a bit of extra security, and it can't access the 3 MB of kernel space either. Drivers will run as part of the kernel.

From 0 to 3 MB, paging will just map virtual memory directly to the same place in physical memory. In the app area, the page tables will be changed on task switches to make sure that an app can't intentionally or accidentally access another app's space, i.e. the memory of inactive apps won't be mapped.

Any mistakes so far?

Pete

Therx · Post by **Therx** » Sun Aug 10, 2003 5:03 am

Ok I've changed my plan. Segmentation will cause problems with far memory addressing etc. I will just use paging to specify what the current task can access. But I've a few questions:-

1. If I'm using paging to control memory access should my GDT just have a code and data segment covering the whole memory?

2. What ring level should these segments be?

3. How do I change ring level in software multitasking?

4. When a call is made to a hardware driver, how do I make the ring change so that the driver can do I/O? Or should the driver run as a ring 0 task, with the call from the ring 3 app just adding a command to a queue?

5. Is there a problem with this:-
Kernel space up to 3 MB is mapped to the same place in physical memory. Only the page tables for memory over 3 MB change on a task switch.

6. Is this as confusing as I think it is? ;D

Pete

Tim · Post by **Tim** » Sun Aug 10, 2003 5:12 am

Therx wrote: 1. If I'm using paging to control memory access should my GDT just have a code and data segment covering the whole memory?

2. What ring level should these segments be?

You should have 4 segments: ring 0 code and data, and ring 3 code and data. Plus the null descriptor, of course.

3. How do I change ring level in software multitasking?

INT to go to a lower (more privileged) level, IRET to go to a higher (less privileged) level.

4. When a call is made to a hardware driver, how do I make the ring change so that the driver can do I/O? Or should the driver run as a ring 0 task, with the call from the ring 3 app just adding a command to a queue?

It's easier to have drivers running in ring 0, especially if all your driver management and I/O functions are in ring 0 as well. Otherwise, you can give drivers access to I/O ports using the I/O permissions bitmap in the TSS, and you can give them access to adapter memory using paging.
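As a rough illustration of the I/O permission bitmap mentioned above, here is a minimal C sketch of how a kernel might fill one in. The function names and the `IOPB_BYTES` constant are made up for this example. What comes from the architecture is the bit convention: a clear bit permits access to that port, a set bit makes the access fault when the task's CPL is greater than IOPL, and the bitmap is followed by one byte with all bits set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 65536 ports, one bit each, plus the terminating all-ones byte. */
#define IOPB_BYTES (65536 / 8 + 1)

/* Deny every port: a set bit means "fault", a clear bit means "allowed". */
static void iopb_deny_all(uint8_t *bitmap)
{
    memset(bitmap, 0xFF, IOPB_BYTES);
}

/* Clear the bits for ports first..last so less privileged code may use them. */
static void iopb_allow_range(uint8_t *bitmap, uint16_t first, uint16_t last)
{
    assert(first <= last);
    for (uint32_t port = first; port <= last; port++)
        bitmap[port / 8] &= (uint8_t)~(1u << (port % 8));
}

/* Returns 1 if the bitmap permits access to the given port. */
static int iopb_is_allowed(const uint8_t *bitmap, uint16_t port)
{
    return (bitmap[port / 8] & (1u << (port % 8))) == 0;
}
```

A VGA driver running outside ring 0 could, for instance, be given ports 3C0h to 3DFh with `iopb_allow_range(bitmap, 0x3C0, 0x3DF)`, while every other port stays denied.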

5. Is there a problem with this:-
Kernel space up to 3 MB is mapped to the same place in physical memory. Only the page tables for memory over 3 MB change on a task switch.

If I understand, you want to keep the kernel mapped into the same place at all times, and have the rest of the address space change when you switch processes. This is the 'right' way of doing it.
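To make the "kernel always mapped in the same place" idea concrete, here is a small C sketch that fills one page table so that 0 to 3 MB maps to itself with supervisor-only entries. It assumes ordinary 4 KB pages and the non-PAE two-level layout; `map_kernel_identity`, `pde_index` and the constant names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE     4096u
#define PTE_PRESENT   0x1u
#define PTE_WRITABLE  0x2u
#define PTE_USER      0x4u        /* deliberately left clear for kernel pages */
#define KERNEL_LIMIT  0x300000u   /* 3 MB, as in the plan above */

/* Fill one page table (it covers 4 MB) so that 0..KERNEL_LIMIT maps to
   itself, supervisor-only. Entries above the limit are left not-present. */
static void map_kernel_identity(uint32_t table[1024])
{
    assert(KERNEL_LIMIT % PAGE_SIZE == 0);
    for (uint32_t i = 0; i < 1024; i++) {
        uint32_t addr = i * PAGE_SIZE;
        table[i] = (addr < KERNEL_LIMIT)
                 ? (addr | PTE_PRESENT | PTE_WRITABLE)
                 : 0;
    }
}

/* Index of the page directory entry that covers a virtual address. */
static uint32_t pde_index(uint32_t vaddr)
{
    return vaddr >> 22;
}
```

One thing the sketch makes visible: `pde_index(0x300000)` is still 0, so a 3 MB boundary falls inside the first page table. If the boundary were moved to 4 MB, the kernel's page table could be referenced unchanged from every process's page directory, and only the remaining directory entries would differ between tasks.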

6. Is this as confusing as I think it is? ;D

Not when you get the hang of it.

Therx · Post by **Therx** » Sun Aug 10, 2003 5:25 am

You should have 4 segments: ring 0 code and data, and ring 3 code and data. Plus the null descriptor, of course.

Where would these segments go from and to? If they don't both go from 0 to FFFFFFFF then I thought there were problems changing segments in C.

INT to go to a lower (more privileged) level, IRET to go to a higher (less privileged) level.

Uh, so after I've restored all the registers for an app (ring 3), I'd do asm("iret"); as normal, but to change to a kernel task I'd do asm("int");? This doesn't make sense to me.

Thanks Tim, I think this is all slowly starting to sink in.

Pete

Solar · Post by **Solar** » Sun Aug 10, 2003 5:44 am

Therx wrote: Where would these segments go from and to?

I would suggest "IA-32 Intel Architecture Software Developer's Manual", Volume 3, chapter 3.2 (Using Segments).

Uh, so after I've restored all the registers for an app (ring 3), I'd do asm("iret"); as normal, but to change to a kernel task I'd do asm("int");? This doesn't make sense to me.

Personal opinion: Handle such "gory" stuff in all-assembler, instead of inline assembler. Unless you really know your stuff, the compiler will bite you, especially when it comes to function calling / returning.

Tim · Post by **Tim** » Sun Aug 10, 2003 5:58 am

Therx wrote: Where would these segments go from and to? If they don't both go from 0 to FFFFFFFF then I thought there were problems changing segments in C.

Right, they would all have base = 0 and limit = 4 GB. They would all look at the same memory, but with different access permissions (some code, some data, some ring 0, some ring 3).

Uh, so after I've restored all the registers for an app (ring 3), I'd do asm("iret"); as normal, but to change to a kernel task I'd do asm("int");? This doesn't make sense to me.
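For readers who build the GDT from C rather than by hand, the following sketch packs a base, limit, access byte and flags nibble into the 8-byte descriptor format. `make_descriptor` is a name made up for this example; the bit positions follow the segment descriptor layout in the Intel manual.

```c
#include <assert.h>
#include <stdint.h>

/* Pack base, 20-bit limit, access byte and 4-bit flags (G, D/B, L, AVL)
   into the 8-byte segment descriptor layout. */
static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu && flags <= 0xFu);
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);              /* limit 15:0      */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base 23:0       */
    d |= (uint64_t)access << 40;                   /* access byte     */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* limit 19:16     */
    d |= (uint64_t)flags << 52;                    /* G, D/B, L, AVL  */
    d |= (uint64_t)(base >> 24) << 56;             /* base 31:24      */
    return d;
}
```

With base 0, limit FFFFFh and flags Ch (4 KB granularity, 32-bit), the four flat segments differ only in the access byte: `make_descriptor(0, 0xFFFFF, 0x9A, 0xC)` for ring 0 code, and 92h, FAh and F2h for ring 0 data, ring 3 code and ring 3 data.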

Think of it in terms of entering and leaving the kernel. To execute a system call from user mode, you use INT. This will execute the (assembly language) interrupt handler in the kernel, which can call the syscall handler to do what it needs to. The last instruction in the interrupt handler will be IRET, which resumes execution in user mode.

As far as applications are concerned, the kernel is just a collection of subroutines that you call using INT.
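To show where the "INT enters the kernel" behaviour is configured, here is a hedged C sketch that encodes a 32-bit interrupt gate for the IDT. `make_idt_gate` and the constant names are invented for illustration, and the kernel code selector value used in the note below (10h) is an assumption about the GDT layout, not something fixed by the CPU.

```c
#include <assert.h>
#include <stdint.h>

#define GATE_PRESENT  0x80u
#define GATE_INT32    0x0Eu   /* type field: 32-bit interrupt gate */

/* Encode an IDT entry. `selector` is the code segment the CPU loads into CS
   when the interrupt fires; `dpl` is the least privileged ring that is
   allowed to raise this vector with a software INT. */
static uint64_t make_idt_gate(uint32_t handler, uint16_t selector, uint8_t dpl)
{
    assert(dpl <= 3);
    uint8_t type_attr = (uint8_t)(GATE_PRESENT | (dpl << 5) | GATE_INT32);
    uint64_t g = 0;
    g |= (uint64_t)(handler & 0xFFFFu);        /* offset 15:0        */
    g |= (uint64_t)selector << 16;             /* target CS selector */
    g |= (uint64_t)type_attr << 40;            /* P, DPL, gate type  */
    g |= (uint64_t)(handler >> 16) << 48;      /* offset 31:16       */
    return g;
}
```

A system call vector would typically use a ring 0 code selector with gate DPL 3, so that user code may execute the INT but lands in ring 0 code. Hardware interrupt vectors can keep gate DPL 0, which stops applications from raising them in software.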

Therx · Post by **Therx** » Sun Aug 10, 2003 6:00 am

Ok

Code:

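KERN_DATA equ $-gdt           ; Ring 0 Data
         dw  0FFFFh           ; limit 15:0
         dw  0                ; base 15:0
         db  0                ; base 23:16
         db  92h              ; present, DPL 0, data, writable
         db  0CFh             ; 4 KB granularity, 32-bit, limit 19:16 = Fh
         db  0                ; base 31:24

KERN_CODE equ $-gdt           ; Ring 0 Code
         dw  0FFFFh
         dw  0
         db  0
         db  9Ah              ; present, DPL 0, code, readable
         db  0CFh
         db  0

APP_DATA equ $-gdt           ; Ring 3 Data
         dw  0FFFFh
         dw  0
         db  0
         db  0F2h             ; present, DPL 3, data, writable (leading 0 so the assembler reads a number, not a symbol)
         db  0CFh
         db  0

APP_CODE equ $-gdt           ; Ring 3 Code
         dw  0FFFFh
         dw  0
         db  0
         db  0FAh             ; present, DPL 3, code, readable
         db  0CFh
         db  0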
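One way to sanity-check the access bytes in a listing like the one above is to decode them field by field. The C sketch below does that for code and data descriptors only; `struct access_info` and `decode_access` are hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>

struct access_info {
    int present;   /* bit 7 */
    int dpl;       /* bits 6:5, the ring the segment belongs to */
    int is_code;   /* bit 3: 1 = code segment, 0 = data segment */
    int rw;        /* bit 1: readable (code) or writable (data) */
};

/* Decode the access byte of a code or data descriptor. */
static struct access_info decode_access(uint8_t access)
{
    assert(access & 0x10);   /* S bit set: not a system descriptor */
    struct access_info a;
    a.present = (access >> 7) & 1;
    a.dpl     = (access >> 5) & 3;
    a.is_code = (access >> 3) & 1;
    a.rw      = (access >> 1) & 1;
    return a;
}
```

Decoding 92h, 9Ah, F2h and FAh this way gives present segments with DPL 0, 0, 3 and 3, the first and third being writable data and the second and fourth readable code.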

Would those segment entries (with the null descriptor) give me the right GDT? I think it's right, having read John Fine's tutorial on the GDT, LDT and IDT. But I still really don't get this changing ring level business. If I'm meant to do an INT to stay at a highly privileged level, then why in my current 'OS' do I do an 'iret' at the end of my longjmp and remain in ring 0? Surely from what Tim said that would demote me to ring 1, 2 and eventually 3.

Thanks for all the help. I did have the Intel manual, but as with most Intel things, finding the relevant part is a bit of a struggle.

Pete

Therx · Post by **Therx** » Sun Aug 10, 2003 6:07 am

Sorry Tim. Missed your second post. I know that an int handler can be set to whatever privilege level you want in the IDT, and that calling an int handler to do something would go to ring 0 and be fine. But if there is a driver (e.g. VGA) which works as a task, how do I make it so that when I switch to that task I go into ring 0, and then on the next switch to a normal app it returns to ring 3?

Thanks

pete

Tim · Post by **Tim** » Sun Aug 10, 2003 7:12 am

Give ring 0 tasks different selector values from ring 3 tasks.

Therx · Post by **Therx** » Sun Aug 10, 2003 7:18 am

???
Huh,

Selector values? I'm confused. Does any code in memory defined as 'kernel' in the page tables run as ring 0, and code in 'user' memory run in ring 3?

Tim · Post by **Tim** » Sun Aug 10, 2003 8:37 am

No. Segmentation is still in effect even if you use flat descriptors. The DPL of CS still determines the current ring; that is, CPL is always the DPL of the descriptor that CS points to.

The reason INT works to move from ring 3 to ring 0 is because the CPU takes the selector from the IDT entry and assigns it to CS. If the selector in the IDT entry is a ring 0 selector, then the CPU will switch to ring 0 when that interrupt happens. Similarly, on IRET, the CPU reloads the previous selector from the stack, which usually causes a switch back to ring 3. If the selector on the stack is a ring 0 selector (because an interrupt happened when the CPU was already in kernel mode) the CPU will stay in kernel mode.
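Since "selector values" caused some confusion earlier in the thread, here is a short C sketch of how a selector is put together. The helper names are made up; the GDT indices in the note below assume the order of the listing posted above (null, ring 0 data, ring 0 code, ring 3 data, ring 3 code).

```c
#include <assert.h>
#include <stdint.h>

/* A selector is: descriptor index (bits 15:3), table indicator (bit 2,
   0 = GDT) and requested privilege level (bits 1:0). */
static uint16_t make_selector(unsigned index, unsigned rpl)
{
    assert(index < 8192 && rpl <= 3);
    return (uint16_t)((index << 3) | rpl);   /* TI = 0: look in the GDT */
}

static unsigned selector_rpl(uint16_t sel)
{
    return sel & 3u;
}

static unsigned selector_index(uint16_t sel)
{
    return sel >> 3;
}
```

Under that assumed ordering, a kernel task would run with CS = `make_selector(2, 0)` (10h) and a user task with CS = `make_selector(4, 3)` (23h) and data selectors of `make_selector(3, 3)` (1Bh). Which of these values IRET finds on the stack decides the ring the task resumes in.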

Therx · Post by **Therx** » Sun Aug 10, 2003 9:49 am

From Intel Section 4.5 of Volume 3 of the Software Developers Manual:-

Current privilege level (CPL). The CPL is the privilege level of the currently executing program or task. It is stored in bits 0 and 1 of the CS and SS segment registers. Normally, the CPL is equal to the privilege level of the code segment from which instructions are being fetched. The processor changes the CPL when program control is transferred to a code segment with a different privilege level. The CPL is treated slightly differently when accessing conforming code segments. Conforming code segments can be accessed from any privilege level that is equal to or numerically greater (less privileged) than the DPL of the conforming code segment. Also, the CPL is not changed when the processor accesses a conforming code segment that has a different privilege level than the CPL.

If the CPL is stored in bits 1 & 0 of CS then surely it will always be the same as the segment from where the instruction is fetched as that is always CS.

What stops an app (ring 3) changing the bits 1 & 0 of CS to 0 and doing:-

Code:

jmp KERN_CODE:me
me: ; Now we're ring 0

From what I've read, the CPL will be changed to 0 and they'll be able to do everything. Or is it impossible to edit CS or SS while in ring 3?

If so, how do I change the ring of a task? Can I get a ring 0 task to change the stored CS and SS registers to a kernel segment and change the first 2 bits (because if the kernel and app segments have the same base and limit, the offset will be the same)?

And section 3.4.1 states that bits 0 & 1 of segment selectors are the RPL but I guess this is only for DS, ES, FS and GS.

Thanks for any hints

Pete

Tim · Post by **Tim** » Sun Aug 10, 2003 10:01 am

Therx wrote: What stops an app (ring 3) changing the bits 1 & 0 of CS to 0 and doing:-

Code:

jmp KERN_CODE:me
me: ; Now we're ring 0

From what I've read, the CPL will be changed to 0 and they'll be able to do everything. Or is it impossible to edit CS or SS while in ring 3?

If you want to go more privileged, you have to do it through an interrupt, call or task gate. If you want to go less privileged, you have to do it through an IRET or a far RET. An attempt to do a far JMP or CALL straight to a (non-conforming) ring 0 code segment, like your code does, will result in a general protection fault.

If so, how do I change the ring of a task? Can I get a ring 0 task to change the stored CS and SS registers to a kernel segment and change the first 2 bits (because if the kernel and app segments have the same base and limit, the offset will be the same)?
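The rule behind that fault can be written down as a small predicate. The C sketch below models only the privilege check for a far JMP or far CALL that targets a code segment directly, with no gate involved; it is a simplified reading of the checks described in the Intel manual, and `direct_far_transfer_allowed` is a name invented here.

```c
#include <assert.h>

/* Privilege check for a far JMP / far CALL straight to a code segment.
   cpl: current privilege level, rpl: low two bits of the target selector,
   dpl: privilege level in the target descriptor. Returns 1 if allowed. */
static int direct_far_transfer_allowed(int cpl, int rpl, int dpl,
                                       int conforming)
{
    assert(cpl >= 0 && cpl <= 3);
    assert(rpl >= 0 && rpl <= 3);
    assert(dpl >= 0 && dpl <= 3);
    if (conforming)
        return dpl <= cpl;             /* allowed, but CPL stays unchanged */
    return dpl == cpl && rpl <= cpl;   /* must already be in the target ring */
}
```

For the `jmp KERN_CODE:me` example, the caller has CPL 3 and the target descriptor has DPL 0, so the non-conforming branch returns 0 and the CPU would raise a general protection fault instead of switching rings.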

Yes, but you usually won't need to change a task's ring. You'll find it easier to have separate user-mode and kernel-mode tasks. Remember, a user-mode task can run in kernel mode within an interrupt handler.

Therx · Post by **Therx** » Sun Aug 10, 2003 11:11 am

Thanks everyone it's much clearer now.

So for a task switch I'll do:-

1. call setjmp
2. if it doesn't return 0 then return
3. work out next task
4. load new page tables
5. call longjmp for next task

setjmp:-
1. pushad - General Purpose Registers
2. pushfd - EFLAGs Register (interrupts etc.)
3. push CS - Code Selector
4. push SS - Stack Selector
5. push DS to GS - Data Selectors
6. push EIP - Instruction Pointer
7. move ESP into a store variable for the task
8. return 0

longjmp:-
1. move variable for new task to ESP
2. pop EIP
3. pop GS to DS
...
7. popad
8. move 1 into EAX (return value)
9. iret

Is this all good? Or have I overlooked something again?

Thanks Tim

Pete

Tim · Post by **Tim** » Sun Aug 10, 2003 11:27 am

This is OK, but it doesn't consider ring switching. setjmp and longjmp are OK for task switching within one program, but they break down when you start trapping into the kernel.

setjmp becomes an INT instruction plus the first half of the interrupt handler (saving registers). longjmp becomes the second half of the interrupt handler (restoring registers) plus IRET.

So, to task switch:
1. Issue a task switch interrupt

Task switch interrupt handler:
1. pushad - General Purpose Registers
2. push DS, ES, FS and GS
3. move ESP into a store variable for the task
4. choose a new task
5. move variable for new task to ESP
6. pop GS, FS, ES and DS
7. popad
8. iret

Points to note:
-- No need to push CS, EIP or EFLAGS, as the CPU has already done this before the handler is called
-- Can't pop eip at all (no such instruction)
-- Mustn't modify EAX at the end of the interrupt handler, because the interrupt could be called at any point (must save and restore all registers)
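The handler steps above imply a fixed stack layout for every suspended task, which can be described as a C struct. The sketch below is one possible way to express it, assuming 32-bit pushes throughout and the selector values 23h (ring 3 code) and 1Bh (ring 3 data), which follow from the GDT order posted earlier in the thread; the struct, the constants and `init_user_frame` are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Saved state as it lies in memory from the saved ESP upwards. */
struct task_frame {
    uint32_t gs, fs, es, ds;                 /* pushed last by the handler */
    uint32_t edi, esi, ebp, esp_ignored;     /* pushad, lowest address first */
    uint32_t ebx, edx, ecx, eax;
    uint32_t eip, cs, eflags;                /* pushed by the CPU on INT */
    uint32_t user_esp, user_ss;              /* only present on a ring change */
};

#define USER_CODE_SEL   0x23u   /* assumed: GDT index 4, RPL 3 */
#define USER_DATA_SEL   0x1Bu   /* assumed: GDT index 3, RPL 3 */
#define EFLAGS_IF       0x200u  /* interrupts enabled */
#define EFLAGS_RESERVED 0x2u    /* bit 1 always reads as 1 */

/* Build the frame for a task that has never run, so that the handler's
   pop GS..DS / popad / iret sequence starts it in ring 3 at `entry`. */
static void init_user_frame(struct task_frame *f,
                            uint32_t entry, uint32_t stack_top)
{
    assert(f != NULL);
    memset(f, 0, sizeof *f);
    f->gs = f->fs = f->es = f->ds = USER_DATA_SEL;
    f->eip      = entry;
    f->cs       = USER_CODE_SEL;
    f->eflags   = EFLAGS_IF | EFLAGS_RESERVED;
    f->user_esp = stack_top;
    f->user_ss  = USER_DATA_SEL;
}
```

A kernel-mode task would use ring 0 selectors in the same frame and would not have the last two fields, since the CPU only pushes and pops SS:ESP when the privilege level changes. Note also that even with software task switching, the CPU reads the ring 0 stack pointer (SS0:ESP0) from a TSS when an interrupt arrives in ring 3, so at least one TSS is still needed.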

OSDev.org
