
Switching Segments Causes Page Fault

Posted: Mon Nov 14, 2005 6:58 pm
by TheChuckster
I'm trying to implement user task protection by having my tasks run in ring 3. Before I can do that, though, I need to at least be able to change segments. I try to do this by setting up my stack to use the new segments so that the task switcher pops them into place.

My GDT has the following entries because I haven't even bothered enabling ring 3 yet:

Code: Select all

gp.limit = (sizeof(struct gdt_entry) * 5) - 1;
gp.base = &gdt;

gdt_set_gate(0, 0, 0, 0, 0);
gdt_set_gate(1, 0, 0xFFFFFFFF, 0x9A, 0xCF); // Ring 0 (kernel) CS and DS
gdt_set_gate(2, 0, 0xFFFFFFFF, 0x92, 0xCF);

gdt_set_gate(3, 0, 0xFFFFFFFF, 0x9A, 0xCF); // Ring 3 (user processes) CS and DS
gdt_set_gate(4, 0, 0xFFFFFFFF, 0x92, 0xCF);
Well, my goal is to have my tasks now run on entry 3 (CS) and entry 4 (DS) instead of 1 and 2 like the kernel. 0x18 and 0x20 are my selectors for code and {DS, ES, FS, GS} respectively. Now the problem is, I get a weird page fault when I try to run this code. My tasks are now trying to access the bogus address 0x76000008. Surely the permissions on my pages are right, since I haven't ventured into the realm of ring 3. What is messing up my paging?
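
For reference, the selector values mentioned in this thread follow directly from the GDT index and the requested privilege level. The sketch below is only an illustration: `gdt_selector` and `access_dpl` are hypothetical helper names, not part of the poster's kernel.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): build a segment
 * selector from a GDT index and a requested privilege level (RPL).
 * Bits 15..3 hold the index, bit 2 is the table indicator (0 = GDT),
 * bits 1..0 hold the RPL. */
static inline uint16_t gdt_selector(unsigned index, unsigned rpl)
{
    return (uint16_t)((index << 3) | (rpl & 0x3u));
}

/* Extract the descriptor privilege level (bits 6..5) from the access
 * byte that is passed as the fourth argument of gdt_set_gate(). */
static inline unsigned access_dpl(uint8_t access)
{
    return (access >> 5) & 0x3u;
}
```

With these, entries 3 and 4 give 0x18 and 0x20 at RPL 0, and 0x1B and 0x23 at RPL 3. The access bytes 0x9A and 0x92 decode to DPL 0, which matches the statement that ring 3 is not enabled yet.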

Re:Switching Segments Causes Page Fault

Posted: Tue Nov 15, 2005 12:19 pm
by fraserjgordon
I'd guess the problem is in gdt_set_gate() so we will need to see the code for it to be of much help.

Re:Switching Segments Causes Page Fault

Posted: Tue Nov 15, 2005 6:50 pm
by Slasher
Are your level 3 tasks linked with your kernel, i.e. as functions?

You will get a page fault because you are trying to run code in a supervisor-mapped memory region as a user task in level 3. I had this problem.

Try this:-
Once you have loaded the kernel that contains the level 3 code, copy the level 3 test code (user program) to a memory region that you have mapped as USER memory.

Then build the process with that memory address.

Hope this helps.

Re:Switching Segments Causes Page Fault

Posted: Tue Nov 15, 2005 6:53 pm
by TheChuckster
I'm not even doing Ring 3 yet. Everything's ring 0. The only difference is the GDT entry being used.

My tasks only call the kernel through system calls. They are not linked with the kernel. They are separate ELF binaries.

Re:Switching Segments Causes Page Fault

Posted: Thu Nov 17, 2005 4:00 pm
by TheChuckster
Fixed that. Now I am having a problem with leaving interrupts causing a GPF. These interrupts are the task switch timer interrupts.

I see some anomalies with the registers at the crash site.

My data segments (DS-GS) are set to the ring 3 data segment 0x23 while my code segment (CS) is set to the ring 0 code segment 0x08. Shouldn't they be both set to either one or the other?

My stack segment (SS) is set to my ring 0 code segment. Why is my stack segment selector getting set to it instead of my code segment selector?

Here is the Bochs disassembly of the instructions (my interrupt handler -- what a coincidence) up to and including the crash EIP 0x001001d9:

Code: Select all

001001b6: pushad                    ; 60
001001b7: push ds                   ; 1e
001001b8: push es                   ; 06
001001b9: push fs                   ; 0fa0
001001bb: push gs                   ; 0fa8
001001bd: mov eax, 0x10             ; b810000000
001001c2: mov ds, ax                ; 8ed8
001001c4: mov es, ax                ; 8ec0
001001c6: mov fs, ax                ; 8ee0
001001c8: mov gs, ax                ; 8ee8
001001ca: push esp                  ; 54
001001cb: call .+0x103127           ; e8572f0000
001001d0: mov esp, eax              ; 89c4
001001d2: pop gs                    ; 0fa9
001001d4: pop fs                    ; 0fa1
001001d6: pop es                    ; 07
001001d7: pop ds                    ; 1f
001001d8: popad                     ; 61
001001d9: iretd                     ; cf
Something's telling me I'm missing out on some obvious quirk of the Intel architecture in my interrupt handler. Should I be doing a far jump to set the CS? My task stack frame is setting the CS to the ring 3 value, but it's appearing in SS. What gives?

Here's how I set my stack frame:

Code: Select all

stack = (unsigned int*)Tasks[id].esp0;

if(!VM86)
        *--stack = 0x0202; //This is EFLAGS
else
        *--stack = 0x20202;

if (ring == 0)
        *--stack = 0x08;   //This is CS, our code segment
else
        *--stack = 0x1B;

*--stack = (unsigned int)thread; //This is EIP
 
// General registers in PUSHAD order (EAX is pushed first, EDI last)
*--stack = 0; //EAX
*--stack = 0; //ECX
*--stack = 0; //EDX
*--stack = 0; //EBX
*--stack = (unsigned int)Tasks[id].esp3; //ESP slot (skipped by popad)
*--stack = 0; //EBP
*--stack = 0; //ESI
*--stack = 0; //EDI
 
if (ring == 0)
{
        *--stack = 0x10; //DS
        *--stack = 0x10; //ES
        *--stack = 0x10; //FS
        *--stack = 0x10; //GS
} else {
        *--stack = 0x23; //DS
        *--stack = 0x23; //ES
        *--stack = 0x23; //FS
        *--stack = 0x23; //GS
}
 
Tasks[id].esp0 = (unsigned int)stack;
[New:] Argh, looks like the Mega-Tokyo forum database got messed up and my post got lost in the process. Had to retype the whole thing. I'm saving my lengthy posts on my hard drive from now on.
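
As a reference for the frame discussed in this post, here is a minimal sketch of the layout that the handler's `pop gs` / `pop fs` / `pop es` / `pop ds` / `popad` / `iretd` sequence consumes, written as a C struct with the lowest address first. The names `task_frame` and `task_frame_init` are hypothetical; the selector and EFLAGS values are the ones used in the thread. One detail worth checking against the code above: when `iretd` returns to a less privileged ring it also pops ESP and SS, so a ring 3 frame needs two more words above EFLAGS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, lowest address first, matching the pop order of
 * the interrupt handler shown in the disassembly. */
struct task_frame {
    uint32_t gs, fs, es, ds;            /* pop gs, pop fs, pop es, pop ds */
    uint32_t edi, esi, ebp, esp_ignored,
             ebx, edx, ecx, eax;        /* popad order; the ESP slot is skipped */
    uint32_t eip, cs, eflags;           /* always popped by iretd */
    uint32_t user_esp, user_ss;         /* popped only on a return to an outer ring */
};

static_assert(sizeof(struct task_frame) == 17 * 4, "frame is 17 dwords");

/* Fill a frame for a new task. For ring 0 the last two fields are never
 * popped by iretd and are simply unused words above the frame. */
static inline void task_frame_init(struct task_frame *f, uint32_t entry,
                                   uint32_t user_stack, int ring)
{
    uint32_t code = (ring == 0) ? 0x08u : 0x1Bu;  /* selectors from the thread */
    uint32_t data = (ring == 0) ? 0x10u : 0x23u;

    *f = (struct task_frame){0};
    f->gs = f->fs = f->es = f->ds = data;
    f->eip = entry;
    f->cs = code;
    f->eflags = 0x202u;                 /* IF set, reserved bit 1 set */
    f->user_esp = user_stack;
    f->user_ss = data;
}
```

The address of the `gs` field is the value the task switcher would load into ESP before popping, under the assumption that the handler's pop order is exactly the one shown above.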

Re:Switching Segments Causes Page Fault

Posted: Thu Nov 17, 2005 4:33 pm
by JAAman
TheChuckster wrote: [New:] Argh, looks like the Mega-Tokyo forum database got messed up and my post got lost in the process. Had to retype the whole thing. I'm saving my lengthy posts on my hard drive from now on.
ya, prob a good idea: it looks like a couple of replies i made got lost somewhere (and i couldn't get back for almost 24hrs!)


it looks like you're trashing your stack:
001001ca: push esp ; 54
001001cb: call .+0x103127 ; e8572f0000
001001d0: mov esp, eax ; 89c4
001001d2: pop gs ; 0fa9
you have a push esp but no pop esp?
this moves your whole frame up one machine-word:

Code: Select all

pop gs <- old(esp)
pop fs <- old(gs)
pop es <- old(fs)
pop ds <- old(es)
popad <- 
       edi <- old(ds)
       esi <- old(edi)
       ebp <- old(esi)
       --- <- old(ebp)               -- this is ignored
       ebx <- old(esp)
       edx <- old(ebx)
       ecx <- old(edx)
       eax <- old(ecx)

iret then takes the
eip<- old (eax)
cs <-old(eip)
eflags<-old(cs)
esp<-old(eflags)
ss<-old(esp)

and old(ss) is left on stack
and because this 'pops' the wrong cs:eip and ss:esp, this will likely cause a crash

Re:Switching Segments Causes Page Fault

Posted: Thu Nov 17, 2005 5:28 pm
by TheChuckster
GCC calling convention allows it to work fine. It's only when I throw ring 3 into the mix that things go awry. If the stack were being trashed, it would crash even with ring 0.

Re:Switching Segments Causes Page Fault

Posted: Thu Nov 17, 2005 9:23 pm
by JAAman
GCC calling convention allows it to work fine
how is this? maybe someone could explain this to me?

i always thought C functions returned with a 'ret' but that would not permit this! this means that functions return with a:

Code: Select all

mov edx,[return_pointer_from_stack]
jmp edx
is this really how it happens? cause this sounds strange (and highly inefficient) to me
i'm not a C wizard (i currently use mostly asm)

i guess it could instead use a jump indirect

Re:Switching Segments Causes Page Fault

Posted: Fri Nov 18, 2005 2:17 pm
by TheChuckster
The stack push/pop pairing is definitely not the problem if the isolating factor is the ring 3 tasks. If the isolating factor were, say, multitasking altogether, then fixing it would quite possibly be a solution.

Edit: Come to think of it, I have no TSS. Would a TSS be necessary for implementing ring 3 tasks even though I'm using software task switching as opposed to TSS-based? If so, what would be a nonintrusive placeholder I could set it to since I'm not using the TSS?

Re:Switching Segments Causes Page Fault

Posted: Fri Nov 18, 2005 7:04 pm
by JAAman
Would a TSS be necessary for implementing ring 3 tasks even though I'm using software task switching as opposed to TSS-based? If so, what would be a nonintrusive placeholder I could set it to since I'm not using the TSS?
most definitely!

any time you use ring 3 you must have a TSS:
the CPU automatically fetches the new ss:esp from the TSS on any transition into a ring != 3 (for example, when an interrupt arrives while ring 3 code is running), therefore a TSS is required as soon as you start working in any ring != 0


[ref]
vol.3, 5.12.1 (pg 5-15 -- though im currently using an outdated version)
[/ref]
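
To make the point about the TSS concrete, here is a minimal sketch of the 32-bit TSS as a C struct, assuming a kernel that uses software task switching and therefore only needs SS0 and ESP0 to be valid. The names `tss32` and `tss_set_kernel_stack` are hypothetical, and 0x10 is the ring 0 data selector used elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit TSS layout. With software task switching the CPU only reads
 * SS0:ESP0 (on entry to ring 0 from an outer ring) and the I/O map base;
 * the register-save fields stay unused. */
struct tss32 {
    uint16_t back_link, reserved0;
    uint32_t esp0;
    uint16_t ss0, reserved1;
    uint32_t esp1;
    uint16_t ss1, reserved2;
    uint32_t esp2;
    uint16_t ss2, reserved3;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4;
    uint16_t cs, reserved5;
    uint16_t ss, reserved6;
    uint16_t ds, reserved7;
    uint16_t fs, reserved8;
    uint16_t gs, reserved9;
    uint16_t ldt, reserved10;
    uint16_t debug_trap;
    uint16_t iomap_base;
};

static_assert(sizeof(struct tss32) == 104, "32-bit TSS is 104 bytes");

/* Hypothetical helper: record the kernel stack that the CPU should
 * switch to when an interrupt or exception arrives in ring 3. */
static inline void tss_set_kernel_stack(struct tss32 *tss, uint32_t kernel_esp)
{
    tss->ss0 = 0x10;        /* ring 0 data selector used in this thread */
    tss->esp0 = kernel_esp;
}
```

In a design like this, a scheduler would typically call the helper on every task switch so that each task's interrupts land on that task's own kernel stack; that usage is an assumption about the kernel's structure, not something stated in the thread.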


The stack push/pop pairing is definitely not the problem if the isolating factor is the ring 3 tasks. If the isolating factor were, say, multitasking altogether, then fixing it would quite possibly be a solution.
didn't mean to imply that in my last post, i was simply requesting help understanding c calling convention, as it was my understanding that it returned with a ret

however on rechecking vol.2:
the x86 has a 'ret imm8' form of return which i'd forgotten about (never used it), however (afaik) most CPUs don't have this form (though i could be wrong, my knowledge is limited when it comes to other architectures)

Re:Switching Segments Causes Page Fault

Posted: Sat Nov 19, 2005 10:34 am
by TheChuckster
Okay I have a TSS, please verify that this is correct code. You really can't be sure when you're writing an OS. Sure it compiles, but is it working? Notice that I have set SS0 to my ring 0 data segment selector.

Code: Select all

global tss
tss:
   dw 0, 0         ; back link
tss_esp0:
   esp0 dd 0      ; ESP0
   dw 10h, 0      ; SS0, reserved

   dd 0         ; ESP1
   dw 0, 0         ; SS1, reserved

   dd 0         ; ESP2
   dw 0, 0         ; SS2, reserved

   dd 0         ; CR3
   dd 0, 0         ; EIP, EFLAGS
   dd 0, 0, 0, 0      ; EAX, ECX, EDX, EBX
   dd 0, 0, 0, 0      ; ESP, EBP, ESI, EDI
   dw 0, 0         ; ES, reserved
   dw 0, 0         ; CS, reserved
   dw 0, 0         ; SS, reserved
   dw 0, 0         ; DS, reserved
   dw 0, 0         ; FS, reserved
   dw 0, 0         ; GS, reserved
   dw 0, 0         ; LDT, reserved
   dw 0, 103      ; debug, IO permission bitmap
Cut to where I'm loading my GDT. I am really unsure about this part. What this code does is set esp0 to the current (ring 0) esp value and load the TSS selector.

Code: Select all

lgdt [gp]

   mov [esp0], esp
   mov ax, 0x28
   ltr ax
I stuck in a few lines to store the ring 0 esp in my TSS and load the TSS selector. Is that all I need to do to have a working TSS? Do I need to stick code in my interrupt handler for anything TSS-related?

New GDT (should I just scrap the C gdt code and go to assembly?):

Code: Select all

    gdt_set_gate(0, 0, 0, 0, 0);
    gdt_set_gate(1, 0, 0xFFFFFFFF, 0x9A, 0xCF); // Ring 0 (kernel) CS and DS
    gdt_set_gate(2, 0, 0xFFFFFFFF, 0x92, 0xCF);

    gdt_set_gate(3, 0, 0xFFFFFFFF, 0xFA, 0xCF); // Ring 3 (user processes) CS and DS
    gdt_set_gate(4, 0, 0xFFFFFFFF, 0xF2, 0xCF);

    gdt_set_gate(5, tss, tss+103, 0x89, 0xCF); // TSS descriptor
Okay, is this enough to have a working TSS? It doesn't seem like it, because I'm still getting a GPF and the Bochs CPU state dump is saying it's invalid:

Code: Select all

ldtr:s=0x0000, dl=0x00000000, dh=0x00000000, valid=0
tr:s=0x0028, dl=0x00000067, dh=0x00808900, valid=1
I noticed I forgot to set my EFLAGS for my user tasks to have a ring 3 IOPL. I now start out with an EFLAGS value of 0x3202.
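
The EFLAGS values that appear in this thread (0x0202, 0x20202 and 0x3202) can be rebuilt from their individual bits. The sketch below is illustrative only; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define EFLAGS_RESERVED1  (1u << 1)    /* bit 1 always reads as 1 */
#define EFLAGS_IF         (1u << 9)    /* interrupts enabled */
#define EFLAGS_IOPL(n)    (((uint32_t)(n) & 3u) << 12)  /* I/O privilege level */
#define EFLAGS_VM         (1u << 17)   /* virtual-8086 mode */

/* Hypothetical helper: initial EFLAGS for a new task's iret frame. */
static inline uint32_t initial_eflags(unsigned iopl, int vm86)
{
    uint32_t flags = EFLAGS_RESERVED1 | EFLAGS_IF | EFLAGS_IOPL(iopl);
    if (vm86)
        flags |= EFLAGS_VM;
    return flags;
}
```

Note that IOPL controls whether code at a given privilege level may execute I/O instructions and `cli`/`sti`; a ring 3 task can run with IOPL 0, so setting IOPL to 3 is a policy choice and not a requirement for entering ring 3.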

However, with these new changes in place, I am STILL getting a GPF when I 'iret'. I probably am missing something obvious but necessary with the Intel architecture to have this working. Could anyone clue me in?

How much does it cost to order a printed version of the Intel manual? The "shopping cart" they have doesn't display the price.

Re:Switching Segments Causes Page Fault

Posted: Sat Nov 19, 2005 11:54 am
by Cjmovie
I'm not 100% sure you're setting the TSS descriptor in there correctly, make sure it follows the format in section 6.2.2 of the Intel Manuals (Not sure for AMD one).

Re:Switching Segments Causes Page Fault

Posted: Sat Nov 19, 2005 2:15 pm
by Cjmovie
I'm rather sure there's something fishy with the value of 0xCF -> If I'm correct, this translates into the area that contains the Granularity bit etc. Meaning, in binary it is:

11001111

Which would define the wrong segment. TSS's should have the following structure for this byte:
G - most significant bit (1 bit)
"00" - 2 bits set to 0. Intel 'reserved'
AVL - 1 bit, Available for whatever you want it for
Limit - bits 19-16 of limit for segment

So yours would say it's present, has a DPL of 2, a type of "1111", with the reserved bit set to "0".

So you instead want to use a value (to replace the 0xCF) like 0x80. Then, because the 'G' field is set, you should set the limit to 103, not TSS+103 (because G makes it an offset from the base address, IIRC)

Of course, I could be totally wrong
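
For readers following the bit-level discussion, here is a sketch of how an 8-byte descriptor is packed from a base, a 20-bit limit, an access byte and the four flag bits (G, D/B, L, AVL). `make_descriptor` is a hypothetical function written for this illustration; it is not the `gdt_set_gate()` used in the thread, whose source is not shown.

```c
#include <stdint.h>

/* Hypothetical encoder for a segment or TSS descriptor.
 * flags holds G, D/B, L, AVL in its low four bits (G is bit 3). */
static inline uint64_t make_descriptor(uint32_t base, uint32_t limit20,
                                       uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit20 & 0xFFFFu);             /* limit 15..0  -> bits 15..0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;        /* base 23..0   -> bits 39..16 */
    d |= (uint64_t)access << 40;                    /* access byte  -> bits 47..40 */
    d |= (uint64_t)((limit20 >> 16) & 0xFu) << 48;  /* limit 19..16 -> bits 51..48 */
    d |= (uint64_t)(flags & 0xFu) << 52;            /* G, D/B, L, AVL -> bits 55..52 */
    d |= (uint64_t)(base >> 24) << 56;              /* base 31..24  -> bits 63..56 */
    return d;
}
```

A flat ring 0 code segment (base 0, limit 0xFFFFF, access 0x9A, flags 0xC) packs to 0x00CF9A000000FFFF, which is where the 0xCF byte comes from: flags 0xC in the high nibble and limit bits 19..16 in the low nibble. The `tr` line in the Bochs dump earlier in the thread (dh=0x00808900, dl=0x00000067) corresponds to base 0, limit 0x67, access 0x89 and only the G flag set.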

Re:Switching Segments Causes Page Fault

Posted: Sun Nov 20, 2005 1:21 pm
by Candy
TheChuckster wrote: How much does it cost to order a printed version of the Intel manual? The "shopping cart" they have doesn't display the price.
A finite amount of nothing.

It's free. Even the shipping etc., and even to Europe. At least, it was 2 years ago.

Re:Switching Segments Causes Page Fault

Posted: Sun Nov 20, 2005 1:29 pm
by Candy
Cjmovie wrote: So you instead want to use a value (to replace the 0xCF) like 0x80. Then, because the 'G' field is set, you should set the limit to 103, not TSS+103 (because G makes it an offset from the base address, IIRC)

Of course, I could be totally wrong
I expect you're wrong.

The granularity bit says something about the size of the granule, which is the smallest unit the limit can be expressed in. It makes the limit a count of pages as opposed to a count of bytes (the granule being a 4 KiB page). That allows you to express a 32-bit limit in only 20 bits of limit. You can also only map whole granules (pages), so if you set the limit to 0, the segment covers 0 to 0xFFF.

The settings are, as far as that goes, correct. They should create 0-based segments with a limit of 0xFFFFF. I'm wondering though, why are you using 8 F's for the limit? Did you swap the limit and the base around? The base is 32 bits, the limit is 20 bits (of which 4 are buried in the byte under discussion).
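
The effect of the granularity bit described above can be written down as a small calculation. This is a sketch with a hypothetical function name, showing how the 20-bit limit field expands to the highest valid byte offset.

```c
#include <stdint.h>

/* Hypothetical helper: highest valid offset for a segment, given the
 * 20-bit limit field and the G bit. With G set the limit counts 4 KiB
 * pages, and the low 12 bits of the effective limit are all ones. */
static inline uint32_t effective_limit(uint32_t limit20, int granular)
{
    limit20 &= 0xFFFFFu;
    return granular ? ((limit20 << 12) | 0xFFFu) : limit20;
}
```

So a limit field of 0 with G set covers offsets 0 to 0xFFF, and 0xFFFFF with G set covers the full 4 GiB. For a 104-byte TSS, a byte-granular limit of 103 covers exactly the structure, while the same field with G set would extend to 0x67FFF.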