GDT problems while switching to user-mode
Posted: Tue Mar 22, 2016 4:56 pm
by heat
Hello OSDev community,
I am here to ask for help again. I haven't posted here for a few months because I could always find an answer without bothering you guys. But now the time has come.
I am trying to implement user-space threads in my kernel (running at CPL=3). Basically, I'm trying to create a task within my scheduler that runs with ring 3 segments and a stack at user-space addresses (the scheduler already works with kernel-mode threads). But it breaks as soon as I call sched_create_task (the function that spawns a new task inside the scheduler).
Also, in case you are wondering, a task is the same thing as a thread.
My problem is that I get a GPF when the scheduler is switching tasks, with the error code indicating that the segment 0x20 doesn't exist in my GDT.
This is my GDT (as reported by Bochs):
Code: Select all
Global Descriptor Table (base=0x00000000c0111aa0, limit=47):
GDT[0x00]=??? descriptor hi=0x00000000, lo=0x00000000
GDT[0x01]=Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, Non-Conforming, Accessed, 32-bit
GDT[0x02]=Data segment, base=0x00000000, limit=0xffffffff, Read/Write, Accessed
GDT[0x03]=Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, Non-Conforming, Accessed, 32-bit
GDT[0x04]=Data segment, base=0x00000000, limit=0xffffffff, Read/Write
GDT[0x05]=32-Bit TSS (Busy) at 0xc0111a20, length 0x11a88
Bochs says that it's not a valid code segment, as indicated by the following line from its output:
Code: Select all
00175322448e[CPU0 ] check_cs(0x0023): not a valid code segment !
I'm leaving links to relevant parts of my code on github:
https://github.com/PedroFalcato/Spartix ... l/kernel.c - line 195
https://github.com/PedroFalcato/Spartix ... cheduler.c
I think I supplied enough information, but if you need more, just ask.
Thanks for all the help,
TheRussianFail
Re: GDT problems while switching to user-mode
Posted: Tue Mar 22, 2016 5:58 pm
by MDenham
Well, yeah, 0x23 isn't a valid code segment. It is a valid data segment, however... (You want 0x1b for the code segment.)
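For readers following along, the numbers in this reply come from how a selector is laid out: the GDT index sits in bits 15..3, the table indicator in bit 2, and the RPL in bits 1..0. Below is a minimal sketch of that arithmetic; the names `selector_fields`, `decode_selector` and `make_gdt_selector` are hypothetical helpers made up for illustration and are not taken from the kernel under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 segment selector. */
typedef struct {
    uint16_t index; /* entry number in the GDT or LDT (bits 15..3) */
    uint8_t  ti;    /* table indicator: 0 = GDT, 1 = LDT (bit 2)   */
    uint8_t  rpl;   /* requested privilege level (bits 1..0)       */
} selector_fields;

/* Split a raw selector value into its three fields. */
static selector_fields decode_selector(uint16_t sel)
{
    selector_fields f;
    f.index = (uint16_t)(sel >> 3);
    f.ti    = (uint8_t)((sel >> 2) & 1u);
    f.rpl   = (uint8_t)(sel & 3u);
    return f;
}

/* Build a GDT selector (TI = 0) from an entry index and an RPL. */
static uint16_t make_gdt_selector(uint16_t index, uint8_t rpl)
{
    assert(rpl <= 3);
    return (uint16_t)((index << 3) | rpl);
}
```

With the GDT listed above, 0x23 decodes to index 4 with RPL 3 (the second data segment), while index 3 with RPL 3 gives 0x1b (the second code segment).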
Re: GDT problems while switching to user-mode
Posted: Tue Mar 22, 2016 7:23 pm
by heat
MDenham wrote:Well, yeah, 0x23 isn't a valid code segment. It is a valid data segment, however... (You want 0x1b for the code segment.)
Well, you were absolutely right. That typo was embarrassing. The GPF is still there, though, and now Bochs is complaining about this:
Code: Select all
00176635879e[CPU0 ] load_seg_reg(SS): rpl != CPL
I went searching for a bit, and I think the problem is related to what is on the stack, because according to this, when iret executes and switches to a less privileged ring, it requires SS and ESP to be on the stack?
If so, why isn't this working?
Code: Select all
void sched_create_task(task_t * task, void (*thread) (), uint32_t cs,
                       uint32_t ss)
{
    unsigned int *stack = NULL;
    task->regs.esp = (uint32_t) valloc(2) + 8192;
    if (!task->regs.esp)
        abort();
    stack = (unsigned int *) task->regs.esp;
    unsigned int *original_stack = stack;
    // Push the return address
    *--stack = (unsigned int) &_exit_task;
    *--stack = ss;
    *--stack = (unsigned int)original_stack - 4;
    // First, this stuff is pushed by the processor
    *--stack = 0x0202; // This is EFLAGS
    *--stack = cs; // This is CS, our code segment
    *--stack = (unsigned int) thread; // This is EIP
    //... more irrelevant context setting
Best Regards,
TheRussianFail
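On the question of what iret expects: when it returns to a less privileged ring it pops EIP, CS and EFLAGS, and then also ESP and SS, so all five values have to be on the kernel stack in that order (EIP at the lowest address). The sketch below only illustrates that layout; `push_user_iret_frame` is a hypothetical helper and not part of the scheduler shown in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper: write the five-word frame that iret consumes when it
 * returns from ring 0 to ring 3. `sp` points just past the top of the kernel
 * stack being prepared; the returned pointer is the new top of stack.
 */
static uint32_t *push_user_iret_frame(uint32_t *sp, uint32_t eip, uint32_t cs,
                                      uint32_t eflags, uint32_t user_esp,
                                      uint32_t ss)
{
    *--sp = ss;       /* popped last: user stack segment */
    *--sp = user_esp; /* user stack pointer              */
    *--sp = eflags;   /* EFLAGS (0x202 has IF set)       */
    *--sp = cs;       /* user code segment selector      */
    *--sp = eip;      /* popped first: entry point       */
    return sp;
}
```

The snippet in the post pushes the same five values in the same order, so the frame layout by itself does not look like the cause of the fault.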
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 4:28 am
by MDenham
The solution is given later on in that thread: you need to have your code descriptors marked as conforming instead.
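Some background on why the conforming bit can matter here, as a simplified model of the code-segment checks that iret performs according to my reading of the Intel SDM. The function name `iret_cs_check_passes` is hypothetical, and the sketch assumes (the Bochs listing above does not show DPLs) that the descriptor behind 0x1b has DPL 0.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model of the checks on the return code segment during iret.
 *   cpl        - privilege level the iret executes at
 *   sel_rpl    - RPL of the CS selector popped from the stack
 *   desc_dpl   - DPL of the descriptor that selector points to
 *   conforming - whether that descriptor is a conforming code segment
 */
static bool iret_cs_check_passes(uint8_t cpl, uint8_t sel_rpl,
                                 uint8_t desc_dpl, bool conforming)
{
    if (sel_rpl < cpl)
        return false;               /* iret cannot raise privilege */
    if (conforming)
        return desc_dpl <= sel_rpl; /* conforming: DPL must not exceed RPL */
    return desc_dpl == sel_rpl;     /* non-conforming: DPL must equal RPL  */
}
```

Under that assumption, a non-conforming DPL 0 descriptor reached through an RPL 3 selector is rejected while a conforming one is accepted. The more common arrangement is to leave the user code and data descriptors non-conforming and give them DPL 3.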
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 6:58 am
by heat
Thanks, now it's switching privilege rings.
Now my problem is that it's triggering a #PF, and it's not even calling the interrupt handlers; it just triple faults. Bochs says it's about to execute the user-space test function.
Code: Select all
void test()
{
    asm volatile("mov $0xDEADBEEF,%eax"); /* <-- Bochs says this is the next instruction */
    while(1);
}
CR2 points to an address inside the stack. I've made sure that every stack is user-accessible (just for testing, though). I've checked the disassembly, and the function is not even accessing the stack.
Bochs' log:
Code: Select all
00018233839i[BIOS ] Booting from 07c0:0000
00103871201i[BXVGA ] VBE set bpp (32)
00103871223i[BXVGA ] VBE set xres (640)
00103871262i[BXVGA ] VBE set yres (480)
00103871300i[BXVGA ] VBE enabling x 640, y 480, bpp 32, 1228800 bytes visible
00154443007i[BXVGA ] VBE disabling
00154671704i[BXVGA ] VBE set bpp (32)
00154671726i[BXVGA ] VBE set xres (1024)
00154671765i[BXVGA ] VBE set yres (768)
00154671803i[BXVGA ] VBE enabling x 1024, y 768, bpp 32, 3145728 bytes visible
00170339394i[SER ] com1: FIFO enabled
00171834141i[CPU0 ] CPU is in protected mode (active)
00171834141i[CPU0 ] CS.mode = 32 bit
00171834141i[CPU0 ] SS.mode = 32 bit
00171834141i[CPU0 ] EFER = 0x00000000
00171834141i[CPU0 ] | EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000000
00171834141i[CPU0 ] | ESP=c0f03ffc EBP=00000000 ESI=00000000 EDI=00000000
00171834141i[CPU0 ] | IOPL=0 id vip vif ac vm RF nt of df IF tf sf zf af pf cf
00171834141i[CPU0 ] | SEG sltr(index|ti|rpl) base limit G D
00171834141i[CPU0 ] | CS:001b( 0003| 0| 3) 00000000 ffffffff 1 1
00171834141i[CPU0 ] | DS:0023( 0004| 0| 3) 00000000 ffffffff 1 1
00171834141i[CPU0 ] | SS:0023( 0004| 0| 3) 00000000 ffffffff 1 1
00171834141i[CPU0 ] | ES:0023( 0004| 0| 3) 00000000 ffffffff 1 1
00171834141i[CPU0 ] | FS:0023( 0004| 0| 3) 00000000 ffffffff 1 1
00171834141i[CPU0 ] | GS:0023( 0004| 0| 3) 00000000 ffffffff 1 1
00171834141i[CPU0 ] | EIP=c0101f90 (c0101f90)
00171834141i[CPU0 ] | CR0=0xe0000013 CR2=0xc0f03ff8
00171834141i[CPU0 ] | CR3=0x003f2000 CR4=0x00000610
(0).[171834141] [0x000000101f90] 001b:00000000c0101f90 (unk. ctxt): mov eax, 0xdeadbeef ; b8efbeadde
00171834141p[CPU0 ] >>PANIC<< exception(): 3rd (14) exception with no resolution
00171834141e[CPU0 ] WARNING: Any simulation after this point is completely bogus !
Next at t=171834142
(0) [0x000000101f90] 001b:00000000c0101f90 (unk. ctxt): mov eax, 0xdeadbeef ; b8efbeadde
EDIT: All the memory that is mapped ( supplied by Bochs ) :
Code: Select all
<bochs:3> info tab
cr3: 0x0000003f2000
0x00001000-0x003fffff -> 0x000000001000-0x0000003fffff
0x00c00000-0x00ffffff -> 0x000000c00000-0x000000ffffff
0xc0000000-0xc0bfffff -> 0x000000000000-0x000000bfffff
0xc0e00000-0xc0f03fff -> 0x000001001000-0x000001104fff
0xe0000000-0xe03fffff -> 0x0000e0000000-0x0000e03fffff
0xffc00000-0xffc00fff -> 0x0000003f0000-0x0000003f0fff
0xffc03000-0xffc03fff -> 0x000000c01000-0x000000c01fff
0xfff00000-0xfff00fff -> 0x0000003f1000-0x0000003f1fff
0xfff01000-0xfff01fff -> 0x000000400000-0x000000400fff
0xfff02000-0xfff02fff -> 0x000000800000-0x000000800fff
0xfff03000-0xfff03fff -> 0x000000c00000-0x000000c00fff
0xfff80000-0xfff80fff -> 0x0000003f3000-0x0000003f3fff
0xfffff000-0xffffffff -> 0x0000003f2000-0x0000003f2fff
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 7:46 am
by sebihepp
Did you set up a Local Descriptor Table and load it? Did you set up your kernel stack?
I think there is an interrupt happening just before the instruction "movl $0xDEADBEEF, %eax" is executed. That's why it accesses the stack. Also, the address in CR2 is not mapped.
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 7:56 am
by MDenham
sebihepp wrote:Did you set up a Local Descriptor Table and load it? Did you set up your kernel stack?
I think there is an interrupt happening just before the instruction "movl $0xDEADBEEF, %eax" is executed. That's why it accesses the stack. Also, the address in CR2 is not mapped.
The address in CR2 is mapped. It may, however, be marked as not present.
Also, a disassembly of the instructions prior to the problem would be useful; the function prologue will access the stack as well.
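The "info tab" listing in the previous post shows translations but not the flag bits, so an address can appear there and still be unusable from ring 3. With 32-bit non-PAE paging and 4 KiB pages, a user-mode access needs both the Present bit and the User/Supervisor bit set in the page-directory entry and in the page-table entry. A minimal sketch of that check follows; the macro names and `user_can_read` are hypothetical, and the entry values are meant to be read out of the page tables by whatever means the kernel provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low flag bits shared by 32-bit page-directory and page-table entries. */
#define PG_PRESENT 0x1u /* bit 0: entry is present              */
#define PG_WRITE   0x2u /* bit 1: writes allowed                */
#define PG_USER    0x4u /* bit 2: accessible from CPL 3         */

/*
 * Returns true if a ring 3 read through this PDE/PTE pair is permitted.
 * Both levels must be present and both must have the user bit set.
 */
static bool user_can_read(uint32_t pde, uint32_t pte)
{
    const uint32_t need = PG_PRESENT | PG_USER;
    return (pde & need) == need && (pte & need) == need;
}
```

The same check applies to the page holding the code at EIP, not only to the stack page that CR2 points at.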
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 9:11 am
by heat
MDenham wrote:
The address in CR2 is mapped. It may, however, be marked as not present.
Also, a disassembly of the instructions prior to the problem would be useful; the function prologue will access the stack as well.
No, I don't think there is an interrupt happening before that instruction, as I made sure to disable them after seeing it triple fault:
Code: Select all
unsigned int sched_switch_task(unsigned int old_esp)
{
    if (current_task != NULL) {
        // Were we even running a task?
        current_task->regs.esp = old_esp; // Save the new esp for the thread
        current_task = current_task->next;
        if (current_task->is_kernel == false) {
            asm volatile("cli");
        }
Also, if I remember correctly (I'm not at my development machine right now), the test() function had no prologue and no epilogue; it had only the mov instruction and the loop. I will confirm this later.
Finally, I don't think the page is marked non-present, as other threads ( threads running in CPL0 ) ran before it, and could access the stack perfectly ( again, will confirm this in a few minutes ).
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 9:44 am
by MDenham
Oh! I think I know what the problem is. Check your NX bits on page table entries.
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 9:50 am
by iansjack
When you call the function it accesses the stack to save the return address.
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 10:06 am
by heat
MDenham wrote:Oh! I think I know what the problem is. Check your NX bits on page table entries.
I'm using 32-bit paging without PAE, I don't have NX.
iansjack wrote:When you call the function it accesses the stack to save the return address.
I'm not calling any functions, I'm iret'ing to the function. The return address of the thread is already stored in the stack.
But what I think is the most strange is that the #PF handler isn't called during the page fault, although the kernel is still mapped. Instead it just triple-faults.
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 10:23 am
by BrightLight
TheRussianFail wrote:But what I think is the most strange is that the #PF handler isn't called during the page fault, although the kernel is still mapped. Instead it just triple-faults.
Do you have a TSS?
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 10:31 am
by heat
omarrx024 wrote:TheRussianFail wrote:But what I think is the most strange is that the #PF handler isn't called during the page fault, although the kernel is still mapped. Instead it just triple-faults.
Do you have a TSS?
Yes I do, but Bochs says it's busy ( I'm not sure it should be ).
Code: Select all
<bochs:2> info gdt
Global Descriptor Table (base=0x00000000c0111aa0, limit=47):
GDT[0x00]=??? descriptor hi=0x00000000, lo=0x00000000
GDT[0x01]=Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, Conforming, Accessed, 32-bit
GDT[0x02]=Data segment, base=0x00000000, limit=0xffffffff, Read/Write, Accessed
GDT[0x03]=Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, Conforming, Accessed, 32-bit
GDT[0x04]=Data segment, base=0x00000000, limit=0xffffffff, Read/Write, Accessed
GDT[0x05]=32-Bit TSS (Busy) at 0xc0111a20, length 0x11a88
You can list individual entries with 'info gdt [NUM]' or groups with 'info gdt [NUM] [NUM]'
TSS code:
Code: Select all
static void install_tss()
{
    uint32_t base = (uint32_t) &tss_entry;
    uint32_t limit = base + sizeof(tss_entry);
    create_descriptor(5, base, limit, 0xE9, 0);
    memset(&tss_entry, 0, sizeof(tss_entry));
    tss_entry.ss0 = 0x10;
    tss_entry.esp0 = (uint32_t) 0xC03FFF00;
    tss_entry.cs = 0x0b;
    tss_entry.iomap_base = sizeof(tss_entry_t);
    tss_entry.ss = tss_entry.ds = tss_entry.es = tss_entry.fs =
        tss_entry.gs = 0x13;
    // TSS is flushed in a different section of the code
}
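One thing that stands out in the code above is the limit: a descriptor limit is an offset relative to the base (the size minus one), not an absolute end address. The "length 0x11a88" that Bochs prints seems consistent with base + 104 truncated to the 20-bit limit field. Below is a hedged sketch of how a TSS descriptor could be encoded with a relative limit; `gdt_descriptor`, `encode_descriptor` and `tss_limit` are hypothetical names and do not describe what the poster's create_descriptor does internally.

```c
#include <stdint.h>

/* The two 32-bit halves of an 8-byte GDT descriptor. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} gdt_descriptor;

/* Limit for a segment of `size` bytes: the offset of its last valid byte. */
static uint32_t tss_limit(uint32_t size)
{
    return size - 1u;
}

/*
 * Pack base, 20-bit limit, access byte and 4-bit flags nibble into the
 * descriptor layout used by the GDT.
 */
static gdt_descriptor encode_descriptor(uint32_t base, uint32_t limit,
                                        uint8_t access, uint8_t flags)
{
    gdt_descriptor d;
    d.lo = ((base & 0xFFFFu) << 16)            /* base 15..0   */
         | (limit & 0xFFFFu);                  /* limit 15..0  */
    d.hi = (base & 0xFF000000u)                /* base 31..24  */
         | ((uint32_t)(flags & 0x0Fu) << 20)   /* G, D/B, L, AVL */
         | (limit & 0x000F0000u)               /* limit 19..16 */
         | ((uint32_t)access << 8)             /* access byte  */
         | ((base >> 16) & 0xFFu);             /* base 23..16  */
    return d;
}
```

For a 104-byte TSS at 0xc0111a20 with access byte 0xE9, this yields a limit of 0x67 rather than 0x11a88.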
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 10:35 am
by BrightLight
I'm not so sure that's correct. It took me a long time to get my old 32-bit TSS to work:
Code: Select all
align 32
; tss:
; Task State Segment
tss:
.prev_tss dd 0
.esp0 dd 0
.ss0 dd 0x10 ; kernel stack segment
.esp1 dd 0
.ss1 dd 0
.esp2 dd 0
.ss2 dd 0
.cr3 dd page_directory ; CR3 is the same for kernel and user
.eip dd 0 ; these values are also used for hardware multitasking
.eflags dd 0
.eax dd 0
.ecx dd 0
.edx dd 0
.ebx dd 0
.esp dd 0
.ebp dd 0
.esi dd 0
.edi dd 0
.es dd 0x10 ; kernel data segments
.cs dd 0x08
.ss dd 0x10
.ds dd 0x10
.fs dd 0x10
.gs dd 0x10
.ldt dd 0
.trap dw 0
.iomap_base dw 104 ; prevent user programs from using IN/OUT instructions
Its GDT entry:
Code: Select all
; tss segment 0x38
dw 104
dw tss
db 0
db 11101001b ; access byte, took me a long time to get right ;)
db 0
db 0
BTW, when you load the TSS, do you OR it with 3?
Re: GDT problems while switching to user-mode
Posted: Wed Mar 23, 2016 10:44 am
by heat
omarrx024 wrote:BTW, when you load the TSS, do you OR it with 3?
Is this correct?
Code: Select all
global tss_flush
tss_flush:
    mov ax, 2Bh
    ltr ax
    ret