Please Help!! Software Multitasking
Posted: Mon Oct 29, 2007 11:37 am
by PinkyNoBrain
Hi, I have a working protected mode kernel with a memory manager (it uses paging). I have been trying to implement software multitasking.
I create a TSS for the system to use, install it into my GDT, and load it with ltr.
I set the PIC to generate interrupts at regular intervals and have a handler that catches them, pushes registers, saves the stack position, checks to see if it should switch task, loads cr3, loads the stack position, pops registers, then irets.
I create two processes, fill in some generic details for both, and create a page directory for each.
I switch into each process's address space and set up its stack to mimic an interrupt followed by a pushad instruction.
At this point I can switch to either process's address space, set esp to its stack pointer, do popad followed by iret, and that process will start running. That works fine.
The problem occurs when I try to switch processes after a timer event. It works when no process switch occurs, i.e. when I only decrement the timeslice. But as soon as I try to switch stack and address space to switch task, I get a General Protection Fault (0) exception.
I run my OS in Bochs and am happy to post whatever code or output people would find helpful. I would really appreciate any help you could give.
(quick update) It is clearly something wrong with my switching of address spaces. Any attempt to write cr3 during my interrupt handler causes the GP. It should be noted though that both address spaces are valid; I can switch to them in other parts of my kernel just fine.
Posted: Mon Oct 29, 2007 11:58 am
by JamesM
I think it's time to put your debugging shoes on. The fact that you have updated so quickly shows me you are currently debugging this problem.
1. I will help tomorrow if it is not sorted.
2. Please read the sticky post about 'first time posters', especially regarding topic titles such as 'PLEASE HELP'.
JamesM
Posted: Mon Oct 29, 2007 12:12 pm
by TomTom
Are you sure that you update cr3 with a physical address, not a virtual one?
Posted: Wed Oct 31, 2007 11:54 am
by PinkyNoBrain
Hey, thanks for the speedy replies, this problem is really annoying. I think I'm going to have to post some of my code. Below is my first (and simplest) attempt to get the context switching to work. I have tried modifying it in several ways which produced different results, but since none of them worked I think it's better I just give you the simplest version.
This is a shortened copy of my code that sets up a new process. It creates the address space for the process, switches into it, sets up the stack to mimic an interrupted process, updates fields in the process's structure, then switches back to the kernel address space.
Code:
void processInitializeProcess(unsigned long entrypoint, char* name, ProcessStructure* p)
{
    p->ID = 0;
    p->ProcessName = name;
    p->Priority = 5;
    p->State = 0;
    p->PageDirectoryAddress = createPageDirectory();
    unsigned long oldpd = read_cr3();
    write_cr3(p->PageDirectoryAddress);
    unsigned long* stacksetup = (unsigned long*)((1022 << 22) + (1022 << 12));
    // mimic interrupt stack frame
    stacksetup--;
    *stacksetup = 0x0202;      // eflags
    stacksetup--;
    *stacksetup = 0x08;        // cs
    stacksetup--;
    *stacksetup = entrypoint;  // eip
    // mimics pushad (eight general purpose registers)
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    p->StackPointer = (unsigned long) stacksetup;
    write_cr3(oldpd);
    return;
}
I do this twice to create two processes, one called testprocess and the other called idleprocess; both just sit in a loop printing to the screen. I can set either one to be the CurrentProcess, then call the following, and it will jump into it and start running. No problems so far.
Code:
void StartMultiTasking()
{
    CurrentProcessPageDirectory = CurrentProcess->PageDirectoryAddress;
    write_cr3(CurrentProcess->PageDirectoryAddress);
    asm ("movl %0, %%esp\n\t" : : "r"(CurrentProcess->StackPointer));
    asm("popa\n\t");
    asm("iret\n\t");
}
Now I need to do a context switch. The following asm is called directly upon a timer interrupt.
Code:
timer_stub:
    cli
    pushad
    mov eax, esp
    mov [CurrentProcessStack], eax
    mov eax, cr3
    mov [CurrentProcessPageDirectory], eax
    call TaskHandler
    mov eax, [CurrentProcessPageDirectory]
    mov cr3, eax
    mov esp, [CurrentProcessStack]
    mov al, 0x20
    out 0x20, al
    popad
    sti
    iret
And this is the task handler code called by the above asm.
Code:
ProcessStructure* CurrentProcess;
unsigned int TimeSlice = 55;
static unsigned long CurrentStack;
static unsigned long CurrentPD;
unsigned long* CurrentProcessPageDirectory = &CurrentPD;
unsigned long* CurrentProcessStack = &CurrentStack;

void TaskHandler()
{
    TimeSlice--;
    //VGADriver::print("TICK");
    if(TimeSlice > 0) return;
    //VGADriver::print("TOCK");
    CurrentProcess->PageDirectoryAddress = CurrentPD;
    CurrentProcess->UStackPointer = CurrentStack;
    CurrentProcess = CurrentProcess->nextProcess;
    TimeSlice = 55;
    //Check to see if current process is set to be the next process(which it is)
    VGADriver::print(CurrentProcess->ProcessName);
    CurrentPD = CurrentProcess->PageDirectoryAddress;
    CurrentStack = CurrentProcess->UStackPointer;
    return;
}
Very embarrassingly, the task switching code does nothing. It prints TICK TOCK and the process names as it should, but it never actually switches task. Various attempts have been made to fix it, some of which generated a General Protection Fault (0) exception. I expect the problem is in the asm, because that's what I'm worst at. Please could somebody unscramble this mess?
Posted: Wed Oct 31, 2007 3:14 pm
by Combuster
PinkyNoBrain wrote:I expect the problem is in the asm, because that's what I'm worst at
It's your C code I'm concerned about.
You declare a static variable (which means the scope is limited to the compilation unit), then add a global reference to that. This is just bad coding practice. The actual error is however a few lines further:
Code:
unsigned long* CurrentProcessPageDirectory = & CurrentPD;
unsigned long* CurrentProcessStack = &CurrentStack;
This is a pointer to a long. Strangely enough, a page directory isn't a number but a 4k table, so more appropriately this would be a void * pointer.
The bug is hidden in the fact that your ASM changes CurrentProcessPageDirectory and CurrentProcessStack, while the C code changes CurrentPD and CurrentStack. Since these are distinct variables, neither influences the other.
After the first timer call, CurrentProcessPageDirectory will simply no longer point to CurrentPD.
What you need to do is use CurrentStack and CurrentPD in both C and ASM, and declare them as unsigned long CurrentXX;
Posted: Thu Nov 01, 2007 8:27 am
by PinkyNoBrain
Thanks for the reply, Combuster. I have taken what you said on board and removed the static keyword from both CurrentStack and CurrentPD for the sake of good practice. However, I'm not so sure about the following:
Combuster wrote:This is a pointer to a long - strangely enough a page directory isn't a number, but a 4k table. more appropriately this would be a void * pointer.
You are certainly correct in your assertion that a PD is a table, not a number. However, the CurrentPD variable is only intended to store the physical address of the PD in the underlying memory, which is just a number, so that it can be loaded into cr3.
Combuster wrote:What you need to do is use CurrentStack and CurrentPD in both C and ASM, and declare them as unsigned long CurrentXX;
Alas, that is what I was intending to do, but it doesn't work because mov doesn't allow it. Mov only accepts an immediate value, a register or a memory address, so any attempt to use CurrentPD or CurrentStack directly yields an invalid opcode operand error on compilation. I therefore created the two pointers CurrentProcessPageDirectory and CurrentProcessStack to point to them. The asm code isn't meant to alter the values of the pointers; it's meant to alter the value of what they're pointing at.
I know I'm no expert; it could well be that my code is doing something other than intended.
Posted: Thu Nov 01, 2007 12:20 pm
by Combuster
PinkyNoBrain wrote:so any attempt to use CurrentPD or CurrentStack directly yields an invalid opcode operand error on compilation.
Bull. CurrentPD etc. represents an immediate value. Therefore this piece of code compiles as expected. (I verified that)
Code:
extern CurrentStack
extern CurrentPD
extern TaskHandler

timer_stub:
    cli
    pushad
    mov eax, esp
    mov [CurrentStack], eax
    mov eax, cr3
    mov [CurrentPD], eax
    call TaskHandler
    mov eax, [CurrentPD]
    mov cr3, eax
    mov esp, [CurrentStack]
    mov al, 0x20
    out 0x20, al
    popad
    sti
    iret
Posted: Thu Nov 01, 2007 3:34 pm
by PinkyNoBrain
OK Combuster, I take it all back: the above code does compile fine, I must have done something stupid when I tried it before. Unfortunately my OS still doesn't work. I now get a General Protection Fault when the task switch should occur, and the GP definitely occurs on the IRET. I have been reading the manuals trying to work out why, but I can't figure out why it works fine if I just jump straight into it from the kernel but doesn't work when I try to switch to it from another task. Could it perhaps be to do with updating or resetting fields in the TSS? Or is my stack building code wrong for a task switch once multitasking has started? (All the processes run at the same privilege level, so I don't know why this would be.) I'd appreciate any clues anyone could give.
[Edited] AHA, breakthrough: the last thing in my Bochs output file is "IRET: return CS selector null". OK, so now I know what the problem is, but I still can't see how to fix it. It must be that my stack is getting out of alignment somehow, but what's causing it, given that it works straight from the kernel?
Posted: Thu Nov 01, 2007 5:10 pm
by Combuster
Given your previous code, it appears that you do not update tss.esp0 as part of the task switch. While this isn't necessary for kernel threads, once in userland it must match the top of the stack or you will end up overwriting memory you used for other purposes.
Posted: Fri Nov 02, 2007 2:01 pm
by JAAman
Combuster wrote:Given your previous code, it appears that you do not update tss.esp0 as part of the task switch. While this isn't necessary for kernel threads, once in userland it must match the top of the stack or you will end up overwriting memory you used for other purposes.
Actually, it isn't necessary there either, just common.
It really depends on your design. I know at least 2 of us use designs which do not require altering TSS.ESP0 at all.
However, for most people, you are correct: ESP0 will need to be changed on each task switch.
Posted: Sat Nov 03, 2007 10:48 am
by PinkyNoBrain
OK guys, thanks for all the help. I've almost got it all working now. It was a stupid bug in my C code that I'm too embarrassed to point out, but kernel thread task switching now works perfectly. I am however still getting a GP fault when trying to start a user mode thread. I did what was suggested above and updated the TSS on task switches, and I also know you need to have the esp and ss on the stack for a switch to userspace. I am slightly unclear as to which way round they go: should it be IP CS EFLAGS ESP SS or IP CS EFLAGS SS ESP? Two tutorials I've read disagree and the Intel manual is a little unclear, so I just want to be sure. Also, could another cause of the GP fault when I iret into the task for the first time be that the task is statically linked into the kernel, and I map the kernel into my page directory with system rather than user access flags? Would this cause a GP fault or a page fault? Just to be clear on the changes, I'll repost my modified code.
Initializing the process's stack is now:
Code:
// Used when creating a process; replicates the stack that would have resulted from a timer interrupt.
void processInitializeProcess(unsigned long entrypoint, char* name, ProcessStructure* p)
{
    p->ID = 0;
    p->ProcessName = name;
    p->Priority = 5;
    p->State = 0;
    p->PageDirectoryAddress = createPageDirectory();
    unsigned long oldpd = read_cr3();
    write_cr3(p->PageDirectoryAddress);
    unsigned long* stacksetup = (unsigned long*)((1022 << 22) + (1022 << 12));
    // mimic interrupt stack frame
    stacksetup--;
    *stacksetup = 0x20; // ss
    stacksetup--;
    *stacksetup = ((1022 << 22) + (510 << 12)); // esp
    stacksetup--;
    *stacksetup = 0x0202; // eflags
    stacksetup--;
    *stacksetup = 0x18; // a user level CS, 0x08 was ring 0, put it back in and the whole thing works
    stacksetup--;
    *stacksetup = entrypoint; // ip
    // mimics pushad
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    // push segments
    stacksetup--;
    *stacksetup = 0x20;
    stacksetup--;
    *stacksetup = 0x20;
    stacksetup--;
    *stacksetup = 0x20;
    stacksetup--;
    *stacksetup = 0x20;
    p->KStackPointer = (unsigned long) stacksetup;
    write_cr3(oldpd);
    return;
}
Starting the multitasking (where I get the error):
Code:
void StartMultiTasking()
{
    write_cr3(CurrentProcess->PageDirectoryAddress);
    asm ("movl %0, %%esp\n\t" : : "r"(CurrentProcess->KStackPointer));
    asm("pop %ds\n\t");
    asm("pop %es\n\t");
    asm("pop %fs\n\t");
    asm("pop %gs\n\t");
    asm("popa\n\t");
    asm("iret\n\t");
}
Modified TaskHandler (we never get here):
Code:
void TaskHandler()
{
    TimeSlice--;
    //VGADriver::print("Tick");
    if(TimeSlice > 0) return;
    //VGADriver::print("Tock");
    CurrentProcess->PageDirectoryAddress = CurrentPD;
    CurrentProcess->KStackPointer = CurrentStack;
    CurrentProcess = CurrentProcess->nextProcess;
    TimeSlice = 30;
    SysTSS.esp0 = CurrentProcess->KStackPointer;
    SysTSS.cr3 = CurrentProcess->PageDirectoryAddress;
    CurrentPD = CurrentProcess->PageDirectoryAddress;
    CurrentStack = CurrentProcess->KStackPointer;
    return;
}
Posted: Sat Nov 03, 2007 2:51 pm
by frank
Well I'll just post the code I use to set up the kernel stack for my user mode code.
Code:
*esp-- = 32 + 3; // user ss
*esp-- = 0x50000000; // user esp
*esp-- = 0x202; // user eflags
*esp-- = 24 + 3; // user cs
*esp = 0x40000000; // user eip
The stack will be pointed at esp when I issue the iret.
After looking at your code I think I see the problem. You are pushing the wrong cs and ss on the stack. The value you are pushing has the lower 2 bits clear. See the values I am pushing on the stack? Try setting the lower 2 bits and see if that works.
Posted: Sat Nov 03, 2007 6:58 pm
by iammisc
Segment selectors in protected mode are offsets into the GDT, and since each descriptor is 8 bytes they are always aligned on an 8-byte boundary. That means the offset's low 3 bits are always 0, so they are reused to carry extra information, including the privilege level. The descriptor at 0x20 in the GDT might have privilege level 3, but the selector 0x20 still asks for ring 0. Because that is the same privilege level as the kernel, esp and ss are not popped off: for ss and esp to be popped, cs must be at a different privilege level.
The low 2 bits of the selector are the requested privilege level (which becomes the current privilege level once the selector is loaded into cs), and the 3rd bit says which table the selector is in (the GDT or LDT). Therefore, for your segment selectors to truly be in userspace, they should be 0x1b for cs and 0x23 for ss and the data segment registers. This will change the CPL to 3.
And the order is eip cs eflags esp ss
Posted: Sun Nov 04, 2007 12:15 pm
by PinkyNoBrain
Good advice guys, I must have missed that bit in the manuals. I have changed cs to 0x1b and ss to 0x23 as suggested. I am still getting the GP fault though; it's very annoying.
Posted: Sun Nov 04, 2007 2:04 pm
by AJ
Hi,
Can we see the Bochs register dump following your GP fault? I think that would help a lot.
Cheers,
Adam