Please Help!! Software Multitasking
-
- Posts: 22
- Joined: Mon Oct 29, 2007 10:49 am
Hi, I have a working protected mode kernel with a memory manager (uses paging). I have been trying to implement software multitasking.
I create a TSS for the system to use, install it into my GDT, and load it with ltr.
I set the PIC to generate interrupts at regular intervals and have a handler that catches them, pushes registers, saves the stack position, checks to see if it should switch task, loads cr3, loads the stack position, pops registers, then irets.
I create two processes, fill in some generic details for both, and create a page directory for both.
I switch into each process's address space and set up its stack to mimic an interrupt followed by a pushad instruction.
At this point I can switch to either process's address space, set esp to its stack pointer, do popad followed by iret, and that process will start running. That works fine.
The problem occurs when I try to switch processes after a timer event. It works when no process switch occurs, i.e. when I only decrement the timeslice. But as soon as I try to switch stack and address space to switch task, I get a General Protection Fault (0) exception.
I run my OS in Bochs and am happy to post whatever code or output people would find helpful. I would really appreciate any help you could give.
(quick update) It is clearly something wrong with my switching of address spaces: any attempt to write cr3 during my interrupt handler causes the GP. It should be noted, though, that both address spaces are valid; I can switch to them in other parts of my kernel just fine.
I think it's time to put your debugging shoes on. The fact that you have updated so quickly shows me you are currently debugging this problem.
1. I will help tomorrow if it is not sorted.
2. Please read the sticky post about 'first time posters', especially regarding forum topics such as 'PLEASE HELP'.
JamesM
-
- Posts: 22
- Joined: Mon Oct 29, 2007 10:49 am
Hey, thanks for the speedy replies; this problem is really annoying. I think I'm going to have to post some of my code. Below is my first (and simplest) attempt to get the context switching to work. I have tried modifying it in several ways, which produced different results, but since none of them worked I think it's better that I just give you the simplest version.
Now I need to do a context switch: the timer_stub asm below is called directly upon a timer interrupt, and it in turn calls the TaskHandler code below.
Very embarrassingly, the task switching code does nothing. It prints TICK TOCK and the process names as it should, but it never actually switches task. Various attempts have been made to fix it, some of which generated a General Protection Fault (0) exception. I expect the problem is in the asm cos that's what I'm worst at. Please could somebody unscramble this mess?
This is a shortened copy of my code that sets up a new process: it creates the address space for the process, switches into it, sets up the stack to mimic an interrupted process, updates fields in the process's structure, then switches back to the kernel address space.
Code: Select all
void processInitializeProcess(unsigned long entrypoint, char* name, ProcessStructure* p)
{
    p->ID = 0;
    p->ProcessName = name;
    p->Priority = 5;
    p->State = 0;
    p->PageDirectoryAddress = createPageDirectory();
    unsigned long oldpd = read_cr3();
    write_cr3(p->PageDirectoryAddress);
    // Top of the stack inside the new address space
    // (UL keeps the shift out of signed int range problems)
    unsigned long* stacksetup = (unsigned long*)((1022UL << 22) + (1022UL << 12));
    // Mimic interrupt stack frame
    stacksetup--;
    *stacksetup = 0x0202;     // eflags
    stacksetup--;
    *stacksetup = 0x08;       // cs
    stacksetup--;
    *stacksetup = entrypoint; // eip
    // Mimic pushad: eight zeroed general purpose registers
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    p->StackPointer = (unsigned long) stacksetup;
    write_cr3(oldpd);
    return;
}
I do this twice to create two processes, one called testprocess and the other called idleprocess; both just sit in a loop printing to the screen. I can set either one to be the CurrentProcess, then call the following, and it will jump into it and start running. No problems so far.
Code: Select all
void StartMultiTasking()
{
    CurrentProcessPageDirectory = CurrentProcess->PageDirectoryAddress;
    write_cr3(CurrentProcess->PageDirectoryAddress);
    asm ("movl %0, %%esp\n\t" : :"r"(CurrentProcess->StackPointer));
    asm("popa\n\t");
    asm("iret\n\t");
}
Code: Select all
timer_stub:
    cli
    pushad
    mov eax, esp
    mov [CurrentProcessStack], eax
    mov eax, cr3
    mov [CurrentProcessPageDirectory], eax
    call TaskHandler
    mov eax, [CurrentProcessPageDirectory]
    mov cr3, eax
    mov esp, [CurrentProcessStack]
    mov al, 0x20
    out 0x20, al
    popad
    sti
    iret
Code: Select all
ProcessStructure* CurrentProcess;
unsigned int TimeSlice = 55;
static unsigned long CurrentStack;
static unsigned long CurrentPD;
unsigned long* CurrentProcessPageDirectory = & CurrentPD;
unsigned long* CurrentProcessStack = &CurrentStack;
void TaskHandler()
{
    TimeSlice--;
    //VGADriver::print("TICK");
    if(TimeSlice > 0) return;
    //VGADriver::print("TOCK");
    CurrentProcess->PageDirectoryAddress = CurrentPD;
    CurrentProcess->UStackPointer = CurrentStack;
    CurrentProcess = CurrentProcess->nextProcess;
    TimeSlice = 55;
    //Check to see if current process is set to be the next process (which it is)
    VGADriver::print(CurrentProcess->ProcessName);
    CurrentPD = CurrentProcess->PageDirectoryAddress;
    CurrentStack = CurrentProcess->UStackPointer;
    return;
}
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
- Contact:
PinkyNoBrain wrote:I expect the problem is in the asm cos that's what I'm worst at
It's your C code I'm concerned about.
You declare a static variable (which means the scope is limited to the compilation unit), then add a global reference to that. This is just bad coding practice. The actual error is however a few lines further:
Code: Select all
unsigned long* CurrentProcessPageDirectory = & CurrentPD;
unsigned long* CurrentProcessStack = &CurrentStack;
The bug is hidden in the fact that your ASM changes CurrentProcessPageDirectory and CurrentProcessStack, while the C code changes CurrentPD and CurrentStack. Since these are distinct variables, neither influences the other. CurrentProcessPageDirectory will simply no longer point to CurrentPD after the first timer call.
What you need to do is use CurrentStack and CurrentPD in both C and ASM, and declare them as unsigned long CurrentXX;
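To make the distinction concrete, here is a minimal user-space sketch in C. It assumes the NASM meaning of `mov [Label], eax`, which stores into the memory at Label itself. The two function names are made up for the illustration, and `uintptr_t` is used only so the sketch also runs on a 64-bit host.

```c
#include <stdint.h>

/* Same shape as the globals in the thread. */
uintptr_t CurrentStack;                          /* read and written by the C handler */
uintptr_t *CurrentProcessStack = &CurrentStack;  /* the extra pointer variable */

/* Hypothetical stand-in for `mov [CurrentProcessStack], eax`:
   the store lands in the pointer variable itself, so the pointer
   is overwritten and CurrentStack is left untouched. */
void store_like_original_stub(uintptr_t esp_value)
{
    CurrentProcessStack = (uintptr_t *)esp_value;
}

/* Hypothetical stand-in for `mov [CurrentStack], eax`:
   the store lands in the variable the C handler actually uses. */
void store_like_fixed_stub(uintptr_t esp_value)
{
    CurrentStack = esp_value;
}
```

After `store_like_original_stub(0x2000)` the C side still sees whatever was in CurrentStack before, which matches the symptom of a task switch that never happens.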
-
- Posts: 22
- Joined: Mon Oct 29, 2007 10:49 am
Thanks for the reply, Combuster. I have taken what you said on board and removed the static keyword for both CurrentStack and CurrentPD for the sake of good practice. However, I'm not so sure about the following:
Combuster wrote:This is a pointer to a long - strangely enough a page directory isn't a number, but a 4k table. more appropriately this would be a void * pointer.
You are certainly correct in your assertion that a PD is a table, not a number. However, the CurrentPD variable is only intended to store the physical address of the PD in the underlying memory, which is just a number, so that it can be loaded into cr3.
Combuster wrote:What you need to do is use CurrentStack and CurrentPD in both C and ASM, and declare them as unsigned long CurrentXX;
Alas, that is what I was intending to do. However, it doesn't work because mov doesn't allow it. mov only accepts an immediate value, a register or a memory address, so any attempt to use CurrentPD or CurrentStack directly yields an invalid opcode operand error on compilation. I therefore created the two pointers CurrentProcessPageDirectory and CurrentProcessStack to point to them. The asm code isn't meant to alter the values of the pointers; it's meant to alter the value of what they're pointing at.
I know I'm no expert; it could well be that my code is doing something other than intended.
- Combuster
PinkyNoBrain wrote:so any attempt to use CurrentPD or CurrentStack directly yields an invalid opcode operand error on compilation.
Bull. CurrentPD etc. represents an immediate value. Therefore this piece of code compiles as expected (I verified that).
Code: Select all
extern CurrentStack
extern CurrentPD
extern TaskHandler
timer_stub:
    cli
    pushad
    mov eax, esp
    mov [CurrentStack], eax
    mov eax, cr3
    mov [CurrentPD], eax
    call TaskHandler
    mov eax, [CurrentPD]
    mov cr3, eax
    mov esp, [CurrentStack]
    mov al, 0x20
    out 0x20, al
    popad
    sti
    iret
-
- Posts: 22
- Joined: Mon Oct 29, 2007 10:49 am
OK Combuster, I take it all back: the above code does compile fine; I must have done something stupid when I tried it before. Unfortunately my OS still doesn't work. I now get a General Protection Fault when the task switch should occur, and the GP definitely occurs on the IRET. I have been reading the manuals trying to work out why, but I can't figure out why it works fine if I just jump straight into it from the kernel but doesn't work when I try to switch to it from another task. Could it perhaps have to do with updating or resetting fields in the TSS? Or is my stack building code wrong for a task switch once multitasking has started? (All the processes run at the same privilege level, so I don't know why this would be.) I'd appreciate any clues anyone could give.
[Edited] AHA, breakthrough: the last thing in my Bochs output file is "IRET: return CS selector null". OK, so now I know what the problem is, but I still can't see how to fix it. It must be that my stack is getting out of alignment somehow, but what's causing it, given that it works straight from the kernel?
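A null CS on iret suggests that esp is not pointing at a frame of the shape the stub expects. One detail visible in the listings above is that processInitializeProcess stores the initial value in p->StackPointer while TaskHandler reads and writes CurrentProcess->UStackPointer; whether that was the actual bug is not stated in the thread. As a debugging aid, a check along the following lines could be run on the saved stack pointer before switching to it. This is only a sketch: the struct and the helper name are hypothetical, and it assumes the "interrupt, then pushad" layout with no privilege change.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A context saved by "interrupt, then pushad" with no privilege change,
   listed from the lowest address (where the saved esp points) upward. */
struct saved_context {
    uint32_t edi, esi, ebp, esp_ignored, ebx, edx, ecx, eax; /* pushad */
    uint32_t eip, cs, eflags;                                /* pushed by the CPU */
};

/* Hypothetical debug check: does this look like a frame iret can return through? */
bool context_looks_sane(const struct saved_context *ctx, uint32_t expected_cs)
{
    if (ctx == NULL)
        return false;
    if ((ctx->cs & 0xFFFCu) == 0)      /* index 0 in the GDT: a null selector */
        return false;
    if (ctx->cs != expected_cs)        /* not the code segment we built the task with */
        return false;
    return (ctx->eflags & 0x2u) != 0;  /* bit 1 is set in any eflags image the CPU pushes */
}
```

In a kernel, a failed check could print the offending stack pointer and halt, which is usually easier to read than the fault that follows the iret.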
Combuster wrote:Given your previous code, it appears that you do not update tss.esp0 as part of the task switch. While this isn't necessary for kernel threads, once in userland it must match the top of the stack or you will end up overwriting memory you used for other purposes.
Actually, it isn't necessary there either, just common.
It really depends on your design; I know at least 2 of us use designs which do not require altering TSS.ESP0 at all.
However, for most people you are correct: ESP0 will need to be changed on each task switch.
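For the common design, the update might look like the sketch below. It follows the quoted remark that esp0 should match the top of the incoming task's kernel stack. The `task` struct, its field names and `tss_set_kernel_stack` are assumptions made for the illustration, and only the first three fields of the 32-bit TSS are declared.

```c
#include <stdint.h>

/* Leading fields of the 32-bit TSS; the full structure is 104 bytes. */
struct tss_head {
    uint32_t prev_task_link;
    uint32_t esp0;  /* stack pointer the CPU loads on a ring 3 -> ring 0 transition */
    uint32_t ss0;   /* stack segment loaded together with esp0 */
};

/* Hypothetical per-task bookkeeping. */
struct task {
    uint32_t kstack_base;  /* lowest address of this task's kernel stack */
    uint32_t kstack_size;  /* size of that stack in bytes */
    uint32_t saved_esp;    /* where the task's saved context currently sits */
};

/* Point the TSS at the top of the incoming task's kernel stack.
   Note that this is the top of the stack, not the saved esp. */
void tss_set_kernel_stack(struct tss_head *tss, const struct task *next,
                          uint16_t kernel_ss)
{
    tss->esp0 = next->kstack_base + next->kstack_size;
    tss->ss0 = kernel_ss;
}
```

A scheduler would call this just before loading the next task's cr3 and esp. The kernel data selector passed in is whatever the kernel's own GDT defines; the thread does not state it.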
-
- Posts: 22
- Joined: Mon Oct 29, 2007 10:49 am
OK guys, thanks for all the help. I've almost got it all working now. It was a stupid bug in my C code that I'm too embarrassed to point out, but kernel thread task switching now works perfectly. I am, however, still getting a GP fault when trying to start a user mode thread. I did what was suggested above and updated the TSS on task switches. I also know you need to have the esp and ss on the stack for a switch to userspace, but I am slightly unclear as to which way round: should it be IP CS EFLAGS ESP SS or IP CS EFLAGS SS ESP? Two tutorials I've read disagree and the Intel manual is a little unclear, so I just want to be sure. Also, could the GP fault on my first iret into the task be caused by the task being statically linked into the kernel, given that I map the kernel into my page directory with system rather than user access flags? Would this cause a GP fault or a page fault? Just to be clear on the changes, I'll repost my modified code.
The code comes in three parts: initializing the process's stack, starting the multitasking (StartMultiTasking, where I get the error), and the modified TaskHandler (which we never get to).
Initializing the process's stack is now:
Code: Select all
// Used when creating a process: replicates the stack that would have resulted from a timer interrupt.
void processInitializeProcess(unsigned long entrypoint, char* name, ProcessStructure* p)
{
    p->ID = 0;
    p->ProcessName = name;
    p->Priority = 5;
    p->State = 0;
    p->PageDirectoryAddress = createPageDirectory();
    unsigned long oldpd = read_cr3();
    write_cr3(p->PageDirectoryAddress);
    // UL keeps the shifts out of signed int range problems
    unsigned long* stacksetup = (unsigned long*)((1022UL << 22) + (1022UL << 12));
    // Mimic interrupt stack frame
    stacksetup--;
    *stacksetup = 0x20; // ss
    stacksetup--;
    *stacksetup = ((1022UL << 22) + (510UL << 12)); // esp
    stacksetup--;
    *stacksetup = 0x0202; // eflags
    stacksetup--;
    *stacksetup = 0x18; // a user level CS; 0x08 was ring 0, put it back in and the whole thing works
    stacksetup--;
    *stacksetup = entrypoint; // ip
    // Mimic pushad
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    stacksetup--;
    *stacksetup = 0;
    // Push segments
    stacksetup--;
    *stacksetup = 0x20;
    stacksetup--;
    *stacksetup = 0x20;
    stacksetup--;
    *stacksetup = 0x20;
    stacksetup--;
    *stacksetup = 0x20;
    p->KStackPointer = (unsigned long) stacksetup;
    write_cr3(oldpd);
    return;
}
Code: Select all
void StartMultiTasking()
{
    write_cr3(CurrentProcess->PageDirectoryAddress);
    asm ("movl %0, %%esp\n\t" : :"r"(CurrentProcess->KStackPointer));
    asm("pop %ds\n\t");
    asm("pop %es\n\t");
    asm("pop %fs\n\t");
    asm("pop %gs\n\t");
    asm("popa\n\t");
    asm("iret\n\t");
}
Code: Select all
void TaskHandler()
{
    TimeSlice--;
    //VGADriver::print("Tick");
    if(TimeSlice > 0) return;
    //VGADriver::print("Tock");
    CurrentProcess->PageDirectoryAddress = CurrentPD;
    CurrentProcess->KStackPointer = CurrentStack;
    CurrentProcess = CurrentProcess->nextProcess;
    TimeSlice = 30;
    SysTSS.esp0 = CurrentProcess->KStackPointer;
    SysTSS.cr3 = CurrentProcess->PageDirectoryAddress;
    CurrentPD = CurrentProcess->PageDirectoryAddress;
    CurrentStack = CurrentProcess->KStackPointer;
    return;
}
Well I'll just post the code I use to set up the kernel stack for my user mode code.
esp will be pointing at this stack when I issue the iret.
Code: Select all
*esp-- = 32 + 3; // user ss
*esp-- = 0x50000000; // user esp
*esp-- = 0x202; // user eflags
*esp-- = 24 + 3; // user cs
*esp = 0x40000000; // user eip
After looking at your code I think I see the problem. You are pushing the wrong cs and ss on the stack: the values you are pushing have the lower 2 bits clear. See the values I am pushing on the stack? Try setting the lower 2 bits and see if that works.
Segment selectors in protected mode are offsets into the GDT, and since descriptors are 8 bytes each, the offset is always aligned on an 8 byte boundary. That means the offset's last 3 bits would always be 0, so the selector reuses those 3 bits to store the privilege level and the table indicator. The descriptor at 0x20 in the GDT might have privilege level 3, but the segment selector 0x20 still asks for ring 0. Because this is the same privilege level as the kernel, esp and ss are not popped off: in order for ss and esp to be popped, cs must be at a different privilege level.
The last 2 bits of the segment selector are the requested privilege level, which becomes the current privilege level (CPL) once the selector is loaded into cs, and the 3rd bit selects the table the descriptor is in (the GDT or LDT). Therefore, for your segment selectors to truly be in userspace, they should be 0x1b for cs and 0x23 for ss and the general purpose segment selectors. This will change the CPL to 3.
And the order is eip cs eflags esp ss.
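The selector arithmetic and the frame order can be sketched in C as follows. The GDT indexes 3 and 4 for user code and user data come from the 0x18 and 0x20 values used in this thread; `make_selector` and `push_user_iret_frame` are hypothetical names, and the sketch only fills in memory, it does not execute an iret.

```c
#include <stdint.h>

/* selector = GDT/LDT index * 8, plus table indicator (bit 2), plus RPL (bits 0-1) */
static inline uint16_t make_selector(uint16_t index, unsigned use_ldt, unsigned rpl)
{
    return (uint16_t)((index << 3) | ((use_ldt & 1u) << 2) | (rpl & 3u));
}

/* Build the five dwords iret pops on a return to ring 3, just below stack_top.
   Pushing in the order ss, esp, eflags, cs, eip leaves them in memory,
   lowest address first, as eip cs eflags esp ss.
   Returns the resulting stack pointer, which points at the eip slot. */
uint32_t *push_user_iret_frame(uint32_t *stack_top, uint32_t entry, uint32_t user_esp)
{
    uint32_t *sp = stack_top;
    *--sp = make_selector(4, 0, 3);  /* user ss: 0x20 | 3 = 0x23 */
    *--sp = user_esp;                /* user esp */
    *--sp = 0x202;                   /* eflags with interrupts enabled */
    *--sp = make_selector(3, 0, 3);  /* user cs: 0x18 | 3 = 0x1b */
    *--sp = entry;                   /* user eip */
    return sp;
}
```

With the RPL left at 0, the same helper yields 0x18 and 0x20, which are the values the posted code was pushing.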
-
- Posts: 22
- Joined: Mon Oct 29, 2007 10:49 am