Please Help!! Software Multitasking

Posted: Mon Oct 29, 2007 11:37 am
by PinkyNoBrain
Hi, I have a working protected-mode kernel with a memory manager (it uses paging). I have been trying to implement software multitasking.

I create a TSS for the system to use, install it into my GDT, and load it with ltr.

I set the PIC to generate interrupts at regular intervals and have a handler that catches them, pushes the registers, saves the stack position, checks whether it should switch task, loads cr3, loads the stack position, pops the registers, then irets.

I create two processes, fill in some generic details for both, and create a page directory for each.

I switch into each process's address space and set up its stack to mimic an interrupt followed by a pushad instruction.

At this point I can switch to either process's address space, set esp to its stack pointer, do popad followed by iret, and that process will start running. That works fine.

The problem occurs when I try to switch processes after a timer event. It works when no process switch occurs, i.e. when I only decrement the timeslice. But as soon as I try to switch stack and address space to switch task, I get a General Protection Fault (0) exception.

I run my OS in Bochs and am happy to post whatever code or output people would find helpful. I would really appreciate any help you could give :-)


(Quick update) Something is clearly wrong with the way I switch address spaces: any attempt to write cr3 during my interrupt handler causes the GP. It should be noted, though, that both address spaces are valid; I can switch to them in other parts of my kernel just fine.

Posted: Mon Oct 29, 2007 11:58 am
by JamesM
I think it's time to put your debugging shoes on. The fact that you have updated so quickly shows me you are currently debugging this problem.

1. I will help tomorrow if it is not sorted.
2. Please read the sticky post about 'first time posters', especially regarding topic titles such as 'PLEASE HELP'.

JamesM

Posted: Mon Oct 29, 2007 12:12 pm
by TomTom
Are you sure that you update cr3 with a physical address, not a virtual one?
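
A small hosted sketch of the check behind this question. It assumes a kernel that is mapped at a fixed offset from physical memory; `KERNEL_VIRT_BASE`, `virt_to_phys` and `looks_like_valid_cr3` are hypothetical names and values, not taken from the poster's kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the kernel is mapped at a fixed virtual offset from physical
   memory (a "higher-half" layout). KERNEL_VIRT_BASE is a hypothetical value. */
#define KERNEL_VIRT_BASE 0xC0000000u
#define PAGE_SIZE        4096u

/* Translate a kernel virtual address to the physical address behind it.
   Only valid for addresses inside the fixed-offset kernel mapping. */
static uint32_t virt_to_phys(uint32_t vaddr)
{
    assert(vaddr >= KERNEL_VIRT_BASE);
    return vaddr - KERNEL_VIRT_BASE;
}

/* A page directory starts on a 4 KiB boundary, so a plausible cr3 value
   has its low 12 bits clear (ignoring the PWT/PCD cache-control bits). */
static int looks_like_valid_cr3(uint32_t value)
{
    return (value & (PAGE_SIZE - 1u)) == 0u;
}
```

If the page directory pointer handed to write_cr3 is a virtual address, a conversion of this kind has to happen first. A kernel that identity-maps its page directories would not need it.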

Posted: Wed Oct 31, 2007 11:54 am
by PinkyNoBrain
Hey, thanks for the speedy replies, this problem is really annoying. I think I'm going to have to post some of my code. Below is my first (and simplest) attempt to get the context switching to work. I have tried modifying it in several ways, which produced different results, but since none of them worked I think it's better that I just give you the simplest version.

This is a shortened copy of my code that sets up a new process. It creates the address space for the process, switches into it, sets up the stack to mimic an interrupted process, updates fields in the process's structure, then switches back to the kernel address space.

Code: Select all

void processInitializeProcess(unsigned long entrypoint,char*name,ProcessStructure* p)
{
	p->ID = 0;
	p->ProcessName = name;
	p->Priority = 5;
	p->State = 0;

	p->PageDirectoryAddress = createPageDirectory();

	unsigned long oldpd = read_cr3();
	write_cr3(p->PageDirectoryAddress);
	
	unsigned long* stacksetup = (unsigned long*)((1022 << 22)+( 1022<< 12));
	//mimic interrupt stackframe
	stacksetup--;
	*stacksetup=0x0202; //eflags (IF set)
	stacksetup--;
	*stacksetup=0x08; //cs (ring 0 code)
	stacksetup--;
	*stacksetup=entrypoint; //eip

	//Mimics pushad
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;

	p->StackPointer = (unsigned long) stacksetup;
	write_cr3(oldpd);
	return;
}




I do this twice to create two processes, one called testprocess and the other called idleprocess; both just sit in a loop printing to the screen. I can set either one to be the CurrentProcess, then call the following, and it will jump into it and start running. No problems so far.

Code: Select all

void StartMultiTasking()
{
	CurrentProcessPageDirectory = CurrentProcess->PageDirectoryAddress;
	write_cr3(CurrentProcess->PageDirectoryAddress);
	asm ("movl %0, %%esp\n\t" : :"r"(CurrentProcess->StackPointer));

	asm("popa\n\t");
	asm("iret\n\t");
}
Now I need to do a context switch. The following asm is called directly upon a timer interrupt.

Code: Select all

timer_stub:
	cli
	pushad
	mov eax, esp
	mov [CurrentProcessStack], eax
	mov eax, cr3
	mov [CurrentProcessPageDirectory], eax
	call TaskHandler
	mov eax,[CurrentProcessPageDirectory]
	mov cr3,eax
	mov esp,[CurrentProcessStack]
	mov al,0x20
	out 0x20,al
	popad
	sti
	iret
And this is the task handler code called by the above asm.

Code: Select all

ProcessStructure* CurrentProcess;
unsigned int TimeSlice = 55;

static unsigned long CurrentStack;
static unsigned long CurrentPD;

unsigned long* CurrentProcessPageDirectory = & CurrentPD;
unsigned long* CurrentProcessStack = &CurrentStack; 

void TaskHandler()
{
	TimeSlice--;
	//VGADriver::print("TICK");
	if(TimeSlice > 0) return;
	//VGADriver::print("TOCK");

	CurrentProcess->PageDirectoryAddress = CurrentPD;
	CurrentProcess->UStackPointer = CurrentStack;

	CurrentProcess = CurrentProcess->nextProcess;
	TimeSlice = 55;

	//Check to see if current process is set to be the next process(which it is)
	VGADriver::print(CurrentProcess->ProcessName);

	CurrentPD = CurrentProcess->PageDirectoryAddress;
	CurrentStack = CurrentProcess->UStackPointer;
	return;
}
Very embarrassingly, the task switching code does nothing. It prints TICK, TOCK and the process names as it should, but it never actually switches task. Various attempts have been made to fix it, some of which generated a General Protection Fault (0) exception. I expect the problem is in the asm cos that's what I'm worst at :-) . Please could somebody unscramble this mess :-(

Posted: Wed Oct 31, 2007 3:14 pm
by Combuster
PinkyNoBrain wrote:I expect the problem is in the asm cos thats what im worst at
It's your C code I'm concerned about. ;-)

You declare a static variable (which means the scope is limited to the compilation unit), then add a global reference to that. This is just bad coding practice. The actual error is however a few lines further:

Code: Select all

unsigned long* CurrentProcessPageDirectory = & CurrentPD;
unsigned long* CurrentProcessStack = &CurrentStack; 
This is a pointer to a long. Strangely enough, a page directory isn't a number but a 4k table, so more appropriately this would be a void * pointer.

The bug is hidden in the fact that your ASM changes CurrentProcessPageDirectory and CurrentProcessStack, while the C code changes CurrentPD and CurrentStack. Since these are distinct variables, neither influences the other. CurrentProcessPageDirectory will simply no longer point to CurrentPD after the first timer call.

What you need to do is use CurrentStack and CurrentPD in both C and ASM, and declare them as unsigned long CurrentXX;
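
The difference between the two stores can be modelled in ordinary hosted C. This is only a sketch: the variable names are reused from the post, `FakeStack`, `asm_style_store` and `intended_store` are hypothetical, and `memcpy` stands in for what NASM's `mov [symbol], eax` does to the memory at `symbol`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hosted model of the two variables from the post. */
static uintptr_t  CurrentStack = 0x1000;
static uintptr_t *CurrentProcessStack = &CurrentStack;

/* Stand-in for a task stack, so the model has a real address to store. */
static uintptr_t FakeStack[4];

/* What `mov [CurrentProcessStack], eax` does: it overwrites the memory
   occupied by the pointer variable itself, not the variable it points to. */
static void asm_style_store(uintptr_t value)
{
    assert(sizeof value == sizeof CurrentProcessStack);
    memcpy(&CurrentProcessStack, &value, sizeof value);
}

/* What was intended: store through the pointer into CurrentStack. */
static void intended_store(uintptr_t value)
{
    *CurrentProcessStack = value;
}
```

After `asm_style_store`, `CurrentStack` still holds its old value and the pointer now refers to whatever address was stored, which matches the behaviour described above: the C side and the asm side stop seeing each other's updates.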

Posted: Thu Nov 01, 2007 8:27 am
by PinkyNoBrain
Thanks for the reply, Combuster. I have taken what you said on board and removed the static keyword from both CurrentStack and CurrentPD for the sake of good practice :-). However, I'm not so sure about the following:
Combuster wrote:This is a pointer to a long - strangely enough a page directory isn't a number, but a 4k table. more appropriately this would be a void * pointer.
You are certainly correct in your assertion that a PD is a table, not a number. However, the CurrentPD variable is only intended to store the physical address of the PD in the underlying memory, which is just a number, so that it can be loaded into cr3.
Combuster wrote:What you need to do is use CurrentStack and CurrentPD in both C and ASM, and declare them as unsigned long CurrentXX;
Alas, that is what I was intending to do, but it doesn't work because mov doesn't allow it. Mov only accepts an immediate value, a register or a memory address, so any attempt to use CurrentPD or CurrentStack directly yields an invalid opcode operand error on compilation. I therefore created the two pointers CurrentProcessPageDirectory and CurrentProcessStack to point to them. The asm code isn't meant to alter the values of the pointers; it's meant to alter the values of what they're pointing at.

I know I'm no expert; it could well be that my code is doing something other than intended.

Posted: Thu Nov 01, 2007 12:20 pm
by Combuster
PinkyNoBrain wrote:so any attempt to use CurrentPD or CurrentStack directly yields an invalid opcode operand error on compilation.
Bull. CurrentPD etc represents an immediate value. Therefore this piece of code compiles as expected. (I verified that)

Code: Select all

extern CurrentStack
extern CurrentPD
extern TaskHandler

timer_stub:
   cli
   pushad
   mov eax, esp
   mov [CurrentStack], eax
   mov eax, cr3
   mov [CurrentPD], eax 
   call TaskHandler
   mov eax,[CurrentPD]
   mov cr3,eax
   mov esp,[CurrentStack]
   mov al,0x20
   out 0x20,al
   popad
   sti
   iret 

Posted: Thu Nov 01, 2007 3:34 pm
by PinkyNoBrain
OK Combuster, I take it all back. The above code does compile fine; I must have done something stupid when I tried it before. Unfortunately my OS still doesn't work. I now get a General Protection Fault when the task switch should occur, and the GP definitely occurs on the IRET. I have been reading the manuals trying to work out why, but I can't figure out why it works fine if I jump straight into the task from the kernel but doesn't work when I try to switch to it from another task. Could it perhaps be to do with updating or resetting fields in the TSS? Or is my stack building code wrong for a task switch once multitasking has started? (All the processes run at the same privilege level, so I don't know why this would be.) I'd appreciate any clues anyone could give :-).

[Edited] AHA, breakthrough: the last thing in my Bochs output file is "IRET: return CS selector null". OK, so now I know what the problem is, but I still can't see how to fix it. It must be that my stack is getting out of alignment somehow, but what's causing it, given that it works straight from the kernel?

Posted: Thu Nov 01, 2007 5:10 pm
by Combuster
Given your previous code, it appears that you do not update tss.esp0 as part of the task switch. While this isn't necessary for kernel threads, once in userland it must match the top of the stack or you will end up overwriting memory you used for other purposes.

Posted: Fri Nov 02, 2007 2:01 pm
by JAAman
Combuster wrote:Given your previous code, it appears that you do not update tss.esp0 as part of the task switch. While this isn't necessary for kernel threads, once in userland it must match the top of the stack or you will end up overwriting memory you used for other purposes.
Actually, it isn't necessary there either -- just common.

It really depends on your design. I know at least 2 of us use designs which do not require altering TSS.ESP0 at all.

However, for most people you are correct: ESP0 will need to be changed on each task switch.
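
For the common design, the update might look like the hosted sketch below. The structures, `KSTACK_SIZE`, `tss_prepare_for` and the ring 0 data selector value 0x10 are all assumptions made for illustration; none of them come from the posted kernel.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_SIZE 4096u  /* hypothetical per-task kernel stack size */

/* Only the TSS fields relevant to a ring 3 -> ring 0 transition. */
struct tss_ring0 {
    uint32_t esp0;  /* stack pointer loaded on entry to ring 0 */
    uint16_t ss0;   /* stack segment loaded on entry to ring 0 */
};

/* Hypothetical task record: where this task's kernel stack lives and
   where its saved context currently sits inside that stack. */
struct task {
    uint32_t kstack_base;  /* lowest address of the kernel stack */
    uint32_t saved_esp;    /* esp saved by the timer stub */
};

/* Called when switching to `next`. esp0 is set to the TOP of the task's
   kernel stack, not to saved_esp, so the next interrupt taken from user
   mode starts on an empty kernel stack. */
static void tss_prepare_for(struct tss_ring0 *tss, const struct task *next)
{
    assert(next->saved_esp > next->kstack_base);
    assert(next->saved_esp <= next->kstack_base + KSTACK_SIZE);
    tss->esp0 = next->kstack_base + KSTACK_SIZE;
    tss->ss0  = 0x10;  /* assumed ring 0 data selector */
}
```

Using the stack top rather than the saved stack pointer is a design choice in this sketch. Setting esp0 to the saved pointer would make each new interrupt frame start lower than the previous one.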

Posted: Sat Nov 03, 2007 10:48 am
by PinkyNoBrain
OK guys, thanks for all the help. I've almost got it all working now. It was a stupid bug in my C code that I'm too embarrassed to point out, but kernel thread task switching now works perfectly.

I am, however, still getting a GP fault when trying to start a user mode thread. I did what was suggested above and updated the TSS on task switches, and I also know you need to have the esp and ss on the stack for a switch to userspace. I am slightly unclear as to which way round they go: should it be IP CS EFLAGS ESP SS or IP CS EFLAGS SS ESP? Two tutorials I've read disagree and the Intel manual is a little unclear, so I just want to be sure.

Also, could another reason for the GP fault when I iret into the task for the first time be that the task is statically linked into the kernel, and I map the kernel into my page directory with system rather than user access flags? Would this cause a GP fault or a page fault, hmm? Just to be clear on the changes, I'll repost my modified code.

Initializing the process's stack is now:

Code: Select all

//Used when creating a process. Replicates the stack frame that would have resulted from a timer interrupt.
void processInitializeProcess(unsigned long entrypoint,char*name,ProcessStructure* p)
{
	p->ID = 0;
	p->ProcessName = name;
	p->Priority = 5;
	p->State = 0;

	p->PageDirectoryAddress = createPageDirectory();

	unsigned long oldpd = read_cr3();
	write_cr3(p->PageDirectoryAddress);
	
	unsigned long* stacksetup = (unsigned long*)((1022 << 22)+( 1022<< 12));
	//mimic interrupt stackframe
	stacksetup--;
	*stacksetup = 0x20;//ss
	stacksetup--;
	*stacksetup = ((1022 << 22)+( 510<< 12));//esp
	stacksetup--;
	*stacksetup=0x0202; //eflags
	stacksetup--;
	*stacksetup= 0x18;//a user level CS, 0x08 was ring 0, put it back in and the whole thing works
	stacksetup--;
	*stacksetup=entrypoint;//ip

	//Mimics pushad
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;

	//push segments
	stacksetup--;
	*stacksetup=0x20;
	stacksetup--;
	*stacksetup=0x20;
	stacksetup--;
	*stacksetup=0x20;
	stacksetup--;
	*stacksetup=0x20;
	
	p->KStackPointer = (unsigned long) stacksetup;

	write_cr3(oldpd);
	return;
}
Starting the multitasking (where I get the error):

Code: Select all

void StartMultiTasking()
{
	write_cr3(CurrentProcess->PageDirectoryAddress);
	asm ("movl %0, %%esp\n\t" : :"r"(CurrentProcess->KStackPointer));

	asm("pop %ds\n\t");
	asm("pop %es\n\t");
	asm("pop %fs\n\t");
	asm("pop %gs\n\t");

	asm("popa\n\t");
	asm("iret\n\t");
}
Modified TaskHandler (we never get here):

Code: Select all

void TaskHandler()
{
	TimeSlice--;
	//VGADriver::print("Tick");
	if(TimeSlice > 0) return;
	//VGADriver::print("Tock");

	CurrentProcess->PageDirectoryAddress = CurrentPD;
	CurrentProcess->KStackPointer = CurrentStack;

	CurrentProcess = CurrentProcess->nextProcess;

	TimeSlice = 30;
	
	SysTSS.esp0 = CurrentProcess->KStackPointer;
	SysTSS.cr3 = CurrentProcess->PageDirectoryAddress;

	CurrentPD = CurrentProcess->PageDirectoryAddress;
	CurrentStack = CurrentProcess->KStackPointer;
	return;
}
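
On the GP-versus-page-fault question in this post: when ring 3 code touches a page that is mapped supervisor-only, the CPU raises a page fault (#PF), not a general protection fault. The hosted sketch below shows the rule for 32-bit paging; the flag names and the `make_entry` and `user_can_access` helpers are hypothetical, and only the bit positions are architectural.

```c
#include <assert.h>
#include <stdint.h>

/* x86 32-bit paging flag bits, at the same positions in PDEs and PTEs. */
#define PG_PRESENT 0x001u
#define PG_WRITE   0x002u
#define PG_USER    0x004u  /* clear = supervisor-only */

/* Build a PDE/PTE from a 4 KiB-aligned physical frame address and flags. */
static uint32_t make_entry(uint32_t frame, uint32_t flags)
{
    assert((frame & 0xFFFu) == 0u);
    return frame | (flags & 0xFFFu);
}

/* Ring 3 can touch a page only if the entry is present and marked user
   in BOTH the page-directory entry and the page-table entry. */
static int user_can_access(uint32_t pde, uint32_t pte)
{
    uint32_t need = PG_PRESENT | PG_USER;
    return (pde & need) == need && (pte & need) == need;
}
```

So a task linked into a kernel image that is mapped with system flags would be expected to page-fault on its first instruction fetch once it really runs at ring 3, which is a separate problem from a GP fault raised by the iret itself.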

Posted: Sat Nov 03, 2007 2:51 pm
by frank
Well I'll just post the code I use to set up the kernel stack for my user mode code.

Code: Select all

*esp-- = 32 + 3;		// user ss
*esp-- = 0x50000000;		// user esp
*esp-- = 0x202;		// user eflags
*esp-- = 24 + 3;		// user cs
*esp = 0x40000000;		// user eip
esp will be pointing at that last value (the user eip) when I issue the iret.

After looking at your code I think I see the problem. You are pushing the wrong cs and ss on the stack. The value you are pushing has the lower 2 bits clear. See the values I am pushing on the stack? Try setting the lower 2 bits and see if that works.
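
The same frame can be built and inspected in a hosted program, which makes the slot order easy to check. This is a sketch only: `build_user_iret_frame` and the `SLOT_*` names are hypothetical, and the values are the ones from the snippet above.

```c
#include <assert.h>
#include <stdint.h>

/* Index of each slot counted upward from the final stack pointer.
   iret pops in this order when returning to a lower privilege level. */
enum { SLOT_EIP, SLOT_CS, SLOT_EFLAGS, SLOT_ESP, SLOT_SS, FRAME_WORDS };

/* Write a ring 3 iret frame just below `top` and return the new stack
   pointer, which points at the eip slot. */
static uint32_t *build_user_iret_frame(uint32_t *top, uint32_t eip,
                                       uint32_t cs, uint32_t user_esp,
                                       uint32_t ss)
{
    assert((cs & 3u) == 3u && (ss & 3u) == 3u);  /* RPL must be 3 */
    uint32_t *sp = top;
    *--sp = ss;        /* pushed first, popped last */
    *--sp = user_esp;
    *--sp = 0x202u;    /* eflags: IF set, plus the always-one bit 1 */
    *--sp = cs;
    *--sp = eip;       /* pushed last, popped first */
    return sp;
}
```

The assert on the low two bits is the check that the posted 0x18 and 0x20 values would fail.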

Posted: Sat Nov 03, 2007 6:58 pm
by iammisc
Segment selectors in protected mode are offsets into the GDT, and since each descriptor is 8 bytes long those offsets are always aligned on an 8-byte boundary. That means the low 3 bits of the offset are always 0, so they are reused to carry other information, including the privilege level. The descriptor at 0x20 in the GDT might have privilege level 3, but the selector 0x20 itself still asks for ring 0. Because that is the same privilege level as the kernel, esp and ss are not popped off; for ss and esp to be popped, cs must be at a different privilege level.

The low 2 bits of the segment selector hold the privilege level, and the 3rd bit says which table the selector refers to (the GDT or the LDT). Therefore, for your segment selectors to truly be in userspace they should be 0x1b for cs and 0x23 for ss and the general purpose segment selectors. This will change the CPL to 3.

And the order is eip cs eflags esp ss, with eip at the lowest address (where esp points when the iret executes).
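
The selector arithmetic can be written out as a small hosted sketch. The helper names are hypothetical; the bit layout is the architectural one, and GDT entries 3 and 4 correspond to the 0x18 and 0x20 descriptors used in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Selector layout: bits 15..3 = descriptor index, bit 2 = table
   indicator (0 = GDT, 1 = LDT), bits 1..0 = requested privilege level. */
static uint16_t make_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    assert(index < 8192u && use_ldt <= 1u && rpl <= 3u);
    return (uint16_t)((index << 3) | (use_ldt << 2) | rpl);
}

static unsigned selector_index(uint16_t sel) { return sel >> 3; }
static unsigned selector_rpl(uint16_t sel)   { return sel & 3u; }
```

For example, `make_selector(3, 0, 3)` gives 0x1b and `make_selector(4, 0, 3)` gives 0x23, while `selector_rpl(0x20)` is 0.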

Posted: Sun Nov 04, 2007 12:15 pm
by PinkyNoBrain
Good advice guys, I must have missed that bit in the manuals. I have changed cs to 0x1b and ss to 0x23 as suggested. I am still getting the GP fault though; it's very annoying.

Posted: Sun Nov 04, 2007 2:04 pm
by AJ
Hi,

Can we see the Bochs register dump following your GP fault? I think that would help a lot.

Cheers,
Adam