Please Help!! Software Multitasking

PinkyNoBrain
Posts: 22
Joined: Mon Oct 29, 2007 10:49 am

Post by PinkyNoBrain »

Hi, I have a working protected-mode kernel with a memory manager (it uses paging). I have been trying to implement software multitasking.

I create a TSS for the system to use, install it into my GDT, and load it with ltr.

I set the PIC to generate interrupts at regular intervals and have a handler that catches them, pushes the registers, saves the stack position, checks whether it should switch task, loads cr3, loads the stack position, pops the registers, then irets.

I create two processes, fill in some generic details for both, and create a page directory for each.

I switch into each process's address space and set up its stack to mimic an interrupt followed by a pushad instruction.

At this point I can switch to either process's address space, set esp to its stack pointer, do popad followed by iret, and that process will start running. That works fine.

The problem occurs when I try to switch processes after a timer event. It works when no process switch occurs, i.e. when I only decrement the timeslice. But as soon as I try to switch stack and address space to switch task, I get a General Protection Fault (0) exception.

I run my OS in Bochs and am happy to post whatever code or output people would find helpful. I would really appreciate any help you could give :-)


(Quick update) It is clearly something wrong with how I switch address spaces: any attempt to write cr3 during my interrupt handler causes the GP fault. It should be noted, though, that both address spaces are valid; I can switch to them in other parts of my kernel just fine.
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Post by JamesM »

I think it's time to put your debugging shoes on. The fact that you have updated so quickly shows me you are currently debugging this problem.

1. I will help tomorrow if it is not sorted.
2. Please read the sticky post about 'first time posters', especially regarding forum topic titles such as 'PLEASE HELP'.

JamesM
TomTom
Posts: 23
Joined: Tue May 01, 2007 2:03 am
Location: USA

Post by TomTom »

Are you sure that you update cr3 with a physical address, not a virtual one?
PinkyNoBrain
Posts: 22
Joined: Mon Oct 29, 2007 10:49 am

Post by PinkyNoBrain »

Hey, thanks for the speedy replies; this problem is really annoying. I think I'm going to have to post some of my code. Below is my first (and simplest) attempt to get context switching to work. I have tried modifying it in several ways, which produced different results, but since none of them worked I think it's better to just give you the simplest version.

This is a shortened copy of my code that sets up a new process: it creates the address space for the process, switches into it, sets up the stack to mimic an interrupted process, updates fields in the process's structure, then switches back to the kernel address space.

Code: Select all

void processInitializeProcess(unsigned long entrypoint,char*name,ProcessStructure* p)
{
	p->ID = 0;
	p->ProcessName = name;
	p->Priority = 5;
	p->State = 0;

	p->PageDirectoryAddress = createPageDirectory();

	unsigned long oldpd = read_cr3();
	write_cr3(p->PageDirectoryAddress);
	
	unsigned long* stacksetup = (unsigned long*)((1022 << 22)+( 1022<< 12));
	//mimic interrupt stack frame
	stacksetup--;
	*stacksetup=0x0202; //eflags (interrupts enabled)
	stacksetup--;
	*stacksetup=0x08; //cs (ring 0 code selector)
	stacksetup--;
	*stacksetup=entrypoint; //eip

	//Mimics pushad
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;

	p->StackPointer = (unsigned long) stacksetup;
	write_cr3(oldpd);
	return;
}

I do this twice to create two processes, one called testprocess and the other idleprocess; both just sit in a loop printing to the screen. I can set either one to be the CurrentProcess, then call the following, and it will jump into it and start running. No problems so far.

Code: Select all

void StartMultiTasking()
{
	CurrentProcessPageDirectory = CurrentProcess->PageDirectoryAddress;
	write_cr3(CurrentProcess->PageDirectoryAddress);
	asm ("movl %0, %%esp\n\t" : :"r"(CurrentProcess->StackPointer));

	asm("popa\n\t");
	asm("iret\n\t");
}

Now I need to do a context switch. The following asm is called directly upon a timer interrupt.

Code: Select all

timer_stub:
	cli
	pushad
	mov eax, esp
	mov [CurrentProcessStack], eax
	mov eax, cr3
	mov [CurrentProcessPageDirectory], eax
	call TaskHandler
	mov eax,[CurrentProcessPageDirectory]
	mov cr3,eax
	mov esp,[CurrentProcessStack]
	mov al,0x20
	out 0x20,al
	popad
	sti
	iret

And this is the task handler code called by the above asm.

Code: Select all

ProcessStructure* CurrentProcess;
unsigned int TimeSlice = 55;

static unsigned long CurrentStack;
static unsigned long CurrentPD;

unsigned long* CurrentProcessPageDirectory = & CurrentPD;
unsigned long* CurrentProcessStack = &CurrentStack; 

void TaskHandler()
{
	TimeSlice--;
	//VGADriver::print("TICK");
	if(TimeSlice > 0) return;
	//VGADriver::print("TOCK");

	CurrentProcess->PageDirectoryAddress = CurrentPD;
	CurrentProcess->UStackPointer = CurrentStack;

	CurrentProcess = CurrentProcess->nextProcess;
	TimeSlice = 55;

	//Check to see if current process is set to be the next process(which it is)
	VGADriver::print(CurrentProcess->ProcessName);

	CurrentPD = CurrentProcess->PageDirectoryAddress;
	CurrentStack = CurrentProcess->UStackPointer;
	return;
}

Very embarrassingly, the task switching code does nothing. It prints TICK, TOCK and the process names as it should, but it never actually switches task. Various attempts have been made to fix it, some of which generated a General Protection Fault (0) exception. I expect the problem is in the asm, because that's what I'm worst at :-). Please could somebody unscramble this mess :-(
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

PinkyNoBrain wrote:I expect the problem is in the asm cos thats what im worst at
It's your C code I'm concerned about. :wink:

You declare a static variable (which means the scope is limited to the compilation unit), then add a global reference to that. This is just bad coding practice. The actual error is however a few lines further:

Code: Select all

unsigned long* CurrentProcessPageDirectory = & CurrentPD;
unsigned long* CurrentProcessStack = &CurrentStack; 

This is a pointer to a long. Strangely enough, a page directory isn't a number but a 4k table, so more appropriately this would be a void * pointer.

The bug is hidden in the fact that your ASM changes CurrentProcessPageDirectory and CurrentProcessStack, while the C code changes CurrentPD and CurrentStack. Since these are distinct variables, neither influences the other. CurrentProcessPageDirectory will simply no longer point to CurrentPD after the first timer call.

What you need to do is use CurrentStack and CurrentPD in both C and ASM, and declare them as unsigned long CurrentXX;
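
The mismatch described above can be reproduced in plain C. This is only a sketch: `asm_store` is a hypothetical stand-in for the assembler's `mov [symbol], eax`, which writes into the storage of the named symbol itself, and `uintptr_t` is used so it also builds on a 64-bit host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the layout from the thread: a value plus a pointer to it. */
uintptr_t CurrentStack;
uintptr_t *CurrentProcessStack = &CurrentStack;

/* Hypothetical stand-in for "mov [symbol], eax": stores the value into
 * the storage that belongs to the named symbol. */
void asm_store(void *symbol_storage, uintptr_t value)
{
    memcpy(symbol_storage, &value, sizeof value);
}

/* What the original stub did: the symbol is the pointer variable, so the
 * pointer is overwritten and CurrentStack itself never changes. */
void save_esp_via_pointer_symbol(uintptr_t esp)
{
    assert(sizeof CurrentProcessStack == sizeof esp);
    asm_store(&CurrentProcessStack, esp);
}

/* What the suggested fix does: the symbol is the value variable itself. */
void save_esp_via_value_symbol(uintptr_t esp)
{
    asm_store(&CurrentStack, esp);
}
```

After `save_esp_via_pointer_symbol`, the C side still reads the old `CurrentStack`, which matches the "it never actually switches task" symptom.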
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
PinkyNoBrain
Posts: 22
Joined: Mon Oct 29, 2007 10:49 am

Post by PinkyNoBrain »

Thanks for the reply, Combuster. I have taken what you said on board and removed the static keyword from both CurrentStack and CurrentPD for the sake of good practice :-). However, I'm not so sure about the following:

Combuster wrote:This is a pointer to a long - strangely enough a page directory isn't a number, but a 4k table. more appropriately this would be a void * pointer.
You are certainly correct in your assertion that a PD is a table, not a number. However, the CurrentPD variable is only intended to store the physical address of the PD, which is just a number, so that it can be loaded into cr3.

Combuster wrote:What you need to do is use CurrentStack and CurrentPD in both C and ASM, and declare them as unsigned long CurrentXX;
Alas, that is what I was intending to do, but it doesn't work because mov doesn't allow it. Mov only accepts an immediate value, a register or a memory address, so any attempt to use CurrentPD or CurrentStack directly yields an invalid opcode operand error on compilation. I therefore created the two pointers CurrentProcessPageDirectory and CurrentProcessStack to point to them. The asm code isn't meant to alter the values of the pointers; it's meant to alter the values they point at.

I know I'm no expert; it could well be that my code is doing something other than intended.
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

PinkyNoBrain wrote:so any attempt to use CurrentPD or CurrentStack directly yields an invalid opcode operand error on compilation.
Bull. In the assembler, CurrentPD etc. represents an immediate value (the variable's address), so [CurrentPD] is a valid memory operand. Therefore this piece of code assembles as expected (I verified that):

Code: Select all

extern CurrentStack
extern CurrentPD
extern TaskHandler

timer_stub:
   cli
   pushad
   mov eax, esp
   mov [CurrentStack], eax
   mov eax, cr3
   mov [CurrentPD], eax 
   call TaskHandler
   mov eax,[CurrentPD]
   mov cr3,eax
   mov esp,[CurrentStack]
   mov al,0x20
   out 0x20,al
   popad
   sti
   iret
PinkyNoBrain
Posts: 22
Joined: Mon Oct 29, 2007 10:49 am

Post by PinkyNoBrain »

OK Combuster, I take it all back: the above code does compile fine; I must have done something stupid when I tried it before. Unfortunately my OS still doesn't work. I now get a General Protection Fault when the task switch should occur, and the GP definitely occurs on the IRET. I have been reading the manuals trying to work out why, but I can't figure out why it works fine if I jump straight into a task from the kernel but doesn't work when I try to switch to it from another task. Could it perhaps have to do with updating or resetting fields in the TSS? Or is my stack-building code wrong for a task switch once multitasking has started? (All the processes run at the same privilege level, so I don't know why this would be.) I'd appreciate any clues anyone could give :-).

[Edited] AHA, breakthrough: the last thing in my Bochs output file is "IRET: return CS selector null". OK, so now I know what the problem is, but I still can't see how to fix it. It must be that my stack is getting out of alignment somehow, but what's causing it, given that it works straight from the kernel?
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

Given your previous code, it appears that you do not update tss.esp0 as part of the task switch. While this isn't necessary for kernel threads, once in userland it must match the top of the stack or you will end up overwriting memory you used for other purposes.
JAAman
Member
Posts: 879
Joined: Wed Oct 27, 2004 11:00 pm
Location: WA

Post by JAAman »

Combuster wrote:Given your previous code, it appears that you do not update tss.esp0 as part of the task switch. While this isn't necessary for kernel threads, once in userland it must match the top of the stack or you will end up overwriting memory you used for other purposes.
Actually, it isn't necessary there either, just common.

It really depends on your design; I know at least 2 of us use designs which do not require altering TSS.ESP0 at all.

However, for most people you are correct: ESP0 will need to be changed on each task switch.
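
For the common design, where every task has its own kernel stack, the ESP0 update could look roughly like the sketch below. The struct and function names are hypothetical (they are not from the code posted in this thread), and only the first three dwords of the 32-bit TSS are modelled. Note that esp0 is set to the fixed top of the next task's kernel stack, not to its saved esp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First three dwords of a 32-bit TSS; the remaining fields are omitted. */
struct tss_head {
    uint32_t prev_task_link;
    uint32_t esp0;   /* stack pointer loaded on a ring 3 -> ring 0 transition */
    uint32_t ss0;    /* stack segment loaded on the same transition */
};

/* Hypothetical per-task record; field names are illustrative. */
struct task {
    uint32_t kstack_top;   /* fixed top of this task's kernel stack */
    uint32_t saved_esp;    /* esp saved by the timer stub at the last switch */
    struct task *next;     /* circular run queue */
};

/* Save the outgoing task's esp, pick the next task round-robin and point
 * tss->esp0 at the top of its kernel stack. Returns the task whose
 * saved_esp the interrupt stub should load next. */
struct task *switch_to_next(struct tss_head *tss, struct task *current,
                            uint32_t current_esp)
{
    assert(tss != NULL && current != NULL && current->next != NULL);
    current->saved_esp = current_esp;
    struct task *next = current->next;
    tss->esp0 = next->kstack_top;
    return next;
}
```

Keeping `kstack_top` separate from `saved_esp` means the next interrupt from user mode always starts pushing at the same address, so the kernel stack does not creep downwards with every switch.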
PinkyNoBrain
Posts: 22
Joined: Mon Oct 29, 2007 10:49 am

Post by PinkyNoBrain »

OK guys, thanks for all the help. I've almost got it all working now. It was a stupid bug in my C code that I'm too embarrassed to point out, but kernel thread task switching now works perfectly. I am, however, still getting a GP fault when trying to start a user mode thread. I did what was suggested above and updated the TSS on task switches. I also know you need to have esp and ss on the stack for a switch to userspace, but I am slightly unclear as to which way round: should it be IP CS EFLAGS ESP SS or IP CS EFLAGS SS ESP? Two tutorials I've read disagree and the Intel manual is a little unclear, so I just want to be sure. Also, could another cause of the GP fault when I iret into the task for the first time be that the task is statically linked into the kernel, and I map the kernel into my page directory with system rather than user access flags? Would this cause a GP fault or a page fault? Just to be clear on the changes, I'll repost my modified code.

The process stack initialization is now:

Code: Select all

//Used when creating a process; replicates the stack that would have resulted from a timer interrupt.
void processInitializeProcess(unsigned long entrypoint,char*name,ProcessStructure* p)
{
	p->ID = 0;
	p->ProcessName = name;
	p->Priority = 5;
	p->State = 0;

	p->PageDirectoryAddress = createPageDirectory();

	unsigned long oldpd = read_cr3();
	write_cr3(p->PageDirectoryAddress);
	
	unsigned long* stacksetup = (unsigned long*)((1022 << 22)+( 1022<< 12));
	//mimic interupt stackframe
	stacksetup--;
	*stacksetup = 0x20;//ss
	stacksetup--;
	*stacksetup = ((1022 << 22)+( 510<< 12));//esp
	stacksetup--;
	*stacksetup=0x0202; //eflags
	stacksetup--;
	*stacksetup= 0x18;//a user level CS, 0x08 was ring 0, put it back in and the whole thing works
	stacksetup--;
	*stacksetup=entrypoint;//ip

	//Mimics pushad
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;

	//push segments
	stacksetup--;
	*stacksetup=0x20;
	stacksetup--;
	*stacksetup=0x20;
	stacksetup--;
	*stacksetup=0x20;
	stacksetup--;
	*stacksetup=0x20;
	
	p->KStackPointer = (unsigned long) stacksetup;

	write_cr3(oldpd);
	return;
}

Starting the multitasking (where I get the error):

Code: Select all

void StartMultiTasking()
{
	write_cr3(CurrentProcess->PageDirectoryAddress);
	asm ("movl %0, %%esp\n\t" : :"r"(CurrentProcess->KStackPointer));

	asm("pop %ds\n\t");
	asm("pop %es\n\t");
	asm("pop %fs\n\t");
	asm("pop %gs\n\t");

	asm("popa\n\t");
	asm("iret\n\t");
}

Modified TaskHandler (we never get here):

Code: Select all

void TaskHandler()
{
	TimeSlice--;
	//VGADriver::print("Tick");
	if(TimeSlice > 0) return;
	//VGADriver::print("Tock");

	CurrentProcess->PageDirectoryAddress = CurrentPD;
	CurrentProcess->KStackPointer = CurrentStack;

	CurrentProcess = CurrentProcess->nextProcess;

	TimeSlice = 30;
	
	SysTSS.esp0 = CurrentProcess->KStackPointer;
	SysTSS.cr3 = CurrentProcess->PageDirectoryAddress;

	CurrentPD = CurrentProcess->PageDirectoryAddress;
	CurrentStack = CurrentProcess->KStackPointer;
	return;
}
frank
Member
Posts: 729
Joined: Sat Dec 30, 2006 2:31 pm
Location: East Coast, USA

Post by frank »

Well I'll just post the code I use to set up the kernel stack for my user mode code.

Code: Select all

*esp-- = 32 + 3;		// user ss
*esp-- = 0x50000000;		// user esp
*esp-- = 0x202;		// user eflags
*esp-- = 24 + 3;		// user cs
*esp = 0x40000000;		// user eip

The CPU's stack pointer will be set to esp when I issue the iret.

After looking at your code I think I see the problem. You are pushing the wrong cs and ss on the stack. The value you are pushing has the lower 2 bits clear. See the values I am pushing on the stack? Try setting the lower 2 bits and see if that works.
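
The frame built above can also be written as a struct, which makes the order explicit. This is a sketch under assumptions: `struct iret_frame` and `build_user_frame` are hypothetical names, and the stack is modelled as an array of 32-bit words so it can be tried on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What iret pops on a return to an outer privilege level,
 * lowest address (popped first) at the top of the struct. */
struct iret_frame {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t user_esp;
    uint32_t user_ss;
};

/* Place a frame directly below kstack_top and fill it in.
 * The returned pointer is where esp must point when iret executes. */
struct iret_frame *build_user_frame(uint32_t *kstack_top, uint32_t eip,
                                    uint32_t user_esp, uint16_t cs,
                                    uint16_t ss)
{
    assert(kstack_top != NULL);
    struct iret_frame *frame = (struct iret_frame *)kstack_top - 1;
    frame->eip = eip;
    frame->cs = cs;
    frame->eflags = 0x202;      /* IF set, plus the always-one bit 1 */
    frame->user_esp = user_esp;
    frame->user_ss = ss;
    return frame;
}
```

Calling `build_user_frame(top, 0x40000000, 0x50000000, 24 + 3, 32 + 3)` writes the same five values as the snippet above, in the same order.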
iammisc
Member
Member
Joined: Thu Nov 09, 2006 6:23 pm

Post by iammisc »

Segment selectors in protected mode are offsets into the GDT, and since descriptors are 8 bytes each, they are always aligned on an 8-byte boundary. That means the last 3 bits of the offset are always 0, so these 3 bits are reused to store extra information, including the privilege level. The privilege level of the descriptor at 0x20 in the GDT might be 3, but the segment selector 0x20 still requests ring 0. Because that is the same privilege level as the kernel, esp and ss are not popped off by the iret; for ss and esp to be popped, the cs on the stack must be at a different privilege level.

The last 2 bits of the segment selector are the requested privilege level (which becomes the current privilege level once loaded into cs), and the 3rd bit selects the table the selector refers to (the GDT or LDT). Therefore, for your segment selectors to truly be in userspace they should be 0x1b for cs and 0x23 for ss and the general purpose segment selectors. This will change the CPL to 3.

And the order is eip cs eflags esp ss.
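
The selector layout described above can be checked with a few helper functions. The function names are hypothetical, chosen only for this sketch; the bit positions are the standard x86 ones (bits 0-1 privilege level, bit 2 table indicator, bits 3-15 descriptor index).

```c
#include <assert.h>
#include <stdint.h>

/* Bits 0-1: requested privilege level. */
unsigned selector_rpl(uint16_t sel)
{
    return sel & 0x3u;
}

/* Bit 2: table indicator, 0 = GDT, 1 = LDT. */
unsigned selector_uses_ldt(uint16_t sel)
{
    return (sel >> 2) & 0x1u;
}

/* Bits 3-15: index of the descriptor within the table. */
unsigned selector_index(uint16_t sel)
{
    return sel >> 3;
}

/* Build a selector from its three parts. */
uint16_t make_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    assert(index < 8192 && use_ldt <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (use_ldt << 2) | rpl);
}
```

With GDT entries 3 and 4 as the user code and data descriptors, `make_selector(3, 0, 3)` gives 0x1b and `make_selector(4, 0, 3)` gives 0x23, whereas 0x18 and 0x20 decode to the same entries with privilege level 0.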
PinkyNoBrain
Posts: 22
Joined: Mon Oct 29, 2007 10:49 am

Post by PinkyNoBrain »

Good advice guys, I must have missed that bit in the manuals. I have changed cs to 0x1b and ss to 0x23 as suggested. I am still getting the GP fault though; it's very annoying.
AJ
Member
Posts: 2646
Joined: Sun Oct 22, 2006 7:01 am
Location: Devon, UK

Post by AJ »

Hi,

Can we see the Bochs register dump following your GP fault? I think that would help a lot.

Cheers,
Adam