Multi Tasking: User level task return problem[SOLVED]

Post by PinkyNoBrain »

Hello fellow OS developers, I am having a rather persistent and frustrating problem getting multitasking working on my OS. My kernel currently has paged memory management and interrupt handling. I'm intending to do pre-emptive, software-based multitasking, using the stack to store each process's current state. The following is the procedure I go through to start multitasking:

I create a TSS for the system to use, install it into my GDT, and load it with ltr.

I set the PIC to generate interrupts at regular intervals and have a handler that catches them, pushes the registers, saves the stack position, checks whether it should switch task, loads cr3, loads the stack position, pops the registers, then irets.

I create two processes, fill in some generic details for both, and create a page directory for each.

I switch into each process's address space and set up its stack to mimic an interrupt followed by a pushad instruction (i.e. make the stack look like the result of a timer interrupt). The code I use for this is the following:

Code: Select all

//Used when creating a process; replicates the stack that would have resulted from a timer interrupt.
void processInitializeProcess(unsigned long entrypoint,char*name,ProcessStructure* p)
{
	unsigned long oldpd = read_cr3();
	write_cr3(p->PageDirectoryAddress);
	
	//top of the kernel stack; the UL suffix avoids signed overflow in 1022 << 22
	unsigned long* stacksetup = (unsigned long*)((1022UL << 22)+(1022UL << 12));
	//mimic interrupt stack frame
	stacksetup--;
	*stacksetup = 0x23;//ss
	stacksetup--;
	*stacksetup = ((1022UL << 22)+(510UL << 12));//esp
	stacksetup--;
	*stacksetup = 0x0202;//eflags
	stacksetup--;
	*stacksetup = 0x1b;//a user level CS, 0x08 was ring 0, put it back in and the whole thing works
	stacksetup--;
	*stacksetup = entrypoint;//ip

	//Mimics pushad
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;
	stacksetup--;
	*stacksetup = 0;

	//push segments
	stacksetup--;
	*stacksetup=0x23;
	stacksetup--;
	*stacksetup=0x23;
	stacksetup--;
	*stacksetup=0x23;
	stacksetup--;
	*stacksetup=0x23;
	
	p->KStackPointer = (unsigned long) stacksetup;

	write_cr3(oldpd);
	return;
}
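
A side note on the address literals used above: with 4 KiB pages and ordinary two-level 32-bit paging, an expression such as (1022 << 22) + (1022 << 12) is just a page directory index and a page table index packed into a linear address. The sketch below spells that out. The helper names are made up for illustration and are not part of the kernel being discussed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not from the original kernel) for 32-bit,
 * non-PAE paging with 4 KiB pages: 10 bits of page directory index,
 * 10 bits of page table index, 12 bits of page offset. */

/* Linear address of the first byte of the page selected by (pd, pt). */
uint32_t page_address(uint32_t pd_index, uint32_t pt_index)
{
    assert(pd_index < 1024 && pt_index < 1024);
    return (pd_index << 22) | (pt_index << 12);
}

/* Page directory index of a linear address. */
uint32_t pd_index_of(uint32_t linear)
{
    return linear >> 22;
}

/* Page table index of a linear address. */
uint32_t pt_index_of(uint32_t linear)
{
    return (linear >> 12) & 0x3FFu;
}
```

With these, page_address(1022, 510) works out to 0xFF9FE000, which is the ESP value shown in the Bochs dump further down, and the address 0x00900000 reported in CR2 falls in page directory entry 2, page table entry 256.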

I can then start the task running using the following:

Code: Select all

	VGADriver::putStr("Starting Multitasking with Process: ");
	VGADriver::print(CurrentProcess->ProcessName);

	write_cr3(CurrentProcess->PageDirectoryAddress);
	asm ("movl %0, %%esp\n\t" : :"r"(CurrentProcess->KStackPointer));

	asm("pop %ds\n\t");
	asm("pop %es\n\t");
	asm("pop %fs\n\t");
	asm("pop %gs\n\t");

	asm("popa\n\t");

	asm("iret\n\t");
Now, as some of you will have spotted from the comments, this works fine if I am attempting to start a supervisor-level task: I use a ring 0 CS descriptor (0x08) and the task starts perfectly. However, if I try using a user-level code descriptor (0x1b) to start a ring 3 task I get a general protection fault :-(. Fortunately I am using Bochs to develop my OS, so I can give you the logfile output from the crash:

Code: Select all

00024214632i[CPU0 ] >> push ebp : 55
00024214632p[CPU0 ] >>PANIC<< exception(): 3rd (10) exception with no resolution
00024214632i[CPU0 ] CPU is in protected mode (active)
00024214632i[CPU0 ] CS.d_b = 32 bit
00024214632i[CPU0 ] SS.d_b = 32 bit
00024214632i[CPU0 ] | EAX=00000000  EBX=00000000  ECX=00000000  EDX=00000000
00024214632i[CPU0 ] | ESP=ff9fe000  EBP=00000000  ESI=00000000  EDI=00000000
00024214632i[CPU0 ] | IOPL=0 id vip vif ac vm RF nt of df IF tf sf zf af pf cf
00024214632i[CPU0 ] | SEG selector     base    limit G D
00024214632i[CPU0 ] | SEG sltr(index|ti|rpl)     base    limit G D
00024214632i[CPU0 ] |  CS:001b( 0003| 0|  3) 00000000 000fffff 1 1
00024214632i[CPU0 ] |  DS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00024214632i[CPU0 ] |  SS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00024214632i[CPU0 ] |  ES:0023( 0004| 0|  3) 00000000 000fffff 1 1
00024214632i[CPU0 ] |  FS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00024214632i[CPU0 ] |  GS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00024214632i[CPU0 ] | EIP=00900053 (00900053)
00024214632i[CPU0 ] | CR0=0xe0000011 CR1=0 CR2=0x00900000
00024214632i[CPU0 ] | CR3=0x01b9c000 CR4=0x00000000
00024214632i[CPU0 ] >> push ebp : 55
00024214632i[CMOS ] Last time is 1222986078 (Thu Oct  2 23:21:18 2008)
00024214632i[XGUI ] Exit
Since it works for a supervisor-level return but not for user level, I am guessing the problem is due to a privilege violation. I think either I haven't set up the user-level code segment of my GDT correctly, or the fault is being caused by the fact that the stack location is currently mapped with supervisor page protection (but surely this would cause a page fault, not a GP). I am quite happy to provide any more code or additional output you guys need, but I feel this post is long enough already (I don't want to bore people to death :-) ).

Thanks for your time,
Pinky

(P.S. I had a post on this issue over a year ago, but due to the elapsed time and my better understanding of the problem I felt a new post was appropriate. I would like to thank all those who helped me then.)
Last edited by PinkyNoBrain on Sun Oct 05, 2008 11:47 am, edited 1 time in total.

Post by piranha »

Well, the 10 means bad TSS IIRC, so maybe seeing your TSS code would help...
(Sorry about not responding much, I'm tired with lots of work to do...)

-JL

Post by cr2 »

SS shouldn't be 23. Set it to the index of your data segment in your GDT. Might help...

Post by pcmattman »

0x20 OR'd with 0x3 is 0x23 ;)
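
To spell the arithmetic out: a segment selector holds the descriptor index in bits 15..3, the table indicator in bit 2 and the requested privilege level in bits 1..0, so GDT index 4 with RPL 3 is (4 << 3) | 3. A minimal sketch, where make_selector is an illustrative name rather than anything from the code posted in this thread:

```c
#include <assert.h>
#include <stdint.h>

/* Builds an x86 segment selector:
 *   bits 15..3  descriptor index
 *   bit  2      table indicator (0 = GDT, 1 = LDT)
 *   bits 1..0   requested privilege level (RPL)
 * make_selector is a hypothetical helper name. */
uint16_t make_selector(uint16_t index, unsigned use_ldt, unsigned rpl)
{
    assert(index < 8192 && use_ldt <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (use_ldt << 2) | rpl);
}
```

Under that layout, the selectors seen in this thread decode as: 0x08 is GDT index 1 at RPL 0, 0x1b is index 3 at RPL 3, 0x23 is index 4 at RPL 3, and 0x28 is index 5 at RPL 0.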

You may need to map the IDT into the child address space as user privilege, not supervisor. You probably want to map as read-only too :P.

Also, it is actually page faulting:

Code: Select all

CR2=0x00900000
What may be happening is a PF, followed by an un-readable IDT (not privileged enough) which triggers a GPF, which triggers another GPF because it still can't read the IDT. The third GPF is your triple fault.

Post by rdos »

Generally, the stack is not used for multitasking in IA-32. You create a new TSS and jump to it. You probably can make it work this way as well, but the stackframe to return to user ring is different from returning to kernel ring. Part of the ring switch is to save SS:ESP on the new stack (also segment registers if it is a V86 task).

Post by AJ »

rdos wrote:Generally, the stack is not used for multitasking in IA-32.
:shock:
I assume by this that you mean that everyone still uses the old, legacy hardware task switching method. They don't.

@OP: You posted a Bochs dump which reports the 3rd exception, but I think just before that there should be a reason for the GPF (Bochs is generally pretty good about giving GPF reasons). If so, is there any chance of seeing a few more lines further up your output file?

About the PFE: Remember that if you are running ring 3 code in a paging environment, the U/S flag in the PTEs and PDEs needs to be set for your user mode code, stack, heap and any other accessed data. This should not need to extend to the system descriptor tables (GDT/IDT), which should be somewhere in a page with the U/S bit clear.
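
The U/S rule can be sketched as a check on a page directory entry and the page table entry beneath it: ring 3 gets access only when both are present and both have U/S set. The flag values are the standard low bits of 32-bit paging entries; make_entry and user_can_access are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low flag bits shared by 32-bit page directory and page table entries. */
#define PG_PRESENT  0x001u  /* bit 0: entry is valid */
#define PG_WRITABLE 0x002u  /* bit 1: R/W, set = writes allowed */
#define PG_USER     0x004u  /* bit 2: U/S, set = ring 3 may access */

/* Hypothetical helper: combine a 4 KiB-aligned frame address with flags. */
uint32_t make_entry(uint32_t frame_addr, uint32_t flags)
{
    assert((frame_addr & 0xFFFu) == 0);  /* frame must be page aligned */
    assert((flags & ~0xFFFu) == 0);      /* flags live in the low 12 bits */
    return frame_addr | flags;
}

/* Ring 3 can touch a page only if BOTH levels are present and BOTH
 * have U/S set; one supervisor-only level is enough to fault. */
bool user_can_access(uint32_t pde, uint32_t pte)
{
    const uint32_t need = PG_PRESENT | PG_USER;
    return (pde & need) == need && (pte & need) == need;
}
```

A common mistake this catches is setting U/S in the page table entry while leaving the page directory entry supervisor-only (or the other way round).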

Cheers,
Adam

Post by rdos »

AJ wrote: :shock:
I assume by this that you mean that everyone still uses the old, legacy hardware task switching method. They don't.
They don't? :lol:

Why wouldn't you use the "old legacy" task switching method? Especially if there is V86-code in the system?

Also given that the TSS contains the ring 0, 1, 2 stacks and the CR3.

Will you tell me next that you don't support multitasking in kernel either, because if you do, you definitely will need a separate ring 0 stack for all user-mode threads.

Post by AJ »

rdos wrote:They don't? :lol:
Nope. I (like many others) load a single TSS for each CPU core and use software task switching which is supported in both protected mode and long mode. Hardware task switching is not supported in long mode. Software task switching also gives you the option to save only the state information that needs saving. Even in hardware task switching, you can't purely rely on a TSS - think about MMX and SSE registers.

Hardware task switching also has a maximum theoretical limit of 8187 tasks (after you have ring 0 and 3 code and data segment descriptors). I don't intend to have this many tasks running at a time, but why even apply a theoretical maximum when there doesn't need to be one? You can of course get around this maximum by the slow process of copying your TSS manually for each task switch.
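
The 8187 figure can be reproduced from the size of the GDT: the GDTR limit is 16 bits, so the table holds at most 65536 / 8 = 8192 descriptors, and the null descriptor plus four code/data descriptors leave 8187 for TSSs. A small sketch of that arithmetic, with a made-up function name:

```c
#include <assert.h>

#define GDT_MAX_BYTES   65536u  /* the GDTR limit field is 16 bits wide */
#define DESCRIPTOR_SIZE 8u      /* each GDT descriptor is 8 bytes */

/* Descriptors left over for TSSs after the mandatory null entry and
 * `reserved` code/data descriptors. Hypothetical helper name. */
unsigned max_tss_descriptors(unsigned reserved)
{
    unsigned total = GDT_MAX_BYTES / DESCRIPTOR_SIZE;  /* 8192 */
    assert(reserved + 1 <= total);
    return total - 1 - reserved;
}
```

Calling max_tss_descriptors(4) covers the ring 0 and ring 3 code and data segments mentioned above.
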
rdos wrote:Why wouldn't you use the "old legacy" task switching method? Especially if there is V86-code in the system?
Firstly, I do not use v86 in my kernel - anything that needs to run in real mode runs in real mode at boot time. Secondly, v86 is not supported in 64 bit modes anyway. [EDIT: Note that it is still quite possible to support v86 mode with a single TSS (software task switching) in protected mode.]
rdos wrote:Also given that the TSS contains the ring 0, 1, 2 stacks and the CR3.
I didn't think anyone used rings 1 and 2 any more. Segmentation is a legacy feature and paging only recognises User and Supervisor tasks anyway. As for CR3, that's stored in my Process Control Block.
rdos wrote:Will you tell me next that you don't support multitasking in kernel either, because if you do, you definitely will need a separate ring 0 stack for all user-mode threads.
Quite right, but you do not even need to update SS0 and ESP0 if you set up your paging carefully.

I am aware that some people still continue to use hardware task switching, but it is certainly incorrect to say "Generally, the stack is not used for multitasking in IA-32." .

Cheers,
Adam

Post by rdos »

AJ wrote: Nope. I (like many others) load a single TSS for each CPU core and use software task switching which is supported in both protected mode and long mode. Hardware task switching is not supported in long mode. Software task switching also gives you the option to save only the state information that needs saving. Even in hardware task switching, you can't purely rely on a TSS - think about MMX and SSE registers.

Hardware task switching also has a maximum theoretical limit of 8187 tasks (after you have ring 0 and 3 code and data segment descriptors). I don't intend to have this many tasks running at a time, but why even apply a theoretical maximum when there doesn't need to be one? You can of course get around this maximum by the slow process of copying your TSS manually for each task switch.
Not a problem, but I suppose this will depend on the intention of the OS. I decided early on to use assembly in the kernel, and segmentation for protection. I added paging later because I also decided to support flat memory model in applications. Nowadays, all application code uses flat memory model, but the kernel doesn't. Each device driver runs in its own code-segment, and usually has its own data segment as well. Because of this design segment registers always needs to be saved in the task-switch, so I think the best method is to use the hardware task-switch method.

I also do not intend to (ever) support the IA-64 model. Basically because I do not target large servers / databases that needs enormous amounts of memory and address space.
AJ wrote: Secondly, v86 is not supported in 64 bit modes anyway.
The 64-bit modes are generally incompatible with existing x86 assembly-code anyway, so I see no reason to bother with it. AMD did a much better job on the 64-bit extension than Intel did, but it still breaks a lot of the existing code-base, which is why I don't intend to support it.
AJ wrote: I am aware that some people still continue to use hardware task switching, but it is certainly incorrect to say "Generally, the stack is not used for multitasking in IA-32." .
OK, point taken. :D

Post by Brendan »

Hi,
AJ wrote:Hardware task switching also has a maximum theoretical limit of 8187 tasks (after you have ring 0 and 3 code and data segment descriptors). I don't intend to have this many tasks running at a time, but why even apply a theoretical maximum when there doesn't need to be one? You can of course get around this maximum by the slow process of copying your TSS manually for each task switch.
It's possible to have 2 TSS descriptors (per CPU) in the GDT and dynamically change the base address in the TSS descriptor that isn't being used before each task switch. Basically the TSS descriptor in the GDT is a pointer that can point to any number of TSSs (as long as it points to a TSS that's in memory before the task switch occurs - you can send TSSs to swap when they're not in use). With one GDT shared by all CPUs this does involve a "8187/2 CPUs" limit, but you can use more than one GDT to get around that (e.g. have 20 GDTs where 4000 CPUs share each GDT). :D
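
A rough sketch of what "change the base address in the TSS descriptor" involves: the 32-bit base is scattered over three fields of the 8-byte descriptor, so retargeting it means rewriting those three fields. The struct and function names below are assumptions for illustration, and the sketch only touches a descriptor that is marked available (type 9), not a busy one.

```c
#include <assert.h>
#include <stdint.h>

/* 8-byte GDT descriptor, in the order the CPU expects. */
struct gdt_entry {
    uint16_t limit_low;    /* limit bits 15..0 */
    uint16_t base_low;     /* base bits 15..0 */
    uint8_t  base_mid;     /* base bits 23..16 */
    uint8_t  access;       /* 0x89 = present, DPL 0, available 32-bit TSS */
    uint8_t  limit_flags;  /* limit bits 19..16 (low nibble), flags (high nibble) */
    uint8_t  base_high;    /* base bits 31..24 */
};

/* Hypothetical helper: point an idle TSS descriptor at a different TSS. */
void tss_descriptor_set_base(struct gdt_entry *d, uint32_t base)
{
    /* Assumption of this sketch: only an available TSS descriptor
     * (type 9) is retargeted. */
    assert((d->access & 0x0F) == 0x09);
    d->base_low  = (uint16_t)(base & 0xFFFFu);
    d->base_mid  = (uint8_t)((base >> 16) & 0xFFu);
    d->base_high = (uint8_t)((base >> 24) & 0xFFu);
}

/* Reassemble the base address from its three fields. */
uint32_t tss_descriptor_base(const struct gdt_entry *d)
{
    return (uint32_t)d->base_low
         | ((uint32_t)d->base_mid << 16)
         | ((uint32_t)d->base_high << 24);
}
```
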
rdos wrote:I also do not intend to (ever) support the IA-64 model. Basically because I do not target large servers / databases that needs enormous amounts of memory and address space.
As an assembly programmer you'd understand that (currently) the biggest advantage for 64-bit code is the additional registers and additional register width. Basically "64-bit" is about performance, not about addressable space.
rdos wrote:The 64-bit modes are generally incompatible with existing x86 assembly-code anyway, so I see no reason to bother with it. AMD did a much better job on the 64-bit extension than Intel did, but it still breaks a lot of the existing code-base, which is why I don't intend to support it.
In long mode, a 64-bit kernel can happily execute code designed for 16-bit and 32-bit protected mode, including segmentation. In your case it's possible that there's inadequate isolation between the kernel and CPL=0 device drivers, so you'd need some ugly thunking (or do what Microsoft did and just use 64-bit device drivers to avoid the need to bother with thunking). None of this should affect applications though (CPL=3 processes shouldn't need to care if the kernel is 32-bit or 64-bit).

Note: there's 2 problems with BIOS/ROM functions - they're 16-bit and they're crap. V86 only solves one of those problems.


Cheers,

Brendan

Post by PinkyNoBrain »

Hey guys, thanks for the speedy response. I'll attempt to deal with the questions raised in order:

So firstly, piranha requested that I post my TSS code. My GDT code is based around Bran's Kernel Development tutorial, which I have extended to use TSSes myself. Therefore it is highly possible I have made a mistake in it. The following is my code to add a TSS to my GDT:

Code: Select all

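//extra utility function that acts as a wrapper around
//gdtSetGate specifically for creating a TSS descriptor

tss_load(0);

void tss_load(unsigned long cpunum)
{
	//unsigned long base = 0x10<<4;
	unsigned long base = (unsigned long)&SysTSS;
	int size = 105; //descriptor limit; a 32-bit TSS is 104 bytes, so 103 would be enough
	//0x89 = present, DPL 0, type 9 (available 32-bit TSS)
	gdtSetGate(5+cpunum,base,size,0x09|0x80,0);
}

	//load the TSS selector into the task register
	//0x28 = GDT index 5, RPL 0 (only matches cpunum 0)
	unsigned long r = 0x28;
   	asm("ltr %0"::"m"(r));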


Well spotted, pcmattman and the others who mentioned page permissions: CR2 is indeed set, which indicates a page fault. It is a shame I can't see the last value on the stack, because that would allow us to determine the type of fault. The fault address does tell us some interesting stuff though. My kernel itself is mapped into the bottom 8 MB of memory, so everything up to address 0x800000 is in the kernel proper. The stack is mapped to very high addresses, so this fault seems to be occurring in normal heap space. Since it works for system-level code, I am guessing it must be a permissions issue. I will check all my page permissions and get back to you. I am, however, confused as to why my page fault handler isn't getting the interrupt if it's originally a page fault problem :-(
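
For reference, the "last value on the stack" for a page fault is the error code the CPU pushes, and its low three bits are what identify the type of fault. A minimal decoding sketch follows; struct pf_info and decode_pf_error are hypothetical names, and only the three low bits are interpreted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Meaning of the low three bits of the page fault error code. */
struct pf_info {
    bool protection;  /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;       /* bit 1: 1 = write access, 0 = read access */
    bool user;        /* bit 2: 1 = access made in user mode (CPL 3) */
};

/* Hypothetical helper: split the error code into its three flags. */
void decode_pf_error(uint32_t error_code, struct pf_info *out)
{
    assert(out != NULL);
    out->protection = (error_code & 0x1u) != 0;
    out->write      = (error_code & 0x2u) != 0;
    out->user       = (error_code & 0x4u) != 0;
}
```

An error code of 0x5, for example, would mean a user-mode read of a page that is present but not accessible from ring 3, while 0x4 would mean a user-mode read of a page that is not mapped at all.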


I create a user-level data segment as entry number 4 in my GDT using the following code (which is based on Bran's Kernel Development tutorial):

Code: Select all

	//create user data segment
	gdtSetGate(4, 0, 0xFFFFFFFF,0xF2, 0xCF);
I believe that 0x23 is therefore correct for selecting this segment (index 4) with user-level permissions.
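
As a cross-check of the 0xF2 access byte passed to gdtSetGate above, the sketch below breaks an access byte into its fields. The names are made up for illustration; the executable and rw fields are only meaningful when the descriptor is a code/data descriptor (bit 4 set).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of a GDT descriptor access byte. */
struct access_bits {
    bool     present;       /* bit 7 */
    unsigned dpl;           /* bits 6..5: descriptor privilege level */
    bool     code_or_data;  /* bit 4: 1 = code/data, 0 = system (e.g. TSS) */
    bool     executable;    /* bit 3: 1 = code segment (code/data only) */
    bool     rw;            /* bit 1: writable data / readable code */
};

/* Hypothetical helper: decode an access byte into its fields. */
void decode_access(uint8_t access, struct access_bits *out)
{
    assert(out != NULL);
    out->present      = (access & 0x80) != 0;
    out->dpl          = (access >> 5) & 0x3;
    out->code_or_data = (access & 0x10) != 0;
    out->executable   = (access & 0x08) != 0;
    out->rw           = (access & 0x02) != 0;
}
```

Decoded this way, 0xF2 is a present, DPL 3, writable data segment, which is consistent with using it through selector 0x23.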

AJ asked to see more of the Bochs dump; here are the lines of log text just above those in my last post:

Code: Select all

00005955654i[BIOS ] *** int 15h function AX=00C0, BX=0000 not yet supported!
00023288600e[CPU0 ] interrupt(): SS selector null
00023288600e[CPU0 ] interrupt(): SS selector null
00023288600e[CPU0 ] interrupt(): SS selector null
00023288600i[CPU0 ] CPU is in protected mode (active)
00023288600i[CPU0 ] CS.d_b = 32 bit
00023288600i[CPU0 ] SS.d_b = 32 bit
00023288600i[CPU0 ] | EAX=00000000  EBX=00000000  ECX=00000000  EDX=00000000
00023288600i[CPU0 ] | ESP=ff9fe000  EBP=00000000  ESI=00000000  EDI=00000000
00023288600i[CPU0 ] | IOPL=0 id vip vif ac vm RF nt of df IF tf sf zf af pf cf
00023288600i[CPU0 ] | SEG selector     base    limit G D
00023288600i[CPU0 ] | SEG sltr(index|ti|rpl)     base    limit G D
00023288600i[CPU0 ] |  CS:001b( 0003| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] |  DS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] |  SS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] |  ES:0023( 0004| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] |  FS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] |  GS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] | EIP=00900053 (00900053)
00023288600i[CPU0 ] | CR0=0xe0000011 CR1=0 CR2=0x00900000
00023288600i[CPU0 ] | CR3=0x01b9c000 CR4=0x00000000
00023288600i[CPU0 ] >> push ebp : 55
00023288600p[CPU0 ] >>PANIC<< exception(): 3rd (10) exception with no resolution
00023288600i[CPU0 ] CPU is in protected mode (active)
00023288600i[CPU0 ] CS.d_b = 32 bit
00023288600i[CPU0 ] SS.d_b = 32 bit
00023288600i[CPU0 ] | EAX=00000000  EBX=00000000  ECX=00000000  EDX=00000000
00023288600i[CPU0 ] | ESP=ff9fe000  EBP=00000000  ESI=00000000  EDI=00000000
00023288600i[CPU0 ] | IOPL=0 id vip vif ac vm RF nt of df IF tf sf zf af pf cf
00023288600i[CPU0 ] | SEG selector     base    limit G D
00023288600i[CPU0 ] | SEG sltr(index|ti|rpl)     base    limit G D
00023288600i[CPU0 ] |  CS:001b( 0003| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] |  DS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] |  SS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] |  ES:0023( 0004| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] |  FS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] |  GS:0023( 0004| 0|  3) 00000000 000fffff 1 1
00023288600i[CPU0 ] | EIP=00900053 (00900053)
00023288600i[CPU0 ] | CR0=0xe0000011 CR1=0 CR2=0x00900000
00023288600i[CPU0 ] | CR3=0x01b9c000 CR4=0x00000000
00023288600i[CPU0 ] >> push ebp : 55
I hope this sheds more light on the situation :-)

P.S. Is it possible that the value in CR2 is left over from a previous page fault, or does it definitely mean the page fault occurred in the current fault sequence?

Post by AJ »

Hi,

Code: Select all

 interrupt(): SS selector null
I would say that definitely sheds more light :)
What is the value of SS0 in your TSS (and are your TSS fields where you think they are?).

Cheers,
Adam

Post by PinkyNoBrain »

Hey AJ, sorry, I should have said before, but the code just before the task-starting code in my original post above is the following:

Code: Select all

	SysTSS[get_ProcessorNumber()].esp0 = CurrentProcess->KStackPointer;
	SysTSS[get_ProcessorNumber()].cr3  = CurrentProcess->PageDirectoryAddress;

	write_cr3(CurrentProcess->PageDirectoryAddress);
	asm ("movl %0, %%esp\n\t" : :"r"(CurrentProcess->KStackPointer));
get_ProcessorNumber() simply returns 0 for the moment, so I believe that should set the current CPU's TSS to have the process's kernel stack pointer and page directory info. I hope this is OK.

thanks,
Chris

Post by AJ »

Hi,

Do you have anything like:

Code: Select all

SysTSS[get_ProcessorNumber()].ss0 = 0x10;
?

Cheers,
Adam

Post by PinkyNoBrain »

Hey,

No, I don't have anything like that. I was under the impression only esp0 and cr3 needed to be set?

Thanks,
Chris
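
On that last question: when an interrupt arrives while the CPU is running ring 3 code, it loads both SS and ESP for the kernel stack from the SS0 and ESP0 fields of the current TSS, so a zero SS0 is consistent with the "SS selector null" lines in the Bochs log. The cr3 field of the TSS, by contrast, is only consulted on a hardware task switch. Below is a minimal sketch of the TSS layout and of setting both ring 0 stack fields. It assumes the kernel data selector is 0x10 as in AJ's snippet; struct tss32 and tss_set_kernel_stack are hypothetical names, not the SysTSS type used in the posts above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit TSS, 104 bytes. Selector fields use only their low 16 bits. */
struct tss32 {
    uint32_t prev_task;
    uint32_t esp0;   /* ring 0 stack pointer loaded on a ring 3 -> 0 switch */
    uint32_t ss0;    /* ring 0 stack segment loaded on a ring 3 -> 0 switch */
    uint32_t esp1, ss1;
    uint32_t esp2, ss2;
    uint32_t cr3;    /* only used by hardware task switching */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt;
    uint16_t trap;
    uint16_t iomap_base;
};

/* Hypothetical helper: set both halves of the ring 0 stack. */
void tss_set_kernel_stack(struct tss32 *tss, uint16_t kernel_ss,
                          uint32_t kernel_stack_top)
{
    assert(tss != NULL);
    assert((kernel_ss & ~0x3u) != 0);  /* must not be a null selector */
    assert((kernel_ss & 0x3u) == 0);   /* ring 0 stack needs RPL 0 */
    tss->ss0  = kernel_ss;
    tss->esp0 = kernel_stack_top;
}
```

In terms of the code posted above, this would correspond to adding the ss0 assignment AJ suggests next to the existing esp0 assignment.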