Weird problem in my scheduler

Jabus
Member
Posts: 39
Joined: Sun Jan 07, 2007 7:54 am

Post by Jabus »

I've been working on software multitasking for some time now and have finally finished it. My scheduler lets each task run for as many timer ticks as its priority, e.g. a task with priority 10 runs for 10 timer ticks. I then added the ability for a task to call REGISTER_ATOMIC() before starting an atomic operation; the scheduler doesn't increment the task's run time until UNREGISTER_ATOMIC() is called. I also added a pause function that does exactly the same thing: the task's run time only starts incrementing again after the unpause function has been called.

All of the above worked fine in BOCHS. On real hardware it also worked fine as long as I didn't call the register-atomic or pause functions and leave the scheduler in that state. However, as soon as I called the pause function on the main task and left it paused, everything went wrong: the machine behaved just as it did in BOCHS for about half a second and then triple faulted.

I have been stuck on this for about 2 weeks now. I would be very grateful if you could have a look at my code and tell me what is happening.

Code:

void REGISTER_ATOMIC()
{
    SCHEDULE_MODIFIER=SCHEDULE_MODIFIER | ATOMIC;
}
void UNREGISTER_ATOMIC()
{
    SCHEDULE_MODIFIER=SCHEDULE_MODIFIER ^ ATOMIC;
}
void PAUSE_ON_ME()
{
    SCHEDULE_MODIFIER=SCHEDULE_MODIFIER | PAUSE_ON;
}
void UNPAUSE()
{
    SCHEDULE_MODIFIER=SCHEDULE_MODIFIER ^ PAUSE_ON;
}

void schedule()
{
    if(proc_on==0)
    {
        t=&(tList[proc_on]);
        proc_on++;
    }
    else if(proc_num==0)
    {
        //PANIC!!!!!!!
    }
    else if(proc_num==1)
    {
        //Do relatively little
    }
    else if(proc_num>1)
    {


        //printStr("c");
        if(SCHEDULE_MODIFIER==NORMAL)
        {
            t->task_timeRun++;
            if(t->task_timeRun>t->task_priority)
            {

                t->task_timeRun=0;
                proc_run++;
                //printStr("f");
                if(proc_run>=proc_num)
                {
                    //printStr("e");
                    proc_run=0;
                }
                t=&(tList[proc_run]);
            }
        }
        else if(SCHEDULE_MODIFIER!=NORMAL)
        {
            //do stuff
        }
    }


}
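
A side note on the flag helpers above, offered as a sketch rather than a diagnosis of the triple fault: clearing a bit with XOR toggles it, so calling UNREGISTER_ATOMIC() or UNPAUSE() when the bit is already clear would set it again. Clearing with AND-NOT does not have that property. The helper names below are hypothetical; only the flag values are taken from the post.

```c
#include <stdint.h>

/* Flag values copied from the post; the helper names are hypothetical. */
#define NORMAL   0u
#define ATOMIC   1u
#define PAUSE_ON 8u

/* Set a flag: OR is idempotent, so setting twice is harmless. */
static inline uint8_t flag_set(uint8_t modifier, uint8_t flag)
{
    return (uint8_t)(modifier | flag);
}

/* Clear with XOR, as in the post: this toggles the bit. */
static inline uint8_t flag_clear_xor(uint8_t modifier, uint8_t flag)
{
    return (uint8_t)(modifier ^ flag);
}

/* Clear with AND-NOT: the bit ends up 0 whatever it was before. */
static inline uint8_t flag_clear_mask(uint8_t modifier, uint8_t flag)
{
    return (uint8_t)(modifier & (uint8_t)~flag);
}
```

With balanced set/clear calls the two clearing variants behave the same; they only differ when a clear is issued without a matching set.
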
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Post by JamesM »

To be honest, there is almost certainly going to be no difference between the way Bochs and a real machine execute that code. It has no stack fiddling, no real bit twiddling and no asm!

I'd be more interested in seeing your task-switching code. That'd be FAR more likely to cause a triple fault than a simple if statement. (Notice that the code you posted is just updating kernel-internal variables, nothing that can affect the operation of the computer at all.)

And as a helpful hint, one of the major differences between Bochs and a real machine is that Bochs zeroes all its memory. Check you're not relying on that. And if you're using your own bootloader, check you're zeroing the .bss section.
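
For readers wondering what zeroing .bss looks like in practice, here is a minimal sketch. It assumes the start and end addresses of the section are available, typically from symbols defined in the linker script; those symbol names vary between projects, so none are assumed here and the function name zero_region is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Zero every byte in [start, end). In a kernel, start and end would be
 * the addresses of linker-script symbols that mark the .bss section.
 */
static void zero_region(uint8_t *start, uint8_t *end)
{
    for (uint8_t *p = start; p < end; ++p) {
        *p = 0;
    }
}
```

This has to run before any code reads an uninitialised global, so it usually sits at the very start of the kernel entry path.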

JamesM
pcmattman
Member
Posts: 2566
Joined: Sun Jan 14, 2007 9:15 pm
Libera.chat IRC: miselin
Location: Sydney, Australia (I come from a land down under!)

Post by pcmattman »

JamesM wrote: To be honest, there is almost certainly going to be no difference between the way Bochs and a real machine execute that code. It has no stack fiddling, no real bit twiddling and no asm!
Actually, Bochs often runs buggy code as though it were correct. Its emulation is not dependable. This is why it is suggested to test in multiple emulators (I work with QEMU, Bochs, and Virtual PC) and on real hardware.

Something like a "memset( <var>, 0, <size> );" often helps.

Post by JamesM »

@pcmattman:

But the code he posted is just setting some variables! There's nothing there that could cause a fault: there is only one indirection, through the variable t, which I assume is non-NULL. If t is undefined then Bochs could differ from a real machine. But if not, there is nothing there that should differ.
And Bochs tends to (for me, anyway) give me an illegal opcode fault when it hits an illegal instruction. If that is not handled, it panics.

@OP:

I would suggest looking elsewhere in your code: the error could be there and only appear to be here.

Post by Jabus »

The task switching itself is fine. However, when I call either the register_atomic or the pause function and leave the schedule_modifier variable altered, the computer triple faults after a seemingly random number of interrupts (the number differs each time). It doesn't triple fault in BOCHS.

Here is my task-switching code, if you would care to look at it; the whole of my scheduler follows it.

Code:

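extern _t
extern _schedule
extern _sys_tss
extern _kStackEnd
_irq0:
    cli
    pushad                  ; save general-purpose registers
    push gs                 ; save segment registers
    push fs
    push es
    push ds
    mov eax,0x10            ; load the kernel data selector
    mov es,eax
    mov fs,eax
    mov gs,eax
    mov ds,eax
    mov eax,[_t]            ; eax = pointer to the current task_struct
    mov [eax],esp           ; store esp at offset 0 of that struct
    xor ebx,ebx
    mov ebx,[_kStackEnd]    ; ebx = value of the kStackEnd pointer
    mov eax,[ebx]           ; eax = dword stored at that address
    mov esp,eax             ; use that dword as the new stack pointer
    call _schedule          ; may change t
    mov eax,[_t]            ; eax = pointer to the task chosen to run
    mov esp,[eax]           ; load its saved esp
    mov ebx,[eax+8]         ; copy the dword at offset 8 of the struct ...
    mov [_sys_tss+4],ebx    ; ... into the esp0 field of the TSS
    mov al,0x20             ; send end-of-interrupt to the master PIC
    out 0x20,al
    pop ds
    pop gs
    pop fs
    pop es
    popad
    iretd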
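
One thing in the stub above that can be checked on paper is the order of the segment pushes and pops. The following is a minimal C model of just that part; the struct and function names are hypothetical. In the posted code every selector is 0x10, so the mismatch has no visible effect there, and this is not offered as the cause of the triple fault.

```c
#include <stdint.h>

/* Hypothetical model of the four segment registers. */
struct seg_regs {
    uint16_t ds, es, fs, gs;
};

/*
 * Push gs, fs, es, ds (the order used on entry to _irq0), then pop in
 * the order used before iretd: ds, gs, fs, es. Returns the registers
 * as they would look after the pops.
 */
static struct seg_regs push_then_pop_as_posted(struct seg_regs in)
{
    uint16_t stack[4];
    int sp = 0;

    stack[sp++] = in.gs;   /* push gs */
    stack[sp++] = in.fs;   /* push fs */
    stack[sp++] = in.es;   /* push es */
    stack[sp++] = in.ds;   /* push ds */

    struct seg_regs out;
    out.ds = stack[--sp];  /* pop ds: receives the saved ds */
    out.gs = stack[--sp];  /* pop gs: receives the saved es */
    out.fs = stack[--sp];  /* pop fs: receives the saved fs */
    out.es = stack[--sp];  /* pop es: receives the saved gs */
    return out;
}
```

Feeding in four distinct values shows es and gs trading places, which would start to matter once those two registers hold different selectors.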

Code:

#include <JBOS.h>
#define ATOMIC 1
#define NORMAL 0
#define SHAREDMEM_REQUEST 2
#define RESERVED 4
#define PAUSE_ON 8
int mins;
int secs;
int proc_num=0;
int proc_on=0;
int proc_run=0;
task_refer *on;
task_refer *starter;
volatile task_refer *curr;
volatile task_struct *t;
task_struct *tList;
task_struct *p;
unsigned char SCHEDULE_MODIFIER;
void ltr(unsigned short selector)
{
  asm ("ltr %0": :"r" (selector));
}
unsigned int *kStackEnd;
extern tss_t sys_tss;

void install_multi()
{
    int divisor = 1193180 / 100;      
    outportb(0x43, 0x36);             
    outportb(0x40, divisor & 0xFF);   
    outportb(0x40, divisor >> 8);    

    kStackEnd=(unsigned int *)malloc(sizeof(unsigned int)*1024);
    sys_tss.ss0=0x10;
    kStackEnd=&(kStackEnd[1023]);
    sys_tss.esp0=(unsigned int)kStackEnd;
    t=(task_struct *)malloc(sizeof(task_struct));
    tList=(task_struct *)malloc((5*sizeof(task_struct)));
    ltr(0x18);
    SCHEDULE_MODIFIER=0;
}
void createKTask(unsigned int entryPoint) //must be edited after I can prove this is software multitasking. It really should be createKTask
{
    unsigned int *stacksetup;//the KERNEL stack for this process
    unsigned int *ustack;//the USER stack for this process (not used at this point in time due to some cpl and dpl complications)
    stacksetup=(unsigned int *)malloc(sizeof(unsigned int)*512);//should be enough memory
    stacksetup=&(stacksetup[511]);//points to top of stack STACK GOES DOWN (could place user ss here)
    tList[proc_num].kStackTop=(unsigned int)stacksetup;
    stacksetup--;//could place user esp here)
    *stacksetup--=0x0202;//eflags
    *stacksetup--=0x08;//cs
    *stacksetup--=(unsigned int )entryPoint; //Process entry
    *stacksetup--=0;    //ebp
    *stacksetup--=0;    //esp
    *stacksetup--=0;    //edi
    *stacksetup--=0;    //esi
    *stacksetup--=0;    //edx
    *stacksetup--=0;    //ecx
    *stacksetup--=0;    //ebx
    *stacksetup--=0;    //eax
    *stacksetup--=0x10; //ds
    *stacksetup--=0x10; //es
    *stacksetup--=0x10; //fs
    *stacksetup=  0x10; //gs
    ustack=(unsigned int *)malloc(sizeof(unsigned int)*512);//allocate the user stack
    ustack=&(ustack[511]);//set ustack to the TOP of the stack. stack goes DOWN
    tList[proc_num].kern_esp=(unsigned int)stacksetup;
    tList[proc_num].task_esp=(unsigned int)ustack;
    tList[proc_num].task_id=proc_num;
    tList[proc_num].task_sleep=0;
    tList[proc_num].task_priority=5;
    tList[proc_num].task_timeRun=0;
    tList[proc_num].memTest=0;
    tList[proc_num].result=0;
    tList[proc_num].equals=0;
    tList[proc_num].dataAddress=0;
    proc_num++;
}

void REGISTER_ATOMIC()
{
    SCHEDULE_MODIFIER=SCHEDULE_MODIFIER | ATOMIC;
}
void UNREGISTER_ATOMIC()
{
    SCHEDULE_MODIFIER=SCHEDULE_MODIFIER ^ ATOMIC;
}
void PAUSE_ON_ME()
{
    SCHEDULE_MODIFIER=SCHEDULE_MODIFIER | PAUSE_ON;
}
void UNPAUSE()
{
    SCHEDULE_MODIFIER=SCHEDULE_MODIFIER ^ PAUSE_ON;
}
volatile unsigned int run;
volatile unsigned int runFor;
void schedule()
{
    if(proc_on==0)
    {
        t=&(tList[proc_on]);
        proc_on++;
    }
    else if(proc_num==0)
    {
        //PANIC!!!!!!!
    }
    else if(proc_num==1)
    {
        //Do relatively little
    }
    else if(proc_num>1)
    {


        //printStr("c");
        if(SCHEDULE_MODIFIER==NORMAL)
        {
            t->task_timeRun++;
            if(t->task_timeRun>t->task_priority)
            {

                t->task_timeRun=0;
                proc_run++;
                //printStr("f");
                if(proc_run>=proc_num)
                {
                    //printStr("e");
                    proc_run=0;
                }
                t=&(tList[proc_run]);
            }
        }
        else if(SCHEDULE_MODIFIER!=NORMAL)
        {
            printStr("a");
            //do stuff
        }
    }


}
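
As a companion to createKTask above, here is a sketch of the same 15-dword initial frame written as a struct, lowest address first, in the order the _irq0 stub consumes it: four segment pops, then popad (which restores EDI, ESI, EBP, skips the saved ESP, then EBX, EDX, ECX, EAX), then the three values iretd pops when no privilege change occurs. The struct and function names are hypothetical. The register comments in createKTask list the slots in a different order, which is harmless there because every one of them is written as 0.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout of the initial kernel stack for a ring-0 task,
 * lowest address first, i.e. in the order the stub pops it.
 */
struct initial_frame {
    uint32_t seg[4];                    /* four segment selectors */
    uint32_t edi, esi, ebp, esp_ignored;/* popad, first half */
    uint32_t ebx, edx, ecx, eax;        /* popad, second half */
    uint32_t eip, cs, eflags;           /* iretd, same privilege level */
};

/* Fill the frame with the same values createKTask uses. */
static void init_frame(struct initial_frame *f, uint32_t entry_point)
{
    for (size_t i = 0; i < 4; ++i) {
        f->seg[i] = 0x10;               /* kernel data selector */
    }
    f->edi = f->esi = f->ebp = f->esp_ignored = 0;
    f->ebx = f->edx = f->ecx = f->eax = 0;
    f->eip = entry_point;
    f->cs = 0x08;                       /* kernel code selector */
    f->eflags = 0x0202;                 /* interrupts enabled */
}
```

The saved stack pointer for the task would then be the address of the struct itself, since that is where the first pop reads from.
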
davidv1992
Member
Posts: 223
Joined: Thu Jul 05, 2007 8:58 am

Post by davidv1992 »

Since it only happens when the task is paused, I would think it is the piece of code executed in that state that actually causes the problem.
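
One way to inspect the paused path in isolation is to pull the tick accounting out into a function with no side effects beyond the task table. The sketch below is an assumption-laden rewrite, not the original scheduler: the struct, its field names and schedule_tick are hypothetical, and the comparison is copied from the post (with `>` a task is switched out on the tick after its counter reaches its priority).

```c
#include <stddef.h>

#define NORMAL 0u   /* same value as in the post */

/* Hypothetical, trimmed-down task record: only what the accounting needs. */
struct task_slice {
    unsigned int priority;  /* ticks granted per turn */
    unsigned int time_run;  /* ticks used in the current turn */
};

/*
 * Advance the accounting by one timer tick and return the index of the
 * task that should run next. When modifier != NORMAL (atomic or paused)
 * nothing is touched and the current task keeps running.
 */
static size_t schedule_tick(struct task_slice *tasks, size_t count,
                            size_t current, unsigned int modifier)
{
    if (count < 2 || modifier != NORMAL) {
        return current;
    }
    tasks[current].time_run++;
    if (tasks[current].time_run > tasks[current].priority) {
        tasks[current].time_run = 0;
        current = (current + 1) % count;
    }
    return current;
}
```

Written this way, the paused branch does no work at all inside the interrupt handler, which makes it easier to rule in or out as the source of the fault.
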
jerryleecooper
Member
Posts: 233
Joined: Mon Aug 06, 2007 6:32 pm
Location: Canada

Post by jerryleecooper »

The user esp is not optional.

Post by pcmattman »

jerryleecooper wrote: The user esp is not optional.
It is if you're in ring0. This code is trying to make a ring0 task:
"the USER stack for this process (not used at this point in time due to some cpl and dpl complications)"

Post by jerryleecooper »

I'm a bit perplexed. I thought

Code:

stacksetup--;//could place user esp here)
was an error because I had nearly the same thing. I changed it to the value of my user stack and now my own scheduler works on all emulators: Virtual PC, Bochs, QEMU. I just read somewhere that iret only pops 3 registers? How is that possible? I was sure I had found the bug.

Post by JamesM »

IRET will always pop an ESP value off the stack. How can it possibly not? The idea is that after an IRET the state of the machine should be pretty much identical to what it was before. It doesn't matter whether a ring switch occurs; it always stores then pops the stack pointer.
bluecode
Member
Posts: 202
Joined: Wed Nov 17, 2004 12:00 am
Location: Germany

Post by bluecode »

JamesM wrote: IRET will always pop an ESP value off the stack. [...]. Doesn't matter if a ring switch occurs, it always stores then pops the stack pointer.
That is wrong. Look at the Intel manuals (5.12.1 Exception- or Interrupt-Handler Procedures). SS/ESP are only pushed when the handler procedure is going to be executed at a numerically lower privilege level.
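
The rule can be written down as a small predicate. This is a sketch for 32-bit protected mode only, ignoring virtual-8086 and nested-task returns; the function names are hypothetical, and 0x1B in the usage note is just an example of a selector whose low two bits are 3.

```c
#include <stdbool.h>
#include <stdint.h>

/* The requested privilege level is the low two bits of a selector. */
static unsigned int selector_rpl(uint16_t selector)
{
    return selector & 3u;
}

/*
 * iretd pops ESP and SS only when the CS it pops names a less
 * privileged (numerically higher) level than the handler runs at.
 */
static bool iret_pops_esp_ss(unsigned int handler_cpl, uint16_t return_cs)
{
    return selector_rpl(return_cs) > handler_cpl;
}

/* Number of 32-bit values iretd consumes from the stack. */
static unsigned int iret_frame_dwords(unsigned int handler_cpl,
                                      uint16_t return_cs)
{
    return iret_pops_esp_ss(handler_cpl, return_cs) ? 5u : 3u;
}
```

For a ring-0 handler returning to CS 0x08 this gives 3 (EIP, CS, EFLAGS), which matches the ring-0 task frame discussed earlier in the thread; returning to a selector such as 0x1B gives 5.
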

Post by JamesM »

Oh yeah :P D'oh! I wasn't thinking straight. Just got into work and haven't had my coffee yet; seems IrnBru doesn't quite have the same effect!

It took me 5 minutes of poring over the pseudocode for the INTn instruction to realise it, though. I may be getting old....

JamesM