Multitasking problems - Task queue corruption

finarfin
Member
Posts: 106
Joined: Fri Feb 23, 2007 1:41 am
Location: Italy & Ireland

Multitasking problems - Task queue corruption

Post by finarfin »

Hi all,
I'm developing a basic task-switching mechanism for my OS, but I'm stuck on a problem in the scheduler/task-switching code.

I decided to use a FIFO queue to hold the processes, implemented as a linked list.

For the task switch itself I use the iret method to switch from one task to another.

Here is the problem. When the OS starts, it launches two tasks:

idle
shell

With these two I have no problems. But if I try to launch two more tasks (each with a simple printf inside), the task queue gets corrupted.

If I then print the contents of the task queue, it contains only two tasks, the two just created. idle and shell have disappeared, yet the OS keeps working (I think that at some point the esp field of the new tasks is overwritten with the esp of the shell).

The task data structure is:

Code: Select all

typedef struct task_t{
	pid_t pid;	
	char name[NAME_LENGTH];
	void (*start_function)();
	task_state status;
	task_register_t *registers;
	unsigned int cur_quants;
	unsigned int eip;
	long int esp;
	unsigned int pdir;
	unsigned int ptable;
	struct task_t *next;
}task_t;
And the tss structure is:

Code: Select all

typedef struct {	
        unsigned int edi;   //+0
        unsigned int esi;   //+1
        unsigned int ebp;  //+2
        unsigned int esp;  //+3 (can be null)
        unsigned int ebx;  //+4
        unsigned int edx;  //+5
        unsigned int ecx;  //+6
        unsigned int eax;  //+7       
        unsigned int eip;  //+8
        unsigned int cs;   //+9
        unsigned int eflags;  //+10
        unsigned int end;    		
} task_register_t;
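The +0 to +10 comments above are slot indices into the frame that the interrupt stub leaves on the stack. As a hedged sketch (not part of the original kernel), the layout can be checked at compile time on a host machine. `frame_t`, `SLOT` and `iret_frame_offset` are hypothetical names, and `uint32_t` is an assumption made here so that the struct keeps 4-byte slots on a 64-bit host:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side mirror of task_register_t (same field order). */
typedef struct {
    uint32_t edi, esi, ebp, esp;   /* +0..+3: popad restores edi first; the esp slot is skipped */
    uint32_t ebx, edx, ecx, eax;   /* +4..+7: eax is restored last */
    uint32_t eip, cs, eflags;      /* +8..+10: consumed by iret (same privilege level) */
    uint32_t end;                  /* +11: left on the stack as the task function's return address */
} frame_t;

/* Slot index of a field, in 4-byte units, matching the +N comments. */
#define SLOT(field) (offsetof(frame_t, field) / sizeof(uint32_t))

static_assert(SLOT(edi) == 0, "edi must be the lowest slot");
static_assert(SLOT(eax) == 7, "eax must be the last popad slot");
static_assert(SLOT(eip) == 8, "iret frame must start right after the popad slots");
static_assert(SLOT(end) == 11, "return address must follow eflags");
static_assert(sizeof(frame_t) == 48, "twelve 4-byte slots, no padding");

/* Bytes between the frame base and the slot iret starts reading from. */
size_t iret_frame_offset(void) { return offsetof(frame_t, eip); }
```

If any field were reordered, the file would stop compiling, which is cheaper than finding the mismatch through a triple fault.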
The scheduler function:

Code: Select all

void schedule(unsigned int *stack){
    asm("cli");
    if(active == TRUE){
      task_t* cur_task = dequeue_task();      
      if(cur_task != NULL){
        cur_pid = cur_task->pid;
        dbg_bochs_print("@@@@@@@");
        dbg_bochs_print(cur_task->name);
        if(cur_task->status!=NEW){
          cur_task->esp=*stack;
        } else {
          cur_task->status=READY;         
          ((task_register_t *)(cur_task->esp))->eip = cur_task->eip;                  
        }
        enqueue_task(cur_task->pid, cur_task);
        cur_task=get_task();
        if(cur_task->status==NEW){
          cur_task->status=READY;
        }
        dbg_bochs_print(" -- ");
        dbg_bochs_print(cur_task->name);
        dbg_bochs_print("\n");        
        //load_pdbr(cur_taskp->pdir);
        *stack = cur_task->esp;
      } else {
        enqueue_task(cur_task->pid, cur_task);
      }
    }
    active = FALSE;
    return;
    asm("sti");
}
And the tss initialization is:

Code: Select all

void new_tss(task_register_t* tss, void (*func)()){
    tss->eax=0;    
    tss->ebx=0;
    tss->ecx=0;
    tss->edx=0;
    tss->edi =0;
    tss->esi =0;
    tss->cs = 8;
    tss->eip = (unsigned)func;
    tss->eflags = 0x202;
    tss->end = (unsigned) suicide;
    //tss->fine = (unsigned)end; // to put suicide there
    return;
}
The function I use to create a new task is the following:

Code: Select all

pid_t new_task(char *task_name, void (*start_function)()){
    asm("cli"); 
    task_t *new_task;
    table_address_t local_table;
    unsigned int new_pid = request_pid();   
    new_task = (task_t*)kmalloc(sizeof(task_t));    
    strcpy(new_task->name, task_name);
    new_task->next = NULL;
    new_task->start_function = start_function;
    new_task->cur_quants=0;
    new_task->pid = new_pid;
    new_task->eip = (unsigned int)start_function;
    new_task->esp = (unsigned int)kmalloc(STACK_SIZE) + STACK_SIZE-100;
    new_task->status = NEW;
    new_task->registers = (task_register_t*)new_task->esp;
    new_tss(new_task->registers, start_function);
    local_table = map_kernel();
    new_task->pdir = local_table.page_dir;
    new_task->ptable = local_table.page_table;
    enqueue_task(new_task->pid, new_task);
    asm("sti");
    return new_pid;
}
I'm sure that I just forgot something or overlooked some consideration, but I cannot figure out what I'm missing.

Currently I'm working only in kernel mode and inside a single address space (paging is enabled, but for now I use the same page directory for all tasks).

The ISR macros are defined here: https://github.com/inuyasha82/DreamOs/b ... handlers.h

I declared four kinds of functions to handle ISRs:
  • EXCEPTION
  • EXCEPTION_EC (an exception with an error code)
  • IRQ
  • SYSCALL
Obviously the scheduler is called by an IRQ routine, so the macro looks like:

Code: Select all

__asm__("INT_"#n":"\
        "pushad;" \
        "movl %esp, %eax;"\
        "pushl %eax;"\
        "call _irqinterrupt;"\
        "popl %eax;"\
        "movl %eax, %esp;"\
        "popad;"\
        "iret;")
As you can see, that macro calls the _irqinterrupt function, which is:

Code: Select all

void _irqinterrupt(unsigned int esp){
    asm("cli;");
    int irqn;
    irqn = get_current_irq();  
    IRQ_s* tmpHandler; 
    if(irqn>=0) {               
        tmpHandler = shareHandler[irqn];        
        if(tmpHandler!=0) {
            tmpHandler->IRQ_func();         
            #ifdef DEBUG
                printf("2 - IRQ_func: %d, %d\n", tmpHandler->IRQ_func, tmpHandler);
            #endif
            while(tmpHandler->next!=NULL) {
                tmpHandler = tmpHandler->next;                           
                #ifdef DEBUG
                    printf("1 - IRQ_func (_prova): %d, %d\n", tmpHandler->IRQ_func, tmpHandler);
                #endif
                if(tmpHandler!=0) tmpHandler->IRQ_func();
            }
      } else printf("irqn: %d\n", irqn);
    }
    else printf("IRQ N: %d Something arrived that I don't know how to handle ", irqn);
    if(irqn<=8 && irqn!=2) outportb(MASTER_PORT, EOI);
    else if(irqn<=16 || irqn==2){     
      outportb(SLAVE_PORT, EOI);
      outportb(MASTER_PORT, EOI);
    }

    schedule(&esp);
    asm("sti;");
    return;
}
And finally the enqueue_task and dequeue_task functions are:

Code: Select all

void enqueue_task(pid_t pid, task_t* n_task){
  n_task->next=NULL;
  if(task_list.tail == NULL){
    task_list.head = n_task;
    task_list.tail = task_list.head;    
  } else {
    task_list.head->next=n_task;    
    task_list.head = n_task;    
  }
}

task_t* dequeue_task(){
    if(task_list.head==NULL){
      return NULL;
    } else {
      task_t* _task;
      _task = task_list.tail; 
      task_list.tail=_task->next;
      return _task;
    }
    return;
}
One last consideration: when I implemented a very similar multitasking mechanism using an array instead of the queue, everything seemed to work fine.
There is probably some aspect that I didn't understand correctly or simply forgot.

I hope you can help me understand where I'm going wrong :)
Thanks in advance,
finarfin
Elen síla lúmenn' omentielvo
- DreamOS64 - My latest attempt with osdev: https://github.com/dreamos82/Dreamos64
- Osdev Notes - My notes about osdeving! https://github.com/dreamos82/Osdev-Notes
- My old Os Project: https://github.com/dreamos82/DreamOs
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Multitasking problems - Task queue corruption

Post by gerryg400 »

Have you unit tested your linked list functions ?

It is very easy to test this type of thing and you will very quickly find out whether the problem is there or not.
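
A host-side test along these lines might look like the sketch below. It copies enqueue_task and dequeue_task from the first post into an ordinary C program. The trimmed task_t, the shape of task_list, and the two test functions are assumptions (the post does not show the list definition), and the unreachable `return;` at the end of dequeue_task is dropped so the file compiles cleanly:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's task_t: only the link field matters here. */
typedef struct task_t {
    int pid;
    struct task_t *next;
} task_t;

/* Assumed shape of task_list; the post does not show its definition. */
static struct { task_t *head; task_t *tail; } task_list = { NULL, NULL };

/* enqueue_task as posted: inserts at "head". */
void enqueue_task(int pid, task_t *n_task) {
    (void)pid; /* unused in the posted code as well */
    n_task->next = NULL;
    if (task_list.tail == NULL) {
        task_list.head = n_task;
        task_list.tail = task_list.head;
    } else {
        task_list.head->next = n_task;
        task_list.head = n_task;
    }
}

/* dequeue_task as posted: removes from "tail", but tests "head" for emptiness. */
task_t *dequeue_task(void) {
    if (task_list.head == NULL) {
        return NULL;
    } else {
        task_t *_task = task_list.tail;
        task_list.tail = _task->next;
        return _task;
    }
}

static void reset_list(void) { task_list.head = task_list.tail = NULL; }

/* Returns 0 if three tasks come out in FIFO order, else the failing step. */
int fifo_order_test(void) {
    task_t a = {1, NULL}, b = {2, NULL}, c = {3, NULL};
    int failed = 0;
    reset_list();
    enqueue_task(a.pid, &a);
    enqueue_task(b.pid, &b);
    enqueue_task(c.pid, &c);
    if (dequeue_task() != &a) failed = 1;
    else if (dequeue_task() != &b) failed = 2;
    else if (dequeue_task() != &c) failed = 3;
    reset_list();
    return failed;
}

/* Returns 1 if, after removing the only task, head still points at it
 * while tail is NULL (so the head==NULL emptiness check cannot fire). */
int head_is_stale_after_drain(void) {
    task_t a = {1, NULL};
    int stale;
    reset_list();
    enqueue_task(a.pid, &a);
    dequeue_task();
    stale = (task_list.head != NULL && task_list.tail == NULL);
    reset_list();
    return stale;
}
```

On the posted code, fifo_order_test() returns 0 and head_is_stale_after_drain() returns 1. The ordering is fine, but one more dequeue_task() call on the drained list would pass the head==NULL check and read _task->next through a NULL tail.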
If a trainstation is where trains stop, what is a workstation ?
finarfin
Member
Posts: 106
Joined: Fri Feb 23, 2007 1:41 am
Location: Italy & Ireland

Re: Multitasking problems - Task queue corruption

Post by finarfin »

I ran some tests on the linked list functions.
I tried scheduling between the task elements without replacing esp, just to check that the tasks are taken in the correct sequence. When I don't replace the esp register, the scheduler function seems to navigate correctly between the tasks, even after adding several tasks (more than the four that cause problems when I replace the stack).
iansjack
Member
Posts: 4711
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Multitasking problems - Task queue corruption

Post by iansjack »

Time to dig out the debugger.
jbemmel
Member
Posts: 53
Joined: Fri May 11, 2012 11:54 am

Re: Multitasking problems - Task queue corruption

Post by jbemmel »

I find your FIFO queue code hard to read. Most code I've seen adds queue items to the tail, and removes them from the head; yours switches head<=>tail

void add_item(item *i) {
    i->next = 0;
    if (tail == 0) {
        head = tail = i;
    } else {
        tail->next = i;
        tail = i;
    }
}

item *remove_item() {
    item *r = head;
    if (r != 0) {
        head = r->next;
        if (head == 0) tail = 0; // !! Missing from your code !!
        r->next = 0;
        return r;
    } else return 0; // empty queue
}
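
As a rough, self-contained sketch of how this queue behaves under the dequeue-then-enqueue pattern that schedule() uses, the version below passes the queue explicitly instead of using globals. The `queue` struct, the `id` field and the `rotate` helper are hypothetical, and `rotate` assumes that get_task() (not shown in the thread) returns the front of the queue:

```c
#include <assert.h>
#include <stddef.h>

typedef struct item {
    int id;               /* hypothetical payload */
    struct item *next;
} item;

typedef struct {
    item *head;           /* removal end */
    item *tail;           /* insertion end */
} queue;

/* Append at the tail. */
void add_item(queue *q, item *i) {
    i->next = NULL;
    if (q->tail == NULL) {
        q->head = q->tail = i;
    } else {
        q->tail->next = i;
        q->tail = i;
    }
}

/* Remove from the head; returns NULL on an empty queue. */
item *remove_item(queue *q) {
    item *r = q->head;
    if (r == NULL) return NULL;
    q->head = r->next;
    if (q->head == NULL) q->tail = NULL;  /* queue drained: reset both ends */
    r->next = NULL;
    return r;
}

/* Round-robin step: move the front item to the back, return the new front. */
item *rotate(queue *q) {
    item *cur = remove_item(q);
    if (cur != NULL) add_item(q, cur);
    return q->head;
}
```

With two items queued, successive rotate() calls alternate between them, and once both are removed the queue reports empty on both ends, so it can be refilled safely.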