Multitasking problems and the stack (I think)

Stevo14
Member
Posts: 179
Joined: Fri Mar 07, 2008 3:40 am
Location: Arad, Romania

Post by Stevo14 »

Hello everyone. This is my first post here; I am just starting OS development.

I am currently using the code from JamesM's multitasking tutorial as a learning tool. Everything works fine until execution tries to return to my main() function, which fails in two scenarios.
The first is this: I call fork() in main(); it forks the kernel process and returns as the parent. The parent task runs fine for several milliseconds, after which switch_task() gets called by my timer interrupt. Everything is saved and the new task (the child task) is loaded without problems. Then it page faults on the "return;" statement when the child tries to go back to main(). Here is a screenshot of the output from this scenario:
Image

The other scenario is this: I don't call fork() in main(). The kernel process then runs for several milliseconds, after which switch_task() gets called by my timer interrupt (as before). This time, though, the kernel task (at this point the only task) also page faults when it tries to "return;" back into main(). Here is a screenshot of this one:
Image

My first thought was that the stack was at fault, because a bogus value pushed or popped around the return would cause a page fault. But I am new at this and am probably wrong. :)
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Post by JamesM »

Hi,

So, firstly, does your timer interrupt work without task switching? I.e., if you comment out the task switch code in your timer handler, does the program work? If so, your stack is not being killed by your IRQ handler.

If that *does* work, you need to look at your task switching code, because something is FUBAR - I suggest you try to fix scenario 2 before 1, because 2 is the more disturbing one (and a fix for 2 will possibly fix 1 also).

If you're still stuck, perhaps you could post some code?

Cheers,

James

Post by Stevo14 »

JamesM wrote: So, firstly, does your timer interrupt work without task switching? I.e., if you comment out the task switch code in your timer handler, does the program work? If so, your stack is not being killed by your IRQ handler.

Yes, my timing code works otherwise. (It will even get the current date/time from the CMOS chip and intermittently print it to the screen. :) )
But just for the sake of consistency here is the relevant portion of the timer code:

void timer_handler(struct regs *r)
{
    /* Increment our 'tick count' */
    timer_ticks++;

    // Every 4 ticks (40 milliseconds) we invoke the process manager's scheduler
    if (timer_ticks % 4 == 0)
    {
        // Schedule the next task
        write_string("Switching tasks...");
        switch_task();
        write_string("[done]\n");
    }
}

"[done]" never gets printed but that is expected. It is mostly there to tell me if something went terribly wrong in switch_task() and it actually ended up returning back to the timer handler.
JamesM wrote: If that *does* work, you need to look at your task switching code, because something is FUBAR - I suggest you try to fix scenario 2 before 1, because 2 is the more disturbing one (and a fix for 2 will possibly fix 1 also).

If you're still stuck, perhaps you could post some code?

Cheers,

James

I would be surprised if the problem were in the switch_task() code, given that it is identical to the switch_task() function in your tutorial (with added debugging print calls, of course). I'll look over it again, though, just to see if I can find anything.

Post by JamesM »

What debug output have you got in switch_task()? Could you get it to output exactly which register values it is poking? (Specifically, the EIP, ESP and EBP values, both before and after the change.)

Post by Stevo14 »

JamesM wrote: What debug output have you got in switch_task()? Could you get it to output exactly which register values it is poking? (Specifically, the EIP, ESP and EBP values, both before and after the change.)

Hmm... this is very strange. While I was adding more debug output to switch_task(), it started working because of the debugging code that I added! The spot that matters is right after the new eip, ebp, and esp values are loaded. The comments explain:

void switch_task()
{
    // If we haven't initialised tasking yet, just return.
    if (!current_task)
        return;

    // Read esp, ebp now for saving later on.
    unsigned int esp, ebp, eip;
    asm volatile("mov %%esp, %0" : "=r"(esp));
    asm volatile("mov %%ebp, %0" : "=r"(ebp));

    // Read the instruction pointer. We do some cunning logic here:
    // One of two things could have happened when this function exits -
    // (a) We called the function and it returned the EIP as requested.
    // (b) We have just switched tasks, and because the saved EIP is essentially
    //     the instruction after read_eip(), it will seem as if read_eip has just
    //     returned.
    // In the second case we need to return immediately. To detect it we put a dummy
    // value in EAX further down at the end of this function. As C returns values in EAX,
    // it will look like the return value is this dummy value! (0x12345).
    eip = read_eip();

    // Have we just switched tasks?
    if (eip == 0x12345)
    {
        write_string("Switched back to process "); // we have a successful task switch!
        write_number(getpid());
        return;
    }

    // No, we didn't switch tasks. Let's save some register values and switch.
    current_task->eip = eip;
    current_task->esp = esp;
    current_task->ebp = ebp;

    write_string("Before switch: esp:");
    write_hex(current_task->esp);
    write_string(" ebp:");
    write_hex(current_task->ebp);
    write_string(" eip:");
    write_hex(current_task->eip);
    write_string("\n");

    // Get the next task to run.
    current_task = current_task->next;
    // If we fell off the end of the linked list start again at the beginning.
    if (!current_task) current_task = ready_queue;

    // Now reload everything with the new task
    eip = current_task->eip;
    esp = current_task->esp;
    ebp = current_task->ebp;

    // Uncommenting any one or more of these lines makes it work fine
    //write_string("After switch: esp:");
    //write_hex(current_task->esp);
    //write_string(" ebp:");
    //write_hex(current_task->ebp);
    //write_string(" eip:");
    //write_hex(current_task->eip);
    //write_string("\n");

    // Make sure the memory manager knows we've changed page directory.
    current_directory = current_task->page_directory;

    // Here we:
    // * Stop interrupts so we don't get interrupted.
    // * Temporarily put the new EIP location in ECX.
    // * Load the stack and base pointers from the new task struct.
    // * Change page directory to the physical address (physicalAddr) of the new directory.
    // * Put a dummy value (0x12345) in EAX so that above we can recognise that we've just
    //   switched task.
    // * Restart interrupts. The STI instruction has a delay - it doesn't take effect until after
    //   the next instruction.
    // * Jump to the location in ECX (remember we put the new EIP in there).
    asm volatile("         \
      cli;                 \
      mov %0, %%ecx;       \
      mov %1, %%esp;       \
      mov %2, %%ebp;       \
      mov %3, %%cr3;       \
      mov $0x12345, %%eax; \
      sti;                 \
      jmp *%%ecx           "
                 : : "r"(eip), "r"(esp), "r"(ebp), "r"(current_directory->physicalAddr));
}
Last edited by Stevo14 on Thu Mar 13, 2008 5:15 am, edited 1 time in total.

Post by JamesM »

Have you tried putting a "cli" at the start of the function to ensure interrupts are disabled?

Post by Stevo14 »

JamesM wrote: Have you tried putting a "cli" at the start of the function to ensure interrupts are disabled?

I just did and it didn't help. I also figured out that, to make it work, I need to call some function between "esp = current_task->esp;" and the assembly code. I tried inserting different calls, such as "write_char(0x00)" or "settextcolor(9,0)". Anything works as long as it is between "esp = current_task->esp;" and the assembly code. Does this confirm that it is a problem with the stack?

Post by Stevo14 »

OK, I've made some more progress. I figured out that when the child process is first switched in, if I read esp and ebp at that point, the stack pointer (esp) doesn't have the value that it was given when the parent created the child.

I will work on this more but it's late, and I need sleep. :)

EDIT: OK, I now have partial success. By this I mean that it does work, but not for the right reasons (i.e., I hacked it). I added a "write_char(0x00)" just before the asm code that updates all the registers, and it seems to work fine.

There was also a problem in my timer code: I was sending the "end of interrupt" signals after calling the handler. This caused problems because, when switching tasks, the timer handler never returns (by design, I believe), so the PICs were never told that I had received the interrupt. It seems like quite a silly mistake now that I have fixed it... :)