Troubles implementing multitasking

iansjack · Post by **iansjack** » Mon Aug 06, 2018 12:32 pm

I think you're going to have trouble if you don't restore the appropriate cr3 and stack pointer for the task (assuming you are using paging).
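To make that concrete, here is a rough sketch of the per-task state that has to be swapped on every switch. It is a user-space model rather than kernel code: `struct task`, `struct fake_cpu` and `restore_task_state()` are hypothetical names invented for illustration, and the "registers" are plain struct fields so that the logic can be compiled and run anywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-task record; the field names are illustrative only. */
struct task {
    uint32_t saved_esp;  /* kernel stack pointer saved at the last switch */
    uint32_t cr3;        /* physical address of the task's page directory */
};

/* Stand-in for the real CPU registers so the logic can run in user space. */
struct fake_cpu {
    uint32_t esp;
    uint32_t cr3;
    unsigned cr3_loads;  /* counts simulated "mov cr3, ..." executions */
};

/* Save the outgoing task's stack pointer, then load the incoming task's
 * stack pointer and, only if it differs, its page directory. */
static void restore_task_state(struct fake_cpu *cpu, struct task *prev,
                               const struct task *next)
{
    assert(cpu && prev && next);
    prev->saved_esp = cpu->esp;
    cpu->esp = next->saved_esp;
    if (cpu->cr3 != next->cr3) {
        /* Writing CR3 invalidates non-global TLB entries, so skip the
         * reload when both tasks share an address space. */
        cpu->cr3 = next->cr3;
        cpu->cr3_loads++;
    }
}
```

In a real kernel the two assignments to `esp` and `cr3` would be a handful of assembly instructions, but the bookkeeping is the same.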

TheCool1Kevin · Post by **TheCool1Kevin** » Mon Aug 06, 2018 2:21 pm

Oops there must've been a misunderstanding. I was saying that iret will not restore ESP and CRx registers for you (although there IS an ESP register pushed automatically by the CPU, it does not get restored).

Brendan · Post by **Brendan** » Mon Aug 06, 2018 9:46 pm

Hi,

frabert wrote:But if I change the esp before the iret, what is iret going to pop into the eflags and all of that? Or am I supposed to save all of that inside the task's stack too?

TheCool1Kevin wrote:Then to switch tasks, I switch the ESP/page directories and jump to "irq_return" and let the cpu handle the rest. And so to run the task, I set up the task's stack and queue it, where the scheduler then changes the stack and the cpu pops the return address and runs the code. Not sure if it's THE way, but it works and so far nothing bad has happened.
Looking at your code, you do almost the same thing, except that your "switchTasks" works differently. It can't be that bad.

Including a different topic from someone else, that's 3 people suffering from the same idiocy in the space of about a week.

I need to find out where all this wrongness comes from.

Is there a tutorial somewhere that is misinforming far too many people (making them think that IRQs and "IRET" have something to do with task switching), or is it coming from a dodgy page in the OSdev wiki, or...? If I can find the root cause, then maybe I can fix that and prevent people from getting multitasking wrong in future.

Note: For almost all kernels (that use "kernel stack per task" and don't use "kernel stack per CPU"); privilege level changes (including IRET changing from CPL=0 to CPL=3) have nothing to do with task switches, and task switch code never has anything to do with IRET.

EDIT: I checked wiki pages and found 2 that are fine and one page that had a simple example that involved IRET, so I've updated the affected page in the hope of making things a little clearer. I very much doubt that this is the true/only source of the problem.

Cheers,

Brendan

frabert · Post by **frabert** » Tue Aug 07, 2018 1:16 am

@Brendan I can only speak for myself, of course, but the reason why I am led to believe that iret is the key is that task switching is generally caused by an exception/interrupt, thus must be done in an interrupt handler. The correct way to signal the CPU we're done with the handler is, of course, to issue an iret, and the iret handily provides ways to change the eip to which to resume execution. The standard ret also has this, but in the places where ret should be used instead of iret, there is no way to save the registers of the task being put to sleep.

iansjack · Post by **iansjack** » Tue Aug 07, 2018 2:03 am

Brendan wrote: Note: For almost all kernels ... privilege level changes (including IRET changing from CPL=0 to CPL=3) have nothing to do with task switches.

Brendan,

Perhaps it would make things clearer if you could explain how cr3 is loaded with its new value without a privilege level change. And there are surely other kernel structures and variables that need to be updated when a task switch occurs. What is the recommended way to achieve this without changing the privilege level?

Brendan · Post by **Brendan** » Tue Aug 07, 2018 2:05 am

Hi,

frabert wrote: @Brendan I can only speak for myself, of course, but the reason why I am led to believe that iret is the key is that task switching is generally caused by an exception/interrupt, thus must be done in an interrupt handler.

Why do you think that?

If a task has to wait to acquire a mutex, the task blocks, causing a task switch (and no interrupt is involved). If a task calls "sleep()" it blocks, causing a task switch (and no interrupt is involved until the task wakes up again). If a higher priority task is unblocked for various reasons (because it was waiting for a mutex, waiting for data to arrive via a pipe or message from another process, ...) it may/should preempt, causing a task switch (and no interrupt is involved).
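A toy model of the blocking case might look like the sketch below. Nothing here is real kernel code: `struct sim_task`, `schedule_next()`, `block_current_task()` and the three-entry task table are assumptions made up for illustration, and the actual context switch is reduced to updating an index. The point is only that the switch is triggered by an ordinary function call from a blocking path, with no interrupt anywhere in sight.

```c
#include <assert.h>
#include <stddef.h>

enum task_state { TASK_READY, TASK_RUNNING, TASK_BLOCKED };

struct sim_task {
    const char *name;
    enum task_state state;
};

#define NUM_TASKS 3
static struct sim_task tasks[NUM_TASKS] = {
    { "A", TASK_RUNNING }, { "B", TASK_READY }, { "C", TASK_READY },
};
static size_t current_task;   /* index of the task that owns the CPU */

/* Pick the next READY task round-robin and make it the running one.
 * A real kernel would call its low-level switch routine here. */
static void schedule_next(void)
{
    for (size_t step = 1; step < NUM_TASKS; step++) {
        size_t i = (current_task + step) % NUM_TASKS;
        if (tasks[i].state == TASK_READY) {
            if (tasks[current_task].state == TASK_RUNNING)
                tasks[current_task].state = TASK_READY;
            tasks[i].state = TASK_RUNNING;
            current_task = i;
            return;
        }
    }
    /* Nothing else is ready; a real kernel would run an idle task. */
}

/* Called from a blocking path such as mutex acquire or sleep(). */
static void block_current_task(void)
{
    tasks[current_task].state = TASK_BLOCKED;
    schedule_next();
}

/* Called when whatever the task was waiting for becomes available. */
static void unblock_task(size_t i)
{
    assert(tasks[i].state == TASK_BLOCKED);
    tasks[i].state = TASK_READY;
}
```

Preempting on unblock (the higher-priority case) would be one extra call to `schedule_next()` inside `unblock_task()`, which is still not an interrupt.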

For all cases where an interrupt (IRQ or exception) is involved it's a sequence of separate steps - the interrupt happens (possibly causing a privilege level switch from user to kernel) and this does not directly have anything to do with task switching, and then the kernel's code might or might not decide to do a task switch but if it does this has nothing directly to do with the interrupt, and then (eventually) the interrupt handler returns and this does not directly have anything to do with task switching.

frabert wrote:The correct way to signal the CPU we're done with the handler is, of course, to issue an iret, and the iret handily provides ways to change the eip to which to resume execution. The standard ret also has this, but in the places where ret should be used instead of iret, there is no way to save the registers of the task being put to sleep.

Sure - IRET is the correct way to return from an interrupt handler (and this does not directly have anything to do with task switching); and RETF is the correct way to return from a call gate handler (and this does not directly have anything to do with task switching); and SYSEXIT is the correct way to return from a SYSENTER handler (and this does not directly have anything to do with task switching); and SYSRET is the correct way to return from a SYSCALL handler (and this does not directly have anything to do with task switching).

Cheers,

Brendan

Brendan · Post by **Brendan** » Tue Aug 07, 2018 2:19 am

Hi,

iansjack wrote:
Brendan wrote: Note: For almost all kernels ... privilege level changes (including IRET changing from CPL=0 to CPL=3) have nothing to do with task switches..
Perhaps it would make things clearer if you could explain how cr3 is loaded with its new value without a privilege level change. And there are surely other kernel structures and variables that need to be updated when a task switch occurs. What is the recommended way to achieve this without changing the privilege level?

In general (for "kernel stack per task") it goes like this:

1. Something (IRQ, exception, system call, ...) causes a privilege level change from user to kernel (that has nothing to do with task switching).
2. Kernel's code decides it feels like doing a task switch (after CPU is running kernel code already).
3. Kernel does task switch (where no privilege level change is needed because CPU was running kernel code already).
4. Kernel returns to user space (and this has nothing to do with task switching).

Note that because the CPU is always running kernel code immediately before any task switch, the CPU will also always be running kernel code immediately after any task switch.

Of course for "kernel tasks" the irrelevant and unrelated steps (involving privilege level changes) simply don't exist.
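The sequence above can be modelled in a few lines. This is only a sketch under heavy assumptions: `struct sim_cpu`, `switch_to()` and `kernel_entry_from_user()` are hypothetical names, the privilege level is just an integer field, and in a real kernel the return to user space after the switch would happen on the new task's own kernel stack. What it does show is that the switch itself runs entirely at CPL=0 and never touches the privilege level.

```c
#include <assert.h>
#include <stdbool.h>

struct sim_cpu {
    int cpl;           /* 0 = kernel, 3 = user */
    int current_task;  /* id of the task that owns the CPU */
};

/* The task switch on its own: only legal in kernel mode, and it
 * leaves the privilege level exactly as it found it. */
static void switch_to(struct sim_cpu *cpu, int next_task)
{
    assert(cpu->cpl == 0);
    cpu->current_task = next_task;
}

/* A user task enters the kernel (IRQ, exception or system call), the
 * kernel may or may not switch, and only afterwards returns to user space. */
static void kernel_entry_from_user(struct sim_cpu *cpu, bool want_switch,
                                   int next_task)
{
    assert(cpu->cpl == 3);
    cpu->cpl = 0;                   /* privilege change, no task switch */
    if (want_switch)                /* the kernel decides */
        switch_to(cpu, next_task);  /* CPL=0 before and after */
    cpu->cpl = 3;                   /* return to user space */
}
```

For kernel tasks only `switch_to()` is ever called, which matches the remark that the privilege-level steps simply don't exist there.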

Cheers,

Brendan

iansjack · Post by **iansjack** » Tue Aug 07, 2018 2:38 am

I think it may just be a case of detail. To my mind we have

1. A task is running (in user mode).

2. At some stage the task decides to relinquish control.

3. Some procedure now happens which ipso facto involves a switch to supervisor mode. That procedure carries out the details of a task switch, and eventually returns to a user task (normally a different one).

Alternatively 2 and 3 may be replaced by a time-slice switch, but that's probably the exception rather than the rule.

4. A task is running (in user mode).

I think, as far as most people are concerned, this means that a task switch involves a change of privilege level. 1 and 4 are the salient stages, with 2 and 3 being a black-box that can be implemented in different ways. But all of those ways will involve a switch to supervisor mode. So to say that "privilege level changes (including IRET changing from CPL=0 to CPL=3) have nothing to do with task switches", whilst strictly true, is perhaps a little misleading and is liable to confuse beginners. Of course the processor is in supervisor mode immediately before the task switch, and immediately after, but that is too atomic a viewpoint.

Brendan · Post by **Brendan** » Tue Aug 07, 2018 4:32 am

Hi,

iansjack wrote:I think it may just be a case of detail. To my mind we have

1. A task is running (in user mode).

2. At some stage the task decides to relinquish control.

3. Some procedure now happens which ipso facto involves a switch to supervisor mode. That procedure carries out the details of a task switch, and eventually returns to a user task (normally a different one).

Alternatively 2 and 3 may be replaced by a time-slice switch, but that's probably the exception rather than the rule.

4. A task is running (in user mode).

I think, as far as most people are concerned, this means that a task switch involves a change of privilege level. 1 and 4 are the salient stages, with 2 and 3 being a black-box that can be implemented in different ways.

No, there are too many cases where either there isn't any privilege level switch (e.g. IRQ interrupts kernel code) or there isn't any interrupt (e.g. system calls) or there's too much other work that happens before or after the task switch to consider them part of the same thing. Your way of thinking only works for one special case where there is both an IRQ and a privilege level switch, which (unfortunately, when combined with "round-robin") is often the first case that a poorly guided (or unguided) beginner sees. This leads them into believing in "wrong" and then getting horribly confused later when they can't get anything else to work properly.

iansjack wrote:But all of those ways will involve a switch to supervisor mode. So to say that "privilege level changes (including IRET changing from CPL=0 to CPL=3) have nothing to do with task switches", whilst strictly true, is perhaps a little misleading and is liable to confuse beginners. Of course the processor is in supervisor mode immediately before the task switch, and immediately after, but that is too atomic a viewpoint.

For a true beginner (someone with no prior knowledge of task switching); I normally teach them to implement a "switch_to_task()" (like the example I wrote here) and then test it by having two kernel tasks that explicitly switch to each other; and then build on that (add code to select a task before calling "switch_to_task()", add generic "block_task(reason)" and an "unblock_task(task)", etc). This is not confusing at all - it's a relatively natural progression (that has the added benefit of making it easy to test each step along the way).
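For anyone who wants to try the "two tasks that explicitly switch to each other" exercise before writing any assembly, here is a user-space analogue. It is not the wiki's "switch_to_task()": it assumes a Linux/glibc system and uses the POSIX `ucontext` calls (`getcontext`, `makecontext`, `swapcontext`) as a stand-in for the real switch routine. The names `ping_task`, `pong_task` and `run_ping_pong()` are made up for the sketch.

```c
#include <stddef.h>
#include <string.h>
#include <ucontext.h>

#define PING_STACK_SIZE (64 * 1024)

static ucontext_t caller_ctx, ping_ctx, pong_ctx;
static char ping_stack[PING_STACK_SIZE], pong_stack[PING_STACK_SIZE];
static char trace[16];
static size_t trace_len;

static void record(char c)
{
    if (trace_len < sizeof trace - 1)
        trace[trace_len++] = c;
}

/* Each task does a little "work", then explicitly hands the CPU over. */
static void ping_task(void)
{
    for (int i = 0; i < 3; i++) {
        record('A');
        swapcontext(&ping_ctx, &pong_ctx);   /* explicit switch A -> B */
    }
    /* Returning resumes uc_link, i.e. the caller of run_ping_pong(). */
}

static void pong_task(void)
{
    for (int i = 0; i < 3; i++) {
        record('B');
        swapcontext(&pong_ctx, &ping_ctx);   /* explicit switch B -> A */
    }
}

/* Give each task its own stack, start the first one, return the trace. */
static const char *run_ping_pong(void)
{
    trace_len = 0;
    memset(trace, 0, sizeof trace);

    getcontext(&ping_ctx);
    ping_ctx.uc_stack.ss_sp = ping_stack;
    ping_ctx.uc_stack.ss_size = sizeof ping_stack;
    ping_ctx.uc_link = &caller_ctx;
    makecontext(&ping_ctx, ping_task, 0);

    getcontext(&pong_ctx);
    pong_ctx.uc_stack.ss_sp = pong_stack;
    pong_ctx.uc_stack.ss_size = sizeof pong_stack;
    pong_ctx.uc_link = &caller_ctx;
    makecontext(&pong_ctx, pong_task, 0);

    swapcontext(&caller_ctx, &ping_ctx);
    return trace;
}
```

The two tasks alternate without a timer, an IRQ or a privilege change anywhere, which is the whole point of starting with this step.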

Where there's confusion, it's always because someone has had their mind poisoned before they learn how to do things right, because doing it right conflicts with their pre-existing misconceptions (and "task switching always involves interrupts" is the single biggest misconception).

Cheers,

Brendan

frabert · Post by **frabert** » Tue Aug 07, 2018 5:44 am

Thanks Brendan for your explanations, they really are helpful in understanding how all of this works! In my defense, in the OS course I took at uni it always seemed like task switches were made on interrupts, so... Dunno, maybe my profs were oversimplifying a bit

frabert · Post by **frabert** » Tue Aug 07, 2018 9:56 am

@Brendan I successfully implemented multitasking following your suggestions

Now forgive me if I'm starting to sound repetitive... I don't understand how to use the "switch_task" routine you mentioned in the Wiki to switch tasks during an interrupt. If I call it normally, everything works fine - the task is switched and all goes well. If I call it at the end of an interrupt handler, however, the task is switched, but iret never gets called, which is an issue since - for example - interrupts are never re-enabled. I thought of making two procedures, one for calling inside handlers and one for general use, but the issue is that the stack of tasks suspended by an interrupt would not be compatible with those suspended by other factors...

Brendan · Post by **Brendan** » Wed Aug 08, 2018 9:42 pm

Hi,

frabert wrote:@Brendan I successfully implemented multitasking following your suggestions Now forgive me if I'm starting to sound repetitive... I don't understand how to use the "switch_task" routine you mentioned in the Wiki to switch tasks during an interrupt. If I call it normally, everything works fine - the task is switched and all goes well. If I call it at the end of an interrupt handler, however, the task is switched, but iret never gets called, which is an issue since - for example - interrupts are never re-enabled.

Interrupts are re-enabled.

TaskA disables IRQs and then does a task switch, and just after that whichever task the scheduler switched to enables IRQs.

Eventually some other task disables IRQs and tells the scheduler to do a task switch, and the scheduler gives CPU time back to TaskA, and TaskA does the IRET (which enables IRQs that were disabled by some other task).

Cheers,

Brendan

frabert · Post by **frabert** » Fri Aug 10, 2018 4:10 am

Ok I feel like I'm finally starting to understand the bigger picture. Here is what I've understood, and the places I'm still confused about:

proc LongTaskA:
  EnableInterrupts()
  ComputeLastDigitOfPi()
end

proc LongTaskB:
  EnableInterrupts()
  ComputeLastDigitOfE()
end

proc Schedule:
  DisableInterrupts()
  SwitchTasks(next_task)
  EnableInterrupts()
end

1. LongTaskA is started, and enables interrupts
2. Keyboard key is pressed, IRQ handler calls Schedule(), for some reason LongTaskB is started and enables interrupts
3. Time quant expires and Schedule() is called, LongTaskB is paused, LongTaskA resumes execution but does not re-enable interrupts, because it's in the middle of computing pi
4. LongTaskA is never descheduled, because it does not use kernel resources (mutexes, IO, anything) and interrupts have not been enabled

Is this correct? I feel it's not, there's something I must be missing...

Brendan · Post by **Brendan** » Fri Aug 10, 2018 4:45 am

Hi,

frabert wrote:Ok I feel like I'm finally starting to understand the bigger picture. Here is what I've understood, and the places I'm still confused about:
proc LongTaskA:
  EnableInterrupts()
  ComputeLastDigitOfPi()
end
proc LongTaskB:
  EnableInterrupts()
  ComputeLastDigitOfE()
end
proc Schedule:
  DisableInterrupts()
  SwitchTasks(next_task)
  EnableInterrupts()
end
LongTaskA is started, and enables interrupts

Keyboard key is pressed, IRQ handler calls Schedule(), for some reason LongTaskB is started and enables interrupts

Time quant expires and Schedule() is called, LongTaskB is paused, LongTaskA resumes execution but does not re-enable interrupts, because it's in the middle of computing pi

LongTaskA is never descheduled, because it does not use kernel resources (mutexes, IO, anything) and interrupts have not been enabled
Is this correct? I feel it's not, there's something I must be missing...

It'd be more like:

1. LongTaskA is started, and enables interrupts
2. Keyboard key is pressed and keyboard IRQ handler interrupts LongTaskA, IRQ handler disables IRQs, then calls Schedule(), for some reason LongTaskB is started and enables interrupts
3. Time quant expires causing a timer IRQ. Timer IRQ handler disables IRQs, does some stuff, then calls "schedule()". LongTaskB is paused, LongTaskA resumes execution in the middle of the keyboard IRQ handler (just after it called "schedule()" when it was running last). The keyboard IRQ handler enables IRQs and returns to LongTaskA (which continues computing pi)
4. Time quant expires again causing another timer IRQ. Timer IRQ handler disables IRQs, does some stuff, then calls "schedule()". LongTaskA is paused, LongTaskB resumes execution in the middle of the timer IRQ handler (just after it called "schedule()" when it was running last). The timer IRQ handler enables IRQs and returns to LongTaskB (which continues computing E)
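The sequence above can be replayed in user space to see that the interrupt flag really does get re-enabled. The sketch below is a simulation under several assumptions: it needs Linux/glibc for the POSIX `ucontext` calls, an "IRQ" is just a call to the hypothetical `fake_irq()`, `irqs_enabled` stands in for the CPU's interrupt flag, and `schedule()` simply switches to the other of two tasks. None of these names come from a real kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <ucontext.h>

#define IRQ_STACK_SIZE (64 * 1024)

static ucontext_t boot_ctx, ctx[2];
static char stacks[2][IRQ_STACK_SIZE];
static int current;           /* index of the running task */
static bool irqs_enabled;     /* stand-in for the CPU's interrupt flag */
static char log_buf[64];

static void note(const char *s)
{
    strncat(log_buf, s, sizeof log_buf - strlen(log_buf) - 1);
}

/* Switch to the other task; callers must already have IRQs disabled. */
static void schedule(void)
{
    assert(!irqs_enabled);
    int prev = current;
    current = 1 - current;
    swapcontext(&ctx[prev], &ctx[current]);
    /* Execution continues here only when the other task switches back. */
}

/* Simulated IRQ: entry clears the flag, the "IRET" at the end sets it. */
static void fake_irq(const char *name)
{
    irqs_enabled = false;     /* interrupt entry disables IRQs */
    note(name);
    schedule();               /* may not return for a long time */
    irqs_enabled = true;      /* "IRET" restores the interrupted task's flag */
}

static void long_task_a(void)
{
    irqs_enabled = true;      /* a freshly started task enables interrupts */
    note("a1 ");
    fake_irq("kbd ");         /* keyboard IRQ interrupts A, B gets the CPU */
    assert(irqs_enabled);     /* back in A, after the handler's "IRET" */
    note("a2 ");
    fake_irq("tmr ");         /* timer IRQ, B resumes inside its handler */
}

static void long_task_b(void)
{
    irqs_enabled = true;
    note("b1 ");
    fake_irq("tmr ");         /* timer IRQ, A resumes inside the kbd handler */
    note("b2 ");
}

static const char *run_irq_demo(void)
{
    void (*entry[2])(void) = { long_task_a, long_task_b };
    log_buf[0] = '\0';
    current = 0;
    irqs_enabled = false;
    for (int i = 0; i < 2; i++) {
        getcontext(&ctx[i]);
        ctx[i].uc_stack.ss_sp = stacks[i];
        ctx[i].uc_stack.ss_size = IRQ_STACK_SIZE;
        ctx[i].uc_link = &boot_ctx;
        makecontext(&ctx[i], entry[i], 0);
    }
    swapcontext(&boot_ctx, &ctx[0]);
    return log_buf;
}
```

Each task that was suspended inside a handler finishes that same handler when it next gets the CPU, so its own "IRET" is never skipped, only postponed.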

Note: A few days ago (after trying to figure out why so many people make "all task switches involve IRQs" assumptions) I decided that the main problem was that there simply aren't many good tutorials that explain multi-tasking. To fix this, I've spent the last few days trying to write one. Currently it needs proofreading, and I'd expect that there's many mistakes and lots of areas that need to be improved; but if you're willing to be an unsuspecting test subject you can find it here.

Cheers,

Brendan

frabert · Post by **frabert** » Fri Aug 10, 2018 7:57 am

That's awesome, that's going on the read list right now

OSDev.org
