
Multitasking, SMP & Timers

Posted: Tue May 15, 2012 10:17 am
by AlfaOmega08
Hi, I'm looking for some suggestions on designing an SMP scheduler. I have HPET and LAPIC timer drivers ready, and the APs are booted so far. In the past I wrote a simple scheduler using the PIT, but that was for a single-processor system. It was pretty good, I think. It had priorities and chose processes in a round-robin way, but let processes with higher priority run for a larger amount of time. E.g.: if the time slice is 5 ms, then a process with priority 10 would run for 50 ms, and a process with priority 3 just for 15 ms. It also kept a list of sleeping processes, with timeouts. Basically I subtracted the slices from the timeouts each time the scheduler was called, and when a timeout got to 0, the thread was woken up.
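
In pseudo-C, the core of it was roughly this (a simplified sketch only; the helper names are made up for illustration, not my real code):

Code: Select all

#define BASE_SLICE_MS 5

/* called from the PIT IRQ when the running thread's slice has expired */
void schedule(int elapsed_ms)
{
    /* subtract the slice that just ran from every sleeper's timeout */
    for (struct thread *t = sleep_list, *n; t != NULL; t = n) {
        n = t->next;
        t->sleep_timeout_ms -= elapsed_ms;
        if (t->sleep_timeout_ms <= 0)
            wake_up(t);                              /* move back to the run queue */
    }

    struct thread *next = pick_next_round_robin();
    run_for(next, next->priority * BASE_SLICE_MS);   /* priority 10 -> 50 ms */
}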

My problems here:
1) I don't believe this would work in an SMP system.
2) I don't think my implementation of sleep was very good. I read somewhere on the forum that some sort of one-shot timer might be used, interrupting one thread to wake up a sleeping one.

Things are even harder now that I have the PIT, the LAPIC timer, one periodic timer from the HPET and up to 31 other one-shot timers from the HPET. (Eventually the PM timer will be implemented too.)
How would you distribute the workload? Do you believe the HPET periodic timer is more appropriate than the LAPIC timer for scheduling?

Should sleeping threads be "stuck" to the current AP? What if, using my old method, two APs call "schedule()" and they both subtract the time slice from the timeout and wake up the thread too early?

Waiting for your helpful considerations...

Re: Multitasking, SMP & Timers

Posted: Tue May 15, 2012 10:41 am
by Ready4Dis
For the problem of two cores updating the same values: you really should be using atomic operations (locks, mutexes, semaphores, depending on what you're trying to do) for any variables/code that can be run by many cores at a time.

As for idle processes, having a count-down timer that triggers every so often should be fine as long as it can't be starved (also, you might want to set a known maximum interval for idle processes, so you can 'reasonably' guarantee that they will be called within a certain time frame, and developers know whether a process can be idle or should go to a higher priority).

For sleeping threads, they should only ever have to be woken up if they are requested by another process. If I put my HDD thread to sleep, it doesn't need a keep-alive, it just needs to be notified when someone needs disk access.

Of course, your ideas can easily differ (and it's your OS, so do what makes sense for your final product). There isn't a one-size-fits-all approach: a real-time OS will have very different requirements from a non-RT OS. An OS that has MANY little threads vs. one with a few large threads will have different requirements. There are lots of variables that could make you take one path versus another. I have no idea what your design is or what your expected outcome is, so without any further info it's hard to give anything specific (and even if I had a clue, it's still hard to give specifics on someone else's design).

It is considered good practice (well, depending on the platform of course) to try to keep a process on the same CPU that it was on last time (of course, if one CPU is heavily loaded and another idle, you should move some processes around for load balancing). So, if it's a sleeping task, there is no reason to do anything with it, since it should sleep until needed. I would probably have a flag somewhere (some sort of 'hint' variable) to indicate how easy it is to move a process from one CPU to another. On some platforms this is a big deal (think NUMA), while on others it isn't (think multi-core CPU, or a single CPU with hyperthreading). Again, your design considerations are going to dictate how you go about it.

Re: Multitasking, SMP & Timers

Posted: Tue May 15, 2012 11:05 am
by bluemoon
AlfaOmega08 wrote:It also kept a list of sleeping processes, with timeouts. Basically I subtracted the slices from the timeouts each time the scheduler was called, and when a timeout got to 0, the thread was woken up.
Check the wiki for the delta queue. (Timed) sleeping processes are inserted in sorted order, and only the first process in the queue needs to be checked. Insertion is slow, but it uses the sleeping process's own time, and a sleeping process won't complain about being slow. You can also drive this with a one-shot timer.
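
Something like this (an illustrative sketch only, not the wiki's code):

Code: Select all

/* Each node stores the delay *relative to the previous node*, so the
 * timer tick only ever has to touch the head of the queue. */
struct dq_node {
    struct dq_node *next;
    unsigned delta_ticks;               /* ticks after the previous node expires */
    void (*wake)(void *arg);
    void *arg;
};

static struct dq_node *dq_head;

void dq_insert(struct dq_node *n, unsigned ticks_from_now)
{
    struct dq_node **pp = &dq_head;
    while (*pp && (*pp)->delta_ticks <= ticks_from_now) {
        ticks_from_now -= (*pp)->delta_ticks;
        pp = &(*pp)->next;
    }
    n->delta_ticks = ticks_from_now;
    n->next = *pp;
    if (*pp)
        (*pp)->delta_ticks -= ticks_from_now;   /* keep the successor relative */
    *pp = n;
}

/* Called once per tick, or when a one-shot timer programmed for the
 * head's delta fires. */
void dq_tick(void)
{
    if (!dq_head)
        return;
    if (dq_head->delta_ticks)
        dq_head->delta_ticks--;
    while (dq_head && dq_head->delta_ticks == 0) {
        struct dq_node *n = dq_head;
        dq_head = n->next;
        n->wake(n->arg);
    }
}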

Re: Multitasking, SMP & Timers

Posted: Tue May 15, 2012 4:20 pm
by gerryg400
Do you believe the HPET periodic timer is more appropriate than the LAPIC timer for scheduling?
Generally no. The LAPIC timer is perfect for local scheduling. It allows the cores to be scheduled separately. And a core with nothing to schedule can disable its LAPIC and save power until work arrives.
Should sleeping threads be "stuck" to the current AP?
Not stuck, but there are advantages to keeping threads on the same core. Generally, when they are sleeping, threads are entirely (at least conceptually) removed from the scheduler and lose their association with a core. If a thread has slept for a long time and other threads have run on its core, there is probably little point in resuming it on the same core. If the thread has slept for a short time, the cache may still contain its data, and it might be worth sending it to the same core. Keep a per-thread variable holding the last_core, just in case.

Oh, and have a look at the timer-wheel algorithm. It's a little old-fashioned now, but it's an excellent idea in a small kernel.

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 1:12 am
by Brendan
Hi,
AlfaOmega08 wrote:It had priorities and chose processes in a round-robin way, but let processes with higher priority run for a larger amount of time. E.g.: if the time slice is 5 ms, then a process with priority 10 would run for 50 ms, and a process with priority 3 just for 15 ms.
This is what I call "variable time slice scheduling". It's bad because high priority tasks have to wait for all medium/low priority tasks to have their turn before they get any CPU time. Here's what happens when there's 10 low priority threads (A to J), 10 medium priority threads (K to T) and one extremely high priority thread (X), where each character represents 5 ms of time:

Code: Select all

AAABBBCCCDDDEEEFFFGGGHHHIIIJJJKLMNOPQRSTXXXXXXXXXX
You'll notice that the high priority task has to wait for 200 ms while other tasks are using the CPU. With 100 "priority 3" tasks it'd have to wait for 1500 ms. You can see how this is likely to cause problems if the high priority task is trying to respond to key presses or do smooth animation.

There's a variation that I call "variable frequency scheduling", where each task runs for a fixed length of time, but high priority tasks run more often. For example, a priority 3 task might get 3 different 5 ms time slices and a priority 10 task might get 10 different 5 ms time slices. It might look like this:

Code: Select all

ABCDXEFGHXIJKLXMABCXDEFGXHIJNXOPABXCDEFXGHIJXQRSTX
The tasks are still getting the same amount of time as before, but now the high priority task only waits for 20 ms before it gets any CPU time (rather than 200 ms). With 100 "priority 3 tasks" it'd have to wait for 150 ms (rather than 1500 ms). This is better, but it increases overhead a little (time spent doing task switches) and probably still isn't good enough if the high priority task is trying to respond to key presses or do smooth animation when the system is under load.

The next step is to recognise that the difference between task priorities can be so large that lower priority tasks shouldn't get any CPU time at all if higher priority tasks are able to run. For example, you could have a "priority infinity" task which preempts any lower priority task and runs immediately, where no lower priority task gets any CPU time while a "priority infinity" task is able to run. If there are several "priority infinity" tasks, they could share the CPU. You could also have "priority infinity + n" tasks, where different "priority infinity + n" tasks share the CPU somehow (in the same way that "priority n" tasks do).

In a similar way, you could have "priority negative infinity" tasks which are preempted by anything and never run if anything else is able to run. You could also have "priority negative infinity - n" tasks, where different "priority negative infinity - n" tasks share the CPU somehow.

Effectively, what we've done is create 3 separate "priority groups". You could have any number of "priority groups". Within each priority group you could use a different policy for scheduling. One group might use "variable frequency scheduling", one group might use "variable time slice scheduling", and other groups might use something completely different.

You could create different groups for different purposes. This allows you to choose the best method of scheduling for the specific type of tasks being scheduled. You might have soft real-time tasks that use "earliest deadline first" scheduling and call that "scheduling policy 1". The second group (scheduling policy 2) might use "variable frequency scheduling" and be intended for tasks that handle user interaction (key presses, video, etc). The third group (scheduling policy 3) might use "variable time slice scheduling" and be intended for normal tasks. A fourth group might be for things that should only be done if there's nothing else to do (idle tasks) and might use "variable time slice scheduling" too.

For SMP, you don't want a global "scheduler lock", as many CPUs might be fighting for this lock at the same time (and wasting CPU time). The more CPUs there are, the worse it would get - lock contention would cause poor scalability. If your scheduler has a lock for each CPU, then the chance of CPUs fighting for the same lock is a lot less, and performance/scalability would improve. If your scheduler has 6 different locks per CPU (one for each scheduling policy), then there'd be even less lock contention and you'd improve performance/scalability even more.

For my last scheduler, there were 8 scheduling policies with 256 priorities within each scheduling policy. The first (highest) policy was cooperative scheduling and used 256 locks (and 256 FIFO queues), and the remaining 7 policies had one lock each. That's 263 locks per CPU. Even with 1000 CPUs the chance of lock contention was still very low.
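
A rough skeleton of the idea might look like this (the policy count, names, spinlock helpers and idle_thread() here are illustrative, and each policy would really use its own algorithm rather than the plain FIFO shown):

Code: Select all

#define NUM_POLICIES 4                  /* e.g. real-time, interactive, normal, idle */
#define MAX_CPUS     64

struct thread {
    struct thread *next;
    /* ... priority, state, saved registers, ... */
};

struct policy_queue {
    spinlock_t lock;                    /* protects only this one queue */
    struct thread *head;                /* "ready to run" threads for this policy */
};

struct cpu_sched {
    struct policy_queue policy[NUM_POLICIES];
};

static struct cpu_sched cpu_sched[MAX_CPUS];

/* Pick the next thread for this CPU: a higher policy group always wins;
 * lower groups only run when every group above them is empty. */
struct thread *pick_next(int cpu)
{
    for (int p = 0; p < NUM_POLICIES; p++) {
        struct policy_queue *q = &cpu_sched[cpu].policy[p];
        spin_lock(&q->lock);
        struct thread *t = q->head;
        if (t)
            q->head = t->next;          /* dequeue: "ready to run" -> "running" */
        spin_unlock(&q->lock);
        if (t)
            return t;
    }
    return idle_thread(cpu);            /* nothing ready on this CPU */
}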


Cheers,

Brendan

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 1:46 am
by AlfaOmega08
Ready4Dis wrote:For the problem of two cores updating the same values: you really should be using atomic operations (locks, mutexes, semaphores, depending on what you're trying to do) for any variables/code that can be run by many cores at a time.
That won't solve my problem. If two cores stop executing a task and both decide to subtract the time slice from the sleep time, they acquire the lock, both subtract the value (bad), and release the lock. What I meant by "stuck" sleeping threads is that only one AP should be able to decrement the sleep timeout.
Ready4Dis wrote:For sleeping threads, they should only ever have to be woken up if they are requested by another process. If I put my HDD thread to sleep, it doesn't need a keep-alive, it just needs to be notified when someone needs disk access.
What if a thread calls sleep(ms)? No other process is going to request a wake-up for it...
Ready4Dis wrote:Of course, your ideas can easily differ (and it's your OS, so do what makes sense for your final product). There isn't a one-size-fits-all approach: a real-time OS will have very different requirements from a non-RT OS. An OS that has MANY little threads vs. one with a few large threads will have different requirements. There are lots of variables that could make you take one path versus another. I have no idea what your design is or what your expected outcome is, so without any further info it's hard to give anything specific (and even if I had a clue, it's still hard to give specifics on someone else's design).
I would like to make my OS generic and scalable, with a lot of user interactivity (you know, playing movies, surfing the web :-P). It will not be a real-time OS.
bluemoon wrote:Check the wiki for the delta queue. (Timed) sleeping processes are inserted in sorted order, and only the first process in the queue needs to be checked. Insertion is slow, but it uses the sleeping process's own time, and a sleeping process won't complain about being slow. You can also drive this with a one-shot timer.
I just fell in love with this structure... But insertion is O(n) in any case...
gerryg400 wrote:Generally no. The LAPIC timer is perfect for local scheduling. It allows the cores to be scheduled separately. And a core with nothing to schedule can disable its LAPIC and save power until work arrives.
I didn't consider power management... good point. What about this: use the LAPIC timer for local scheduling, one of the HPET one-shot timers to wake up the next thread in the delta queue, and possibly one HPET periodic timer just to keep the date/time.
Brendan wrote:For SMP, you don't want a global "scheduler lock", as many CPUs might be fighting for this lock at the same time (and wasting CPU time). The more CPUs there are, the worse it would get - lock contention would cause poor scalability. If your scheduler has a lock for each CPU, then the chance of CPUs fighting for the same lock is a lot less, and performance/scalability would improve. If your scheduler has 6 different locks per CPU (one for each scheduling policy), then there'd be even less lock contention and you'd improve performance/scalability even more.

For my last scheduler, there were 8 scheduling policies with 256 priorities within each scheduling policy. The first (highest) policy was cooperative scheduling and used 256 locks (and 256 FIFO queues), and the remaining 7 policies had one lock each. That's 263 locks per CPU. Even with 1000 CPUs the chance of lock contention was still very low.
Wouldn't that need separate thread queues for each CPU? Also, what's the point of a scheduler lock for a single CPU? The same CPU can't execute the critical section twice at once. I thought scheduler locks were used to prevent other CPUs from reading/altering the queues while one CPU is reading/modifying them. Doesn't this require a global lock?

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 2:00 am
by Brendan
Hi,
gerryg400 wrote:
Do you believe the HPET periodic timer is more appropriate than the LAPIC timer for scheduling?
Generally no. The LAPIC timer is perfect for local scheduling. It allows the cores to be scheduled separately. And a core with nothing to schedule can disable its LAPIC and save power until work arrives.
Using the local APIC timer for local scheduling is "almost perfect" (and is better than using any other timer).

If one CPU is running at 2 GHz and another CPU is running at 500 MHz (because it got too hot and needs to cool down, for example); then if 2 tasks have the same priority and one gets 10 ms on the first CPU, then the other should get 40 ms on the other (slower) CPU, so that they both get the same fair share of CPU time.
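
As a quick illustration of that proportion (the variable names are made up):

Code: Select all

/* a fair slice on a throttled CPU scales with how slow it currently is */
slice_ms = base_slice_ms * nominal_mhz / current_mhz;   /* 10 ms * 2000 / 500 = 40 ms */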

The best way to do this is probably to use a performance monitoring counter set up to count "instructions retired". The scheduler doesn't let a task run for a certain amount of time, but instead lets a task run for a certain number of instructions. If one task runs for 1 million instructions on one CPU and another task runs for 1 million instructions on another CPU, then the tasks get exactly the same fair share regardless of how much "wall clock" time the tasks actually consumed.
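
As a sketch, on CPUs that support Intel's architectural performance monitoring you could program counter 0 to count instructions retired roughly like this (the MSR numbers and event encoding below are taken from the Intel manuals, but treat them as assumptions and check what your CPU actually supports):

Code: Select all

#include <stdint.h>

#define IA32_PERFEVTSEL0  0x186
#define IA32_PMC0         0x0C1

#define EVT_INST_RETIRED  0xC0         /* architectural event: instructions retired */
#define PERFEVTSEL_USR    (1u << 16)   /* count in user mode */
#define PERFEVTSEL_OS     (1u << 17)   /* count in kernel mode */
#define PERFEVTSEL_EN     (1u << 22)   /* enable the counter */

static inline void wrmsr(uint32_t msr, uint64_t val)
{
    __asm__ volatile("wrmsr" :: "c"(msr), "a"((uint32_t)val), "d"((uint32_t)(val >> 32)));
}

static inline uint64_t rdmsr(uint32_t msr)
{
    uint32_t lo, hi;
    __asm__ volatile("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
    return ((uint64_t)hi << 32) | lo;
}

void pmc_count_instructions(void)
{
    wrmsr(IA32_PMC0, 0);
    wrmsr(IA32_PERFEVTSEL0,
          EVT_INST_RETIRED | PERFEVTSEL_USR | PERFEVTSEL_OS | PERFEVTSEL_EN);
}

/* In the scheduler: snapshot the counter when a task starts running and
 * preempt it once the delta reaches the task's instruction budget. */
uint64_t instructions_retired(void)
{
    return rdmsr(IA32_PMC0);
}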

If performance monitoring counters can't be used, then you can try to keep track of how fast a CPU can currently execute instructions and use the local APIC timer to approximate "instructions retired". This isn't going to be as accurate (especially when things like hyper-threading and Turbo Boost are involved), but it's still going to be better than scheduling "raw time".


Cheers,

Brendan

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 2:07 am
by AlfaOmega08
Brendan wrote:If one CPU is running at 2 GHz and another CPU is running at 500 MHz (because it got too hot and needs to cool down, for example); then if 2 tasks have the same priority and one gets 10 ms on the first CPU, then the other should get 40 ms on the other (slower) CPU, so that they both get the same fair share of CPU time.
Damn. What a mess!

I think I'm missing something about scheduler locks, however... Aren't those supposed to protect the queues from simultaneous access by multiple CPUs? Having a lock per CPU means having a queue per CPU, and thus threads are stuck to a CPU?

Thanks

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 3:24 am
by Brendan
Hi,
AlfaOmega08 wrote:That won't solve my problem. If two cores stop executing a task and both decide to subtract the time slice from the sleep time, they acquire the lock, both subtract the value (bad), and release the lock. What I meant by "stuck" sleeping threads is that only one AP should be able to decrement the sleep timeout.
If both CPUs acquire the same lock at the same time, then your lock is hideously broken. ;)

Now consider a loop that's used for animation or something:

Code: Select all

    while(running) {
        do_something();
        sleep( 100 ms );
    }
How often is "do_something()" called? If "do_something()" always takes 2 ms then it would be called every 102 ms. If "do_something()" takes anywhere from 1 ms to 20 ms then it might be called again after 101 ms or after 120 ms. If other tasks are running it might be called after 543 ms. It's a dodgy/inaccurate piece of crud (that would be continually "drifting"). How do you fix it?

Code: Select all

    next_time = now() + 100 ms;
    while(running) {
        do_something();
        sleep_until( next_time );
        next_time += 100 ms;
    }
Jitter caused by the CPU running other threads might mean that sometimes "do_something()" might be called after 150 ms and sometimes it might be called after 50 ms to catch up; but (as long as the CPU is fast enough) you can guarantee that "do_something()" will be called every 100 ms on average.

For this reason, your kernel needs "sleep_until( when );". You could implement a "sleep( duration )" function like this:

Code: Select all

void sleep( duration ) {
     sleep_until( now() + duration );
}
This also means that tasks sleep until "now() > task->wake_time"; and you don't have to decrease a "time remaining" value for every sleeping task every time the timer IRQ occurs.

This also makes the delta queue idea a bit redundant - you just use a list/queue that is sorted in order of wake time.

To reduce the "O(n)" insertion time, you can have buckets - one list for tasks that should wake up in the next 100 ms, one list for tasks that should wake up between 100 and 200 ms, one list for tasks that should wake up in 200 to 300 ms, etc. If you've got 128 lists/queues (representing 100 ms of time each) then it only covers 12.8 seconds. For delays larger than that you'd need a "longer than 12.8 seconds" list/queue too. Every 100 ms your timer IRQ handler would switch to the next list and discard the old (empty) list, then create a new list by splitting the "longer than 12.8 seconds" list.

This means faster insertion, more queues/lists, more locks, less lock contention, etc. In theory, this can also be "per CPU" (e.g. using local APIC timer for the timer IRQ) to further reduce insertion times; with some other timer (e.g. HPET) used as a backup for when the local APIC is turned off to save power (e.g. where you merge a CPU's queues with the HPET's global queue before turning the local APIC timer off).
AlfaOmega08 wrote:
Brendan wrote:For SMP, you don't want a global "scheduler lock", as many CPUs might be fighting for this lock at the same time (and wasting CPU time). The more CPUs there are, the worse it would get - lock contention would cause poor scalability. If your scheduler has a lock for each CPU, then the chance of CPUs fighting for the same lock is a lot less, and performance/scalability would improve. If your scheduler has 6 different locks per CPU (one for each scheduling policy), then there'd be even less lock contention and you'd improve performance/scalability even more.

For my last scheduler, there were 8 scheduling policies with 256 priorities within each scheduling policy. The first (highest) policy was cooperative scheduling and used 256 locks (and 256 FIFO queues), and the remaining 7 policies had one lock each. That's 263 locks per CPU. Even with 1000 CPUs the chance of lock contention was still very low.
Wouldn't that need separate thread queues for each CPU?
Yes. For example, for my last scheduler there were 263 queues for each CPU.
AlfaOmega08 wrote:Also, what's the point of a scheduler lock for a single CPU?
These are all "ready to run queues". When a task is actually given CPU time, it's removed from the "ready to run" queue (the task becomes "running" and therefore it's no longer "ready to run"). The scheduler decides which CPU a task should use when a task needs to be put back onto a "ready to run" queue. This is how load balancing works (e.g. one CPU might look at it's own queue and say "I'm too busy!" and put the task on a different CPU's queue).

A CPU can only take tasks off of its own "ready to run" queue; but because any CPU can insert a task into any other CPU's run queue you'd still need a lock for each queue.
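
For example (an illustrative sketch; the thread/queue types, spinlock primitives and pick_target_cpu() are assumed):

Code: Select all

struct ready_queue {
    spinlock_t lock;                     /* one lock per queue */
    struct thread *head;
};

static struct ready_queue ready[MAX_CPUS];

/* Any CPU may call this; only the owning CPU ever *removes* tasks
 * from ready[cpu] when it picks what to run next. */
void make_ready(struct thread *t)
{
    int cpu = pick_target_cpu(t);        /* load balancing decision, e.g. least loaded */

    spin_lock(&ready[cpu].lock);
    t->next = ready[cpu].head;
    ready[cpu].head = t;
    spin_unlock(&ready[cpu].lock);
}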

Alternatively you could use IPIs. For example, if one CPU wants to put a task onto another CPU's queue, it could send an IPI to the other CPU to ask it to put the task on its own queue. However, assuming low lock contention, an IPI is more expensive than locks; and because you need to tell the other CPU which task to put on its queue you'd have to store a "thread ID" somewhere, and you'd probably need a lock to protect that (so a third CPU doesn't trash your "thread ID" value while you're sending the IPI).
AlfaOmega08 wrote:Doesn't this require a global lock?
It requires a lock for each queue that any CPU can acquire (but not one global lock for all queues).


Cheers,

Brendan

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 6:44 am
by Ready4Dis
Brendan has summed it up nicely. Yes, I did not mean a single lock for all threads; each CPU should have its own list of threads, and a method of transferring threads from one CPU to another (or taking threads from another CPU).

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 12:23 pm
by AlfaOmega08
Brendan wrote:If one CPU is running at 2 GHz and another CPU is running at 500 MHz (because it got too hot and needs to cool down, for example); then if 2 tasks have the same priority and one gets 10 ms on the first CPU, then the other should get 40 ms on the other (slower) CPU, so that they both get the same fair share of CPU time.
Just realized that, if my knowledge isn't totally wrong, the APIC timer runs at the same frequency as the CPU. So if I calculate the CPU frequency at boot, at full speed, and keep that as the frequency of the LAPIC timer, then if the CPU slows down, the timer will follow it, and threads will get more time.
Brendan wrote:If both CPUs acquire the same lock at the same time, then your lock is hideously broken.
Well, I meant: CPU #0 acquires the lock, decrements, and releases; then CPU #1 acquires it, decrements, and releases. The result is a double decrement, where only one was needed.
Brendan wrote:For this reason, your kernel needs "sleep_until( when );". You could implement a "sleep( duration )" function like this:

Code: Select all

void sleep( duration ) {
     sleep_until( now() + duration );
}
This also means that tasks sleep until "now() > task->wake_time"; and you don't have to decrease a "time remaining" value for every sleeping task every time the timer IRQ occurs.
Using sleep_until, I don't even need a sleep queue, right? Just keep all the threads in one queue, and if now() > wakeup_time, a thread can be run. What about using an HPET one-shot timer to move a thread from the sleep queue to the "ready to run" queue? I still don't get the point... sorry.

However, I'm reading the chapter on multiprocessor scheduling in Silberschatz's OS Concepts. It says that one of the most advanced scheduling algorithms is "multilevel feedback queue scheduling", where each process is classified as "system", "interactive", "background", etc. Each class is a queue with its own scheduling algorithm. Each queue runs its own processes only when the upper queues are out of processes. For example, an interactive process has to wait for all the system processes. Processes can even be moved between queues, changing their priority. In the multiprocessor section it also describes how a CPU can pull processes from other CPUs when it is idle.
It reminds me of your (Brendan's) scheduler. This must be quite a hell to implement. Is this what you described?

But most important: what scheduling algorithms would you implement for each queue? I believe that FIFO, round robin, or my "variable time slice scheduling" would be good enough for background processes.
For interactive processes I would make two subcategories: multimedia interaction (audio and/or video), which needs high responsiveness and would sit just under the system queue, and non-multimedia interactive processes (text editor, image editor, file browser). I believe that multimedia processes should all have the same (high) priority, and so get equal time slices between them with round robin (in the rare case that the user watches two videos simultaneously).

Text editors and the like will spend most of their time waiting for keyboard & mouse events, and there's no point in letting them busy-wait. So the simple-interactive queue will be mostly empty, I guess, and in the case that two processes are crunching some data, "variable frequency scheduling" will suffice, where the process or window that has focus gains higher priority.

System processes will not wait for user interaction, but will often spend time sleeping, waiting for disk access or god knows what. In this case "variable frequency scheduling" should be good too.

Well, since I can't believe that round robin is an optimal algorithm for anything but a toy OS, please dissuade me from using it for background and multimedia processes. Do you see flaws in my design?

Edit: also, do you believe that a high priority task should stop execution of lower priority tasks when it wakes up? E.g., task #0 (system) is sleeping and task #1 (normal, interactive) is running. If task #0 is woken up by the HPET on timeout, should it stop task #1 and take control of the CPU, or should it wait for the next timer IRQ? Excuse the excessive curiosity, but I'm trying to get a complete picture of scheduling algorithms before writing one for my latest OS.

Edit: phew, my 200th post was quite big...

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 12:47 pm
by Griwes
AlfaOmega08 wrote:
Brendan wrote:If both CPUs acquire the same lock at the same time, then your lock is hideously broken.
Well, I meant: CPU #0 acquires the lock, decrements, and releases; then CPU #1 acquires it, decrements, and releases. The result is a double decrement, where only one was needed.
Then your algorithm, in general, is broken. Why would two CPUs try to decrement the same variable? Of course, when multiple cores try to do the same thing, the work done is multiplied; that's why such a job should be assigned to a single CPU (i.e. each CPU decrements only the counters in its own queues).

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 12:54 pm
by AlfaOmega08
Griwes wrote:Then your algorithm, in general, is broken. Why would two CPUs try to decrement the same variable? Of course, when multiple cores try to do the same thing, the work done is multiplied; that's why such a job should be assigned to a single CPU (i.e. each CPU decrements only the counters in its own queues).
My algorithm was designed for single-CPU systems :) The CPU decremented the sleep timeout of each process and woke up those with a timeout <= 0. I was just pointing out how this method is not good for multiple CPUs...

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 12:56 pm
by Griwes
AlfaOmega08 wrote:
Griwes wrote:Then your algorithm, in general, is broken. Why would two CPUs try to decrement the same variable? Of course, when multiple cores try to do the same thing, the work done is multiplied; that's why such a job should be assigned to a single CPU (i.e. each CPU decrements only the counters in its own queues).
My algorithm was designed for single-CPU systems :) The CPU decremented the sleep timeout of each process and woke up those with a timeout <= 0. I was just pointing out how this method is not good for multiple CPUs...
It's fine, as long as you keep a list of sleeping processes per CPU*, not one global queue - then each CPU decrements only its own processes' counters and everybody's happy. However, sleep_until() is definitely the more elegant solution.

* I mean, each process/thread is tied to a processor (with the possibility of being moved, of course) and each core keeps only the processes/threads tied to it in its list.

Re: Multitasking, SMP & Timers

Posted: Wed May 16, 2012 3:01 pm
by gerryg400
do you believe that a high priority task should stop execution of lower priority tasks when it wakes up? E.g., task #0 (system) is sleeping and task #1 (normal, interactive) is running. If task #0 is woken up by the HPET on timeout, should it stop task #1 and take control of the CPU, or should it wait for the next timer IRQ?
Yes, task #0 should run immediately. On a multi-core system the scheduler could perhaps check whether another core could be used, although in an ideal world with an ideal scheduler the HPET interrupt has already occurred on the core running the lowest priority task.