Task Switcher Logic Ponderings...


Post by astrocrep »

General Task Switching Practices...

I spent some time thinking about exactly how to implement my task switcher, and I have come up with a couple of key points and possible solutions.

1.) Pre-emptive timer granularity. Currently it's at the default, which is something like 18 times a second. However, I have seen people using something as high as 1000 times per second. I am not looking for a real-time system, but I want to be able to multi-task quite well.

2.) Queue separation. I am fairly unsure on this point, but I think it would be in the best interest to separate the queues. This means I would have one queue for all running processes, one queue for all sleeping processes, and one queue for blocked processes. Also, as a footnote, by queue I mean doubly-linked list (see the sketch after this list).

3.) Sleeping threads. When a thread calls Sleep(milliseconds), BigSleep(seconds) or Pause(ticks), the functions will all wrap around a Yield function. The call will convert whatever value is passed into timer ticks, always rounding up instead of down (meaning if a thread wants to sleep for 40 milliseconds and I can only give it 35 or 45, it will sleep for 45; see the sketch after this list). Furthermore, it will attach the ticks_to_sleep value to the thread control structure and change the state to sleeping. Now comes the tricky part: decrementing the sleeping values. Who should do this? At first I thought the scheduler should iterate through the sleeping queue, decrement ticks_to_sleep, and check whether it has reached zero, in which case the thread moves to the ready queue. Then I thought that would be too much logic in the scheduler and would slow it down, so I came up with the idea of running a high-priority task whose only job is to decrement the sleeping tasks, move any that are done to ready, and then yield back to the scheduler. Let me add this (just thought of it): still using a granularity of 1000 ticks per second, the smallest sleep increment would be 10 ticks, so every tenth time into the scheduler we add the check-sleepers task to the head of the queue, and it yields back when it is done.


4.) Scheduling priority. I am looking for something simple; this is a hobby OS, not a next-gen platform, so a simple priority scheme should be fine. What I came up with is this: if my timer frequency is 1000 per second, then a thread's time to run is determined by its priority. If thread_priority was 5, it would be given 50 ticks to run. Simple formula: ticks = 100 - (priority * 10). I can then also hardcode priority 0 (real-time) to run until complete (a bad idea, and it should never happen).
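For points 2 and 3, here is a minimal sketch of what the thread control structure and the round-up conversion might look like (all of the names and constants are made up for illustration):

Code: Select all

struct Thread {
    struct Thread *prev, *next;    /* doubly-linked list node */
    unsigned int   state;          /* RUNNING, SLEEPING or BLOCKED */
    unsigned int   TicksSlice;     /* ticks left in the current slice */
    unsigned int   TicksSliceSize; /* slice length, derived from priority */
    unsigned int   ticks_to_sleep; /* only meaningful while sleeping */
};

#define MS_PER_TICK 5   /* example: a 200 Hz timer gives 5 ms per tick */

unsigned int ms_to_ticks(unsigned int ms)
{
    /* Round up so a thread never sleeps less than it asked for:
       40 ms with a 5 ms tick is 8 ticks, but 42 ms becomes 9 ticks. */
    return (ms + MS_PER_TICK - 1) / MS_PER_TICK;
}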

So basically my scheduler would look something like this:

Code: Select all

void Scheduler(unsigned int OldEsp0)
{
    SchedulerTicks++;

    CurrentTask->TicksSlice--;

    /* Possibly loop through the sleepers here: decrement each
       ticks_to_sleep and move any that reach 0 to the ready queue. */

    if (CurrentTask->TicksSlice == 0) {
        /* Switch to the next process: set up a new CurrentTask and
           reset CurrentTask->TicksSlice = CurrentTask->TicksSliceSize. */
    }
}
Also, when adding a thread to the scheduler, the priority's timeslice_size would be calculated at that time.
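A tiny sketch of that calculation, using the formula from point 4 and assuming priorities 1 through 9 (the function name is made up):

Code: Select all

unsigned int timeslice_for_priority(unsigned int priority)
{
    /* Formula from point 4: priority 1 gets 90 ticks, priority 9 gets 10. */
    return 100 - (priority * 10);
}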

What do you think?

What am I missing?

Thanks,
Rich

Post by Kevin McGuire »

From reading about the Linux scheduler and how it has evolved over time, it seems like this is a logical choice and is quite simple to implement.

The most obvious mechanism is a queue set per priority, plus a bitmap. The bitmap lets each bit represent one of the queue sets, and therefore also a priority. You execute the highest-priority threads first, and when they expire you naturally see the next-highest-priority tasks after the just-expired ones. Once an entire queue has expired all of its threads, you set the appropriate bit to zero in the bitmap and naturally select the next queue set (priority group).

A thread is expired when its time slice is used up, in essence exactly what you are planning to do (above) by your description.

The bitmap could be as simple as uint32_t g_k_sched_pbitmap, where each bit represents a priority; this gives you thirty-two priorities. The queue sets could be defined like so:

struct queSet
{
    threadQue active;
    threadQue expired;
};
queSet g_k_sched_pques[32];
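A minimal sketch of how the bitmap could drive the selection, assuming the declarations above and taking bit 31 as the highest priority (the helper name is made up):

Code: Select all

#include <stdint.h>

int highest_ready_priority(uint32_t bitmap)
{
    int prio;

    /* Scan from the top bit down; the first set bit is the highest
       priority group that still has runnable threads. */
    for (prio = 31; prio >= 0; prio--) {
        if (bitmap & (1u << prio))
            return prio;
    }
    return -1;   /* nothing runnable */
}

On x86 the same scan can be done with a single bsr instruction, which is part of what keeps the selection O(1).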


The second most interesting feature of the Linux 2.6 scheduler, from my reading, is the interactive bonus. Most threads that perform an interactive function, such as providing a graphical user interface, are going to do a lot of blocking. The idea is to compute a factor from the amount of time spent sleeping and provide this as a temporary priority boost, so that interactive tasks are not limited to running only X number of times per second or so.

I am not sure if it would be better to use a sleeping queue for each queue set, or a single sleeping queue for the entire processor/system.
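A very rough sketch of how such a bonus might be computed; the constants and names here are made up for illustration and are not taken from the Linux source (in this sketch a larger number means a higher priority):

Code: Select all

#define MAX_BONUS 5

unsigned int effective_priority(unsigned int base_priority,
                                unsigned int ticks_slept,
                                unsigned int ticks_ran)
{
    unsigned int bonus = 0;

    /* Threads that sleep more than they run look interactive,
       so give them a temporary boost. */
    if (ticks_slept > ticks_ran)
        bonus = (ticks_slept - ticks_ran) / 100;   /* arbitrary scaling */
    if (bonus > MAX_BONUS)
        bonus = MAX_BONUS;

    return base_priority + bonus;
}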

Re: Task Switcher Logic Ponderings...

Post by Brendan »

Hi,
astrocrep wrote:Furthermore, it will attach the ticks_to_sleep value to the thread control structure and change the state to sleeping. Now comes the tricky part: decrementing the sleeping values. Who should do this? At first I thought the scheduler should iterate through the sleeping queue, decrement ticks_to_sleep, and check whether it has reached zero, in which case the thread moves to the ready queue. Then I thought that would be too much logic in the scheduler and would slow it down, so I came up with the idea of running a high-priority task whose only job is to decrement the sleeping tasks, move any that are done to ready, and then yield back to the scheduler. Let me add this (just thought of it): still using a granularity of 1000 ticks per second, the smallest sleep increment would be 10 ticks, so every tenth time into the scheduler we add the check-sleepers task to the head of the queue, and it yields back when it is done.
Have you considered keeping track of when each sleeping task is meant to wake up, instead of keeping track of how much time they have left to sleep?

For example, imagine you've got a counter that keeps track of how many ticks since boot (where 1 tick is 1 ms). If a task decides to sleep for 1 second, you work out that it needs to sleep for 1000 ticks, then add the current tick counter (e.g. 44444) to get the time it should wake up (e.g. 45444). Then you can insert the thread into a queue of sleeping threads at the right spot, so that the queue is sorted in order of when threads should wake up.

Now all you need to do is check this queue when you increment the "ticks since boot" counter. If the first thread on the queue isn't ready to wake up, then none of the other threads will be either (because the queue is sorted). Otherwise you'd wake the first thread up and remove it from the queue, then check the next thread in the queue (and keep waking up threads until you find one that doesn't need to wake up).
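A minimal sketch of that sorted queue, with made-up names (make_ready() is assumed to move a thread onto the ready queue):

Code: Select all

#include <stdint.h>

struct thread {
    struct thread *next;
    uint64_t       wakeTick;    /* value of ticksSinceBoot to wake at */
    /* ... rest of the thread control structure ... */
};

void make_ready(struct thread *t);      /* assumed to exist elsewhere */

static struct thread *sleepQueue;       /* sorted by wakeTick, earliest first */
static volatile uint64_t ticksSinceBoot;

void sleep_insert(struct thread *t, uint64_t ticksToSleep)
{
    struct thread **p = &sleepQueue;

    t->wakeTick = ticksSinceBoot + ticksToSleep;
    while (*p && (*p)->wakeTick <= t->wakeTick)
        p = &(*p)->next;                /* walk to the insertion point */
    t->next = *p;
    *p = t;
}

void timer_tick(void)                   /* called from the timer IRQ */
{
    ticksSinceBoot++;
    while (sleepQueue && sleepQueue->wakeTick <= ticksSinceBoot) {
        struct thread *t = sleepQueue;  /* first thread is due (or overdue) */
        sleepQueue = t->next;
        make_ready(t);
    }
}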


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by astrocrep »

Those are some really good ideas that I am going to implement, thanks to both of you...

However, what's a good timer interval? 1000 per second? 100 per second?

What I plan on doing is the following:

[HW Timer]->[SW Timer Handler]

The timer handler will check whether the active task has had its allotted time... if it has, it will call the scheduler, else it will return back to the current process...
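A rough sketch of that split, reusing the names from my earlier pseudocode (in this version the slice countdown lives in the handler and Scheduler() only performs the actual switch):

Code: Select all

void TimerHandler(void)
{
    SchedulerTicks++;

    /* Only call the scheduler when the active task has used up its
       slice; otherwise just return to the interrupted task. */
    if (--CurrentTask->TicksSlice == 0)
        Scheduler();
}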


But what's the going interval?

Thanks,
Rich

Post by Kevin McGuire »

I have never implemented an advanced scheduler in my kernel (... one of the many versions and rewrites over the years). I think this might be a good idea to try.

There are a lot of articles and more proper technical terms, but to show an aspect of the threads that is important to a scheduler I can depict it like this:

[Low Blocking]<------------------------->[High Blocking]

Generally, from my knowledge, the far left of the spectrum is Low Blocking threads. These are threads that spend almost all of their time doing computations in the CPU registers and RAM. Examples of these types of threads would be long cryptographic computations, scientific simulations, simulations in general, etc. These are threads that will do no blocking (or, in reality, hardly any blocking relative to other threads in the system). They need no interaction, disk I/O, or network I/O.

On the right side of the spectrum we have High Blocking threads. These are threads that spend almost all of their time doing extremely short computations, and in between waiting on user interface events, network events, and disk events.

Some threads may vary widely between the left and right of the spectrum, since some may do network I/O that is short and quick, while others do lengthy transactions and such.

S = 1 / C

The time slice can be S, and the number of threads can be C. This gives each thread an equal amount of time per second to execute; for example, with ten runnable threads each one gets 100 ms of CPU per second. A thread does not have to use its entire slice, since it may go to sleep. The idea is that interactive threads will go to sleep more and wake up in between other threads being continued and paused (switching).

This should allow an interactive thread to have the same time slice S, but spread out more over the second, giving it the ability to handle interactive events better.

If you implement a mechanism that boosts threads based on how much blocking they do, then eventually the Low Blocking threads should execute first during a scheduler's cycle and get out of the way, while later on during the second you should end up with a handful of interactive threads that are constantly waking up, sleeping, waking up, and sleeping.

Post by astrocrep »

Hmm... that's a lot of good ideas!!!

But lastly... what should the tick interval be?

I like that idea... that when a thread goes to sleep, it won't be penalized for its timeslice. So when it wakes up, it will be given more time to finish up...

Thanks
Rich

Post by Brendan »

Hi,
astrocrep wrote:However, what's a good timer interval? 1000 per second? 100 per second?
Selecting the timer interval is a big compromise between overhead and precision.

For a slow CPU and a very fast timer you could spend half of CPU time servicing timer IRQs, which is a lot of overhead. With a very slow timer you'd spend much less time servicing timer IRQs, so the overhead is less.

However, if you're using the timer to measure time delays, then a very fast timer gives you much better precision. For example, imagine trying to measure 10 ms when the timer generates an IRQ every second - you can't do it. With one IRQ per second, you can't even measure 1 second accurately. For example, consider something like this:

Code: Select all

     startTime = getCurrentTimeTick();
     doSomething();
     endTime = getCurrentTimeTick();
     printf("Something took %d seconds\n", endTime - startTime);
In this case (with one IRQ per second), if it reports that "doSomething()" took 1 second, then you won't know if it actually took 0.00001 seconds or 1.99999 seconds or anything in between.

To make this compromise worse, there can be a huge difference between CPU speeds. For example, with a 1 ms time interval you'd have relatively high overhead on a 33 MHz CPU and you could easily afford to have better precision on a 3 GHz CPU.

One method is to dynamically select the timer interval, so that you don't get high overhead on slow CPUs but don't sacrifice precision on fast CPUs.

For example, if you work out that your IRQ handler will take 100 cycles on average, then (during boot) you could measure how fast your CPU is. Then you could do something like:

timer_interval = IRQ_cost / (K * CPU_speed)

In this case K could be anything between 0 and 1, where 0 gives no overhead and no precision, and 1 gives maximum overhead and maximum precision. I'd probably go for "K = 0.001" (or roughly 0.1% overhead).

For example, if the IRQ handler costs an average of 100 cycles and the CPU does 2 billion cycles per second, then:

timer_interval = 100 / (0.001 * 2000000000)
= 1 / 20000
= 50 us

For another example, if the IRQ handler still costs an average of 100 cycles and the CPU does 25 million cycles per second, then:

timer_interval = 100 / (0.001 * 25000000)
= 1 / 250
= 4000 us

This might not be the "best" formula to use though - it gives constant overhead where CPU speed determines precision, which is the opposite of using the same timer interval for all computers (constant precision where CPU speed determines overhead).

If you're thinking of doing dynamic timer interval calculation, then I'd recommend playing with a spreadsheet for a while and finding a formula you like... ;)
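For instance, the formula above could be coded up like this (a sketch only; the function name is made up, and cpu_hz and irq_cost_cycles are assumed to be measured during boot):

Code: Select all

#include <stdint.h>

uint32_t pick_timer_interval_us(uint64_t cpu_hz, uint32_t irq_cost_cycles)
{
    const double K = 0.001;   /* target roughly 0.1% overhead */

    /* timer_interval (seconds) = IRQ_cost / (K * CPU_speed) */
    double seconds = (double)irq_cost_cycles / (K * (double)cpu_hz);

    return (uint32_t)(seconds * 1000000.0);   /* convert to microseconds */
}

/* pick_timer_interval_us(2000000000, 100) gives 50 us, and
   pick_timer_interval_us(25000000, 100) gives 4000 us,
   matching the two worked examples above. */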


Cheers,

Brendan

Post by bewing »

I'd suggest giving your superuser a "slider" so they can adjust the responsivity of their own system themselves.

Post by astrocrep »

I like the idea of a slider down the road...

Truth be told... my os will never run on a 386... it will always be running on new hardware... when (if) I ever run it on hardware... it will be modern.

So for the time being, I just set the frequency to 1000...

Question: Should my scheduler use a mutex lock? (I think that would be really bad, because a mutex lock will yield the thread (return it to the scheduler) if it has to wait... so that would create a HUGE mess.)

Thanks,
Rich

Post by frank »

Well, if you were to use a mutex lock, you could, instead of calling yield, just return from the scheduler and not do anything if the lock was taken. That would return you to the thread that must hold the lock and allow it to continue execution to the end.
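A tiny sketch of that idea (the lock helpers are made-up names, assumed to be a simple test-and-set):

Code: Select all

void Scheduler(void)
{
    /* If the interrupted thread holds the lock, skip this switch;
       returning lets it finish its critical section. */
    if (!try_acquire_scheduler_lock())
        return;

    /* ... pick the next task and switch to it ... */

    release_scheduler_lock();
}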

Post by Brendan »

Hi,
astrocrep wrote:I like the idea of a slider down the road...

Truth be told... my os will never run on a 386... it will always be running on new hardware... when (if) I ever run it on hardware... it will be modern.
That doesn't change too much. If you've got a slider you still need a "good" default, as most people won't touch the slider (and some people will constantly be playing with it without really knowing what they're doing ;) ).

For modern computers, take a look at what's available now - there's still a performance difference between a top of the line CPU being used with fast RAM, and an "economy" mobile processor with smaller caches using slow RAM in a laptop.

For me, I dynamically determine the (CMOS/RTC) timer speed during boot, and it is only used to keep track of real time, not for scheduling.

My scheduler determines the maximum time slice length to use at each task switch, which can be based on current CPU speed (which changes due to power management), the priority of the task being switched to, the rest of the load on the CPU, etc. For this I use the local APIC timer (or the PIT if there's no local APIC) in "one-shot" mode to improve scheduler precision and reduce overhead.

For example, the scheduler can set the timer to generate an IRQ after 100 ms for one task, then set the timer to generate an IRQ after 10 ns for the next task. In this case it gets a maximum of 2 IRQs (if the tasks don't block and aren't preempted). For "periodic" timing (with a 1 ms timer interval), for the first task the scheduler gets 100 IRQs where 99 of them aren't needed (which messes up the CPUs instruction pipeline, etc), and for the second task the scheduler has to give the task 1 ms (instead of 10 ns) due to lack of precision.

It also means that when the CPU is completely idle (e.g. doing HLT while waiting for something to happen) it isn't being repeatedly woken up by the scheduler's IRQ (the scheduler can have no time slice length limit in this case), which can help to reduce power consumption.
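As a very rough sketch, arming the local APIC timer in one-shot mode for a single time slice might look something like this (the names and the calibration variable are assumptions; the register offsets are the standard ones from the Intel SDM):

Code: Select all

#include <stdint.h>

#define LVT_TIMER      0x320    /* offsets from the local APIC base */
#define INITIAL_COUNT  0x380
#define DIVIDE_CONFIG  0x3E0

static volatile uint32_t *apic;     /* assumed: local APIC mapped elsewhere */
static uint32_t apic_ticks_per_us;  /* assumed: calibrated at boot, divide by 16 */

static void apic_write(uint32_t reg, uint32_t val)
{
    apic[reg / 4] = val;
}

/* Arm a one-shot IRQ on the given vector after slice_us microseconds;
   writing the initial count starts the countdown. */
void arm_timeslice(uint32_t slice_us, uint8_t vector)
{
    apic_write(DIVIDE_CONFIG, 0x3);    /* divide by 16 */
    apic_write(LVT_TIMER, vector);     /* timer mode bits left at 0 = one-shot */
    apic_write(INITIAL_COUNT, slice_us * apic_ticks_per_us);
}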
astrocrep wrote:Question: Should my scheduler use a mutex lock? (I think that would be really bad, because a mutex lock will yield the thread (return it to the scheduler) if it has to wait... so that would create a HUGE mess.)
For the lock itself, use a spinlock that disables IRQs. For example:

Code: Select all

    cli                               ; disable IRQs before trying to take the lock
    lock bts dword [schedulerLock],0  ; atomically test-and-set bit 0
    jnc .gotLock                      ; bit was clear, the lock is now ours
    sti                               ; lock was taken, re-enable IRQs while waiting

.wait:
    pause                             ; be friendly to hyper-threading and the bus
    test dword [schedulerLock],1      ; spin with plain reads until it looks free
    jne .wait

    cli                               ; looks free, try to take it for real
    lock bts dword [schedulerLock],0
    jnc .gotLock
    sti
    jmp .wait                         ; someone beat us to it, keep waiting

.gotLock:

I'd split the scheduler into 2 parts. The first part decides which task to switch to, and the second part switches to a selected task. This is good when a higher priority task becomes "ready to run" and preempts a lower priority task, as you can completely skip the first part that decides which task to switch to, and switch directly to the higher priority task.

For the part that switches to a selected task, assume that the caller has already acquired the lock, switch to the new task, then free the lock.

For the part that decides which task to switch to there are 2 methods. The first method is to get the lock, select which task to switch to, and call the function that does the actual task switch (and frees the lock). For example:

Code: Select all

void switchToNextTask() {
    TASKID newTask;

    getSchedulerLock();
    newTask = /* choose a new task */;
    switchToSpecificTask(newTask);    /* switches tasks and frees the lock */
}
The second method is to not lock anything, select a task to switch to, then get the lock and check if the data you relied on has changed. If the data has changed you'd free the lock and retry. If the data didn't change you've already got the lock and just call the function that does the actual task switch (and frees the lock). For example:

Code: Select all

void switchToNextTask() {
    int accessedCounter;
    TASKID newTask;

    for(;;) {
        accessedCounter = schedulerQueueUpdateNumber;

        newTask = /* choose a new task */;

        getSchedulerLock();
        if(accessedCounter == schedulerQueueUpdateNumber) break;  /* data unchanged */
        freeSchedulerLock();                                      /* changed, retry */
    }
    switchToSpecificTask(newTask);
}
The first method is much easier, but leaves IRQs disabled for longer periods of time. The second method reduces the amount of time IRQs are disabled, but it can be very difficult to get everything right, as the data you're using to select which task to run next can change while you're reading it, and you need to be extremely careful when writing code that modifies this data.

If the code to select which task to run next is very fast (e.g. "O(1)") then forget about the second method. If there's some reason why the code to select which task to run next isn't very fast (or can't be changed to something that's very fast), then I'd recommend using the first method to begin with (and then consider upgrading to the second method later on, when everything else in your scheduler and all your IPC code is complete).


Cheers,

Brendan