Round-Robin

HardEnough

Round-Robin

Post by HardEnough »

Hi guys,
if someone wants to use the round-robin scheduling algorithm, is it going to be buggy for things that need real-time behaviour, such as video and sound?
DruG5t0r3

Re:Round-Robin

Post by DruG5t0r3 »

Well... it can be a sort of round-robin with preemption and work fine. But with just hard-core round-robin, you would run into problems.
Pype.Clicker
Member
Posts: 5964
Joined: Wed Oct 18, 2006 2:31 am
Location: In a galaxy, far, far away

Re:Round-Robin

Post by Pype.Clicker »

If you mean "round robin" as in "put newly woken threads at the tail of the scheduler queue and process what's at the head of the queue", then yes, it's gonna doom any multimedia needs as soon as you have a CPU-bound task running (e.g. a compilation), unless you have a sufficiently small time quantum so that

(max # of live processes) x (time quantum) < (max delay affordable between two slices for the multimedia stuff)

but that usually leads to
- a time quantum so small that you virtually can't afford it because of the switch overhead
- a maximum number of live processes so small that you can't do any real "background" business
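
To put rough numbers on that (purely illustrative, not from the post itself): if the multimedia task must get a slice at least every 30 ms and up to 50 processes can be live, the quantum has to stay below 30 ms / 50 = 0.6 ms, which is already small enough that context-switch overhead starts to dominate.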

On my side, I have two queues: a prioritized one with a very small quantum for threads that have just woken up, and another one with the usual time quantum. It doesn't work too badly so far, and it dramatically speeds up e.g. the response to mouse events in the screen server.
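
A minimal sketch of that two-queue idea (the names and quantum values here are made up for illustration, not the actual code): freshly woken threads land on a short-quantum queue that is always drained first, everything else round-robins on the normal queue.

```c
#include <stddef.h>

typedef struct thread {
    struct thread *next;
    /* ... saved registers, kernel stack pointer, etc. ... */
} thread_t;

typedef struct {
    thread_t *head, *tail;
} queue_t;

static queue_t wake_queue;   /* just-woken threads: small quantum, served first */
static queue_t normal_queue; /* everyone else: usual quantum, plain round-robin */

#define WAKE_QUANTUM_MS    1
#define NORMAL_QUANTUM_MS 20

static void enqueue(queue_t *q, thread_t *t)
{
    t->next = NULL;
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
}

static thread_t *dequeue(queue_t *q)
{
    thread_t *t = q->head;
    if (t) {
        q->head = t->next;
        if (!q->head) q->tail = NULL;
    }
    return t;
}

/* Called when a blocked thread becomes runnable again. */
void sched_wake(thread_t *t)
{
    enqueue(&wake_queue, t);
}

/* Called when a running thread's time slice expires. */
void sched_expire(thread_t *t)
{
    enqueue(&normal_queue, t);
}

/* Pick the next thread; *quantum_ms tells the caller how long it may run. */
thread_t *sched_pick_next(int *quantum_ms)
{
    thread_t *t = dequeue(&wake_queue);
    if (t) {
        *quantum_ms = WAKE_QUANTUM_MS;   /* freshly woken: short, prioritized slice */
        return t;
    }
    *quantum_ms = NORMAL_QUANTUM_MS;     /* otherwise: ordinary round-robin slice */
    return dequeue(&normal_queue);
}
```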
HardEnough

Re:Round-Robin

Post by HardEnough »

Yep, I mean that there is no priority; all processes are the same, all having the same nice value.
So I need to decrease the maximum number of processes to allow the video to run smoothly, or I have to decrease the quantum to allow fast switching.
What quantum does video need so it runs smoothly?
Is it less than 100 ms or more?
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re:Round-Robin

Post by Brendan »

Hi,
HardEnough wrote: Yep, I mean that there is no priority; all processes are the same, all having the same nice value.
So I need to decrease the maximum number of processes to allow the video to run smoothly, or I have to decrease the quantum to allow fast switching.
What quantum does video need so it runs smoothly?
Is it less than 100 ms or more?
For "smooth" video you'd want to make sure that the display can be updated in less than 30 mS (for e.g. 30 mS from when the user presses a key until the user can see the change on the screen). If you're using a 1 mS quantum but running 100 tasks then it might take 99 mS before a task responsible for sending video data to the video driver gets any CPU time.

If you're thinking of running device drivers as separate tasks, then it'll never work (you won't be able to respond to the devices quickly enough), and unless you can guarantee that there are never more than 30 tasks running (at 1 ms each) a GUI will look sluggish.

If you use a variable quantum (where the amount of time each task gets depends on its priority) you'd have the same problem - "responsiveness" will still suck.

One solution would be to reduce the quantum to something extremely small - e.g. divide the desired response time by the maximum number of tasks that could be running. In this case you can easily end up with a quantum so small that the resulting switching rate is too fast for the CPU to handle (i.e. the CPU spends most of its time doing task switches).

IMHO it's impossible to get good "responsiveness" (under load) without allowing some tasks to take precedence over others - e.g. if task A or task B can run, then task C, task D and task E will not get any CPU time. There are a lot of different ways this can be done. The simplest method would be "the highest priority task that can run does run", which can have other problems (CPU hogs, and deciding what priority each task should have).

For a more complex example, my OS has 4 separate groups of tasks called "policies". Each thread has a policy and a priority. Each policy takes precedence over lower policies and has its own separate scheduling algorithm. Each policy is intended for a specific use - policy0 = device drivers, policy1 = normal threads that need fast response times, policy2 = normal threads and policy3 = idle threads.

The idea is that a typical application would use one or more policy 2 and/or policy 3 threads to do actual work, and then have a policy 1 thread for the user interface. This way when a user presses a key or moves the mouse it's handled almost immediately by the policy 1 thread and the user interface is always responsive (I can have a thousand threads running in the lowest 2 policies without having any effect at all on the "responsiveness" of the higher 2 policies, and device driver threads are always extremely responsive).
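
A rough sketch of what that policy split could look like (the shape is guessed for illustration - it is not the actual kernel code, and within each policy it simply uses "highest priority first, FIFO among equals" rather than the "variable frequency" algorithm described later in the thread): each thread carries a policy and a priority, and lower-numbered policies always take precedence.

```c
#include <stddef.h>

#define NUM_POLICIES   4   /* 0 = drivers, 1 = responsive, 2 = normal, 3 = idle */
#define NUM_PRIORITIES 32

typedef struct thread {
    struct thread *next;
    int policy;     /* 0..3, lower number takes precedence       */
    int priority;   /* 0..31, higher number wins within a policy */
} thread_t;

/* One FIFO run queue per (policy, priority) pair. */
static thread_t *runq[NUM_POLICIES][NUM_PRIORITIES];

void sched_ready(thread_t *t)
{
    /* Append to the tail of its queue (O(n) walk; a real kernel
     * would keep a tail pointer, but this keeps the sketch short). */
    thread_t **p = &runq[t->policy][t->priority];
    t->next = NULL;
    while (*p)
        p = &(*p)->next;
    *p = t;
}

thread_t *sched_pick_next(void)
{
    /* Lower-numbered policies always beat higher-numbered ones;
     * within a policy, higher priority wins, FIFO among equals.  */
    for (int policy = 0; policy < NUM_POLICIES; policy++) {
        for (int prio = NUM_PRIORITIES - 1; prio >= 0; prio--) {
            thread_t *t = runq[policy][prio];
            if (t) {
                runq[policy][prio] = t->next;
                return t;
            }
        }
    }
    return NULL;    /* nothing runnable: run the idle loop */
}
```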

Of course this all depends on what your OS is being used for. For a specific embedded system round robin might be appropriate, or possibly a simple OS where only one application can be running at a time (an OS designed for games, perhaps). In all other cases pure round robin is IMHO ugly and unsuitable (which arguably includes "hobby OSs" where the intent is to learn, if you want to learn well).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re:Round-Robin

Post by Solar »

My favourite OS of all time (let's see if I can get through this without actually naming it...) used round-robin with static priorities: the CPU was scheduled to the runnable task with the highest priority. If several runnable tasks had the same priority, the CPU was round-robined among them.

From that point on, it was all a matter of handing out sensible priority values to the system tasks. User tasks had a priority of zero (the range being 127 to -128) by default; setting it higher than 5 was discouraged, as that was what the GUI ran at (IIRC).

That system worked very efficiently, considering the hardware, and was well-known for its multimedia prowess. (Just set multimedia apps to 1 and you're safe from all the number-crunching / compiler tasks...)

However, that system also "benefitted" from not having any memory protection (i.e., rapid context switches and extremely efficient messaging).
Every good solution is obvious once you've found it.
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re:Round-Robin

Post by Solar »

I did it! I wrote that post without ever mentioning "Amiga"!

Oops...

;D
Every good solution is obvious once you've found it.
HardEnough

Re:Round-Robin

Post by HardEnough »

So what is the best scheduling algorithm that doesn't take too much time when context switching takes place and allows real-time applications to run smoothly?
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re:Round-Robin

Post by Solar »

HardEnough wrote: So what is the best scheduling algorithm that doesn't take too much time when context switching takes place and allows real-time applications to run smoothly?
Solar's Real Time Bot: running "real time" means that the OS guarantees certain maximum response / turnaround times, and cries bloody murder if it cannot do so for some reason. People frequently confuse "real time" with "good multimedia performance" - the latter does not require an OS to be "real time", just to be efficient with low latencies. (The occasional skipped frame doesn't really matter, whereas a real time system won't even start the video player if there is a chance that it has to skip a frame.)

So you are asking for the "optimum" scheduling algorithm when the task at hand is multimedia? Well, that's the subject of university courses, corporate research, personal opinion and much religion...
Every good solution is obvious once you've found it.
HardEnough

Re:Round-Robin

Post by HardEnough »

All of my experience is from Dr. Tanenbaum's book "Operating Systems: Design and Implementation", and now I'm reading "The Design and Implementation of 4.4BSD".
I found that most scheduling algorithms are old, some from the 60s and 70s, so I'm asking for your own way of doing this stuff ;-)
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re:Round-Robin

Post by Solar »

Ingo Molnar implemented a scheduler for Linux that does scheduling in O(1), i.e. takes a constant time to decide which task to run no matter how many tasks there are in the system.
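
The usual trick for a constant-time pick (a sketch of the general idea only - this is not the actual Linux code) is to keep one run queue per priority level plus a bitmap of non-empty queues, so "find the highest-priority runnable task" becomes "find the first set bit":

```c
#include <stdint.h>
#include <stddef.h>

#define NUM_PRIOS 32

typedef struct thread {
    struct thread *next;
    int prio;                  /* 0 = highest priority, 31 = lowest */
} thread_t;

static thread_t *queue_head[NUM_PRIOS];
static thread_t *queue_tail[NUM_PRIOS];
static uint32_t  ready_bitmap;           /* bit n set <=> queue n is non-empty */

void sched_ready(thread_t *t)
{
    int p = t->prio;
    t->next = NULL;
    if (queue_tail[p]) queue_tail[p]->next = t; else queue_head[p] = t;
    queue_tail[p] = t;
    ready_bitmap |= 1u << p;             /* O(1): mark this queue non-empty */
}

thread_t *sched_pick_next(void)
{
    if (ready_bitmap == 0)
        return NULL;                     /* nothing runnable */

    /* Lowest set bit = highest-priority non-empty queue, found in O(1). */
    int p = __builtin_ctz(ready_bitmap);

    thread_t *t = queue_head[p];
    queue_head[p] = t->next;
    if (!queue_head[p]) {
        queue_tail[p] = NULL;
        ready_bitmap &= ~(1u << p);      /* queue emptied: clear its bit */
    }
    return t;
}
```

Note that both the "ready" path and the "pick next" path stay O(1) in this sketch.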

And I was impressed by the idea of "directed yields", i.e. one task yielding to another task (increasing that task's priority, or letting it run the rest of the timeslice, or something similar) because the yielding task "knows" that it cannot run further before that other task finishes.

Other than that I am not aware of any "breakthrough" in the art of scheduling. You have priorities or not; you adapt them dynamically depending on runtime behaviour or leave them static; you have one or more ready queues; you do special stuff to tasks using up their timeslices or not... it's like with programming languages: Many new ones appeared, but in the end there hasn't been much fundamentally new stuff.
Every good solution is obvious once you've found it.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re:Round-Robin

Post by Brendan »

Hi,
Solar wrote:Ingo Molnar implemented a scheduler for Linux that does scheduling in O(1), i.e. takes a constant time to decide which task to run no matter how many tasks there are in the system.
I'm not familiar with Ingo's/Linux's new scheduler, but it's worth considering the performance of the code for deciding what runs next, the code needed when a thread becomes ready to run, and any other code in the scheduler's path. For example, it'd be possible (for an inefficient algorithm) to shift a lot of work from the "deciding what runs next" code into the "thread becomes ready to run" code and end up with an "O(1) when deciding what runs next" scheduler that performs very badly due to horribly inefficient "when a thread becomes ready to run" code.
HardEnough wrote: All of my experience is from Dr. Tanenbaum's book "Operating Systems: Design and Implementation", and now I'm reading "The Design and Implementation of 4.4BSD".
I found that most scheduling algorithms are old, some from the 60s and 70s, so I'm asking for your own way of doing this stuff ;-)
My way was to have 4 separate scheduling policies, with a (theoretically) different scheduling algorithm for each policy. The first policy was an O(1) "the highest priority task that can run does run" scheduler, while the remaining policies used a "variable frequency" scheduling policy that was meant to reduce the amount of time a thread needed to wait before getting CPU time (as compared to variable time slice). Unfortunately, in the worst case, the "variable frequency" algorithm was O(2*N). I did think of a way to make it O(1) but the resulting complexity was objectionable.

A recent attempt to replace the variable frequency parts of the scheduler with more efficient algorithms didn't work, as the scheduling code is too embedded into everything else (the kernel wasn't modular enough to make replacement easy). Since then I've decided to split the micro-kernel into several separate "kernel modules", so I can have several different "scheduler" modules and choose which to use when the OS is installed. It seems like a small reason for a large change, but there are other factors (see http://bcos.hopto.org/cgi-bin/forum.cgi?B2 if you're curious).

In any case, my approach was to split things into categories that correspond to expected "CPU load characteristics". For example, I run device drivers at CPL=3 as normal threads and need very high priority threads that get CPU time as soon as possible for the IRQ handlers. This became "policy0", using a "the highest priority task that can run does run" scheduler. Then there are threads that interact with the user (or threads that need fast response times but don't really hog the CPU); those became Policy1. For worker threads (heavy calculation, compiling, etc., where CPU time is needed but fast response times aren't important) I use Policy2, and for things that should only be done during idle time I use Policy3. I don't support "real time" threads, as I can't think of a worthwhile reason for them (as opposed to "best effort") in a desktop/server OS where the hardware gives no timing guarantees anyway.

My advice is that if you can categorize things into expected "CPU load characteristics", it makes it easy to decide how your scheduler should treat each category, and how each category should interact with the other categories. I'd also suggest making things modular so that you can replace your scheduler easily later if you need to (you'd think I'd have learnt that by now :)).
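
One way to get that kind of modularity (a sketch with made-up names - this is not an actual kernel-module interface from the thread) is to hide the scheduler behind a small table of function pointers, so a different scheduler module can be dropped in without touching the rest of the kernel:

```c
typedef struct thread thread_t;

/* Everything the rest of the kernel is allowed to ask of the scheduler. */
struct scheduler_ops {
    void      (*ready)(thread_t *t);        /* a thread became runnable        */
    thread_t *(*pick_next)(void);           /* choose what to run next         */
    void      (*tick)(thread_t *current);   /* timer tick / quantum accounting */
};

/* Selected once, e.g. at boot or install time. */
extern const struct scheduler_ops *sched;
```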


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
HardEnough

Re:Round-Robin

Post by HardEnough »

Your idea is simple and great. I don't know why, but even though Linux 2.6 uses the O(1) scheduler, BSD is much faster.
I run a FreeBSD 5.4 custom-built kernel from CVS with the 4.4BSD scheduler and it runs faster than Slackware 10.1 with a custom-built 2.6 kernel, which of course is faster than WinXP.
I think it's the swapping, because I hear the hard disk more with Linux than with FreeBSD.
I have a P4 2.6 GHz with 256 MB of RAM.
Pype.Clicker
Member
Posts: 5964
Joined: Wed Oct 18, 2006 2:31 am
Location: In a galaxy, far, far away

Re:Round-Robin

Post by Pype.Clicker »

the scheduler is of course not the only thing that has to be taken into account when considering the "speed" of a system: memory load (and thus how frequently the system will have to swap), disk caching/swapping algorithms, implementation of system calls (using INT nn or SYSENTER), optimization of the kernel towards SMP or single processor systems, etc. can be important too.

Not to mention the "heaviness" of the GUI: turn off GPU acceleration, or try to redraw the content of a window while you resize/move that window, and your "fastest" system suddenly looks slow...
distantvoices
Member
Posts: 1600
Joined: Wed Oct 18, 2006 11:59 am
Location: Vienna/Austria

Re:Round-Robin

Post by distantvoices »

I'd also have a look at which task *needs* to run prior to another one to get work done properly. I reckon drawing a call path pyramid isn't a bad thing per se. Tim Robinson once stated: round robin just occurs once in a while - most of the time a task blocks because it wants I/O done or something else.

Just my opinion. Better to concentrate on having your task priorities set in a good order so the system runs smoothly than to fiddle around too much with round robin - which is as simple and straightforward as taking the current runnable task upon time slice expiration, putting it on the expired queue and taking the next one. Voila.
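
That active/expired mechanism in a minimal sketch (names are illustrative only): when the whole active queue has been used up, the two queues swap roles and the next round begins.

```c
#include <stddef.h>

typedef struct task {
    struct task *next;
} task_t;

typedef struct {
    task_t *head, *tail;
} queue_t;

static queue_t active;   /* tasks that still have a slice left in this round    */
static queue_t expired;  /* tasks whose slice ran out; they wait for next round */

static void push(queue_t *q, task_t *t)
{
    t->next = NULL;
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
}

static task_t *pop(queue_t *q)
{
    task_t *t = q->head;
    if (t) {
        q->head = t->next;
        if (!q->head) q->tail = NULL;
    }
    return t;
}

/* Timer tick: the current task's time slice just expired. */
task_t *on_timeslice_expired(task_t *current)
{
    push(&expired, current);             /* park it until the next round    */

    if (!active.head) {                  /* round over: swap the two queues */
        queue_t tmp = active;
        active = expired;
        expired = tmp;
    }
    return pop(&active);                 /* next task to run, plain FIFO    */
}
```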

@Pype: Hi! How's it going? Regarding the move-window stuff: I'm still experimenting with it. Currently I keep the moved window in a separate buffer which I blit onto the screen while dragging the thing. Kinda funny, but still not what I want. It's still sloppily implemented - I need to think about it thoroughly. :-) Or use an outline rect.
... the osdever formerly known as beyond infinity ...
BlueillusionOS iso image