Hi,
Googles wrote:
What algorithms do you use in your CPU scheduling (for finding out which task should be loaded next), are you using a job scheduler too?
I schedule threads only (no job scheduling). Each thread belongs to a scheduling policy, and has a priority within this policy. The first policy (policy 0) is for threads that need very fast response times but don't use much CPU time - IRQ handlers, etc. Policy 0 threads use a co-operative scheduler (highest priority thread that can run does run).
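For policy 0 the idea is just "highest priority runnable thread runs until it blocks". A rough sketch in C (not the actual kernel code - thread_t, the per-priority queues and the field names are made up for the example):

#include <stddef.h>

#define POLICY0_PRIORITIES 256

typedef struct thread {
    int priority;            /* 0 = highest priority within the policy */
    int runnable;            /* non-zero if ready to run */
    struct thread *next;     /* next thread in the same priority queue */
} thread_t;

/* one queue per priority; the highest priority non-empty queue wins */
static thread_t *policy0_queue[POLICY0_PRIORITIES];

thread_t *pick_policy0_thread(void)
{
    for (int p = 0; p < POLICY0_PRIORITIES; p++) {
        for (thread_t *t = policy0_queue[p]; t != NULL; t = t->next) {
            if (t->runnable) {
                return t;    /* runs until it blocks (no time slice) */
            }
        }
    }
    return NULL;             /* nothing in policy 0 - fall through to policies 1 to 3 */
}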
The remaining policies are policy 1 (high), policy 2 (normal) and policy 3 (idle). All of these policies use variable frequency scheduling - for example, a thread with the highest possible priority gets 256 time slices in the same amount of time that a thread with the lowest possible priority gets one time slice. The actual length of a time slice depends on the policy - time slices for policy 2 threads are twice as long as time slices for policy 1 threads, and time slices for policy 3 threads are twice as long as time slices for policy 2 threads (or 4 times longer than policy 1 time slices).
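One way to picture variable frequency scheduling is stride-style selection (this is only an illustration, not lifted from my kernel): each thread has a "pass" value, the thread with the smallest pass runs next, and its pass is advanced by an amount inversely proportional to its priority, so a priority 256 thread gets chosen 256 times as often as a priority 1 thread.

#include <stdint.h>
#include <stddef.h>

#define STRIDE_SCALE (1 << 20)

typedef struct vf_thread {
    uint32_t priority;        /* 1 (lowest) .. 256 (highest) */
    uint64_t pass;            /* accumulated "virtual time" */
    struct vf_thread *next;
} vf_thread_t;

/* pick the runnable thread with the smallest pass, then advance its pass
   by a stride inversely proportional to its priority */
vf_thread_t *pick_variable_frequency(vf_thread_t *list)
{
    vf_thread_t *best = NULL;
    for (vf_thread_t *t = list; t != NULL; t = t->next) {
        if (best == NULL || t->pass < best->pass) {
            best = t;
        }
    }
    if (best != NULL) {
        best->pass += STRIDE_SCALE / best->priority;
    }
    return best;
}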
The scheduler always runs threads from the most important (numerically lowest) policy. Any thread may pre-empt the currently running thread if the currently running thread is in a less important policy. Any policy 0 thread will also pre-empt a lower priority policy 0 thread, but threads in other policies will not pre-empt threads in the same policy.
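In C-ish pseudo-code the pre-emption rule boils down to something like this (structure and names invented for the example):

typedef struct {
    int policy;     /* 0 = IRQ handlers .. 3 = idle */
    int priority;   /* 0 = highest priority within the policy */
} sched_info_t;

/* returns non-zero if a newly runnable thread should pre-empt the current one */
int should_preempt(const sched_info_t *new_thread, const sched_info_t *current)
{
    if (new_thread->policy < current->policy) {
        return 1;                                          /* more important policy */
    }
    if (new_thread->policy == 0 && current->policy == 0) {
        return new_thread->priority < current->priority;   /* only within policy 0 */
    }
    return 0;                                              /* never within policies 1 to 3 */
}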
To prevent (buggy) policy 0 threads from locking up the entire computer, the time they are allowed to run is restricted. If a policy 0 thread exceeds this time limit (which is the same amount of time a policy 1 thread would get for its time slices) it is deemed faulty and terminated. Because my OS is designed for everything from the 80486 up, there is a large CPU performance difference between the slowest and fastest computers it could run on. Therefore my OS does not use fixed time slices. Instead (during boot) it measures the actual performance of the CPU(s) (which has nothing to do with CPU operating frequencies) and calculates time slice lengths from that. This reduces the overhead of thread switching on slow computers while improving the "smoothness" on fast computers.
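A simplified version of the boot-time calculation (the constants, the clamping and the measured work-per-millisecond figure are placeholders, not what the real code uses):

#include <stdint.h>

#define TARGET_WORK_PER_SLICE  500000ULL   /* arbitrary "work units" per slice */
#define MIN_SLICE_MS           1
#define MAX_SLICE_MS           100

static uint32_t slice_length_ms[4];        /* indexed by policy number */

/* work_per_ms comes from a benchmark loop run during boot */
void calibrate_time_slices(uint64_t work_per_ms)
{
    uint32_t base = (uint32_t)(TARGET_WORK_PER_SLICE / work_per_ms);

    if (base < MIN_SLICE_MS) base = MIN_SLICE_MS;
    if (base > MAX_SLICE_MS) base = MAX_SLICE_MS;

    slice_length_ms[0] = base;        /* policy 0 time *limit*, same as a policy 1 slice */
    slice_length_ms[1] = base;        /* policy 1 (high) */
    slice_length_ms[2] = base * 2;    /* policy 2 (normal): twice policy 1 */
    slice_length_ms[3] = base * 4;    /* policy 3 (idle): twice policy 2 */
}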
Add to this some thread blocking states: waiting for message, sleep, deep sleep, stopped, halted (crashed/debugging) and terminated. Then there's code to allow a thread to change its policy and priority, its name, etc. The OS also keeps track of how much time (in ms) each thread has used, and how much time each process has used in each policy (mostly for performance tuning, but it also allows users to see where all the CPU time is going).
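Roughly, the per-thread and per-process bookkeeping looks something like this (names and field sizes are illustrative only):

#include <stdint.h>

typedef enum {
    THREAD_READY,
    THREAD_RUNNING,
    THREAD_WAITING_FOR_MESSAGE,
    THREAD_SLEEPING,
    THREAD_DEEP_SLEEPING,
    THREAD_STOPPED,
    THREAD_HALTED,            /* crashed / being debugged */
    THREAD_TERMINATED
} thread_state_t;

typedef struct {
    thread_state_t state;
    uint8_t  policy;          /* 0..3 */
    uint8_t  priority;        /* priority within the policy */
    char     name[32];        /* thread name (can be changed at run time) */
    uint64_t cpu_time_ms;     /* total CPU time used by this thread */
} thread_info_t;

typedef struct {
    uint64_t cpu_time_ms_per_policy[4];   /* per-process totals, per policy */
} process_accounting_t;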
The multi-CPU versions of the kernel also have CPU affinity (which is actually sub-domain affinity, or "physical chip affinity"), and NUMA domain affinity.
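Affinity itself is just a mask check when deciding where a thread may run - something like this (again simplified, the real structures differ):

#include <stdint.h>

typedef struct {
    uint64_t chip_affinity;    /* bit N set = thread may run on physical chip N */
    uint64_t numa_affinity;    /* bit N set = thread may run in NUMA domain N */
} affinity_t;

int can_run_on(const affinity_t *a, unsigned chip, unsigned numa_domain)
{
    return ((a->chip_affinity >> chip) & 1) &&
           ((a->numa_affinity >> numa_domain) & 1);
}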
My scheduler has evolved over time to avoid common problems. Firstly, the OS is intended for desktop/server use, where real time features are (IMHO) NOT desirable (doing 95% of things very fast and 5% of things slowly is better than doing everything at a guaranteed average speed). Purely co-operative schedulers are severely affected by CPU hogs, variable time slice schedulers perform badly under load (too much time between time slices), and round-robin (no priorities) is always ugly. Variable frequency does have a bit more overhead though.
It's intended that policy 0 is used for "system processes" (device drivers, file system code, etc). Policy 1 is for anything related to the user (GUI, CLI and all user input and video/sound output code in applications), policy 2 is for normal work (compiling, sorting, converting, general processing), and policy 3 is for anything that can be done during idle time that may improve performance during non-idle time (disk defragging, scan-disk, cleaning/testing memory, spell-checking, pre-processing, etc). This way the computer can be completely overloaded without affecting the "responsiveness" of the GUI (I've found typical users don't understand the difference between overall performance and perceived/user interface performance).
Cheers,
Brendan