Hi,
t6q4 wrote:Which do you prefer - TSS-based or software-based? I thought I would hold a poll to see which is the most popular choice to see what I should implement.
I used to use hardware task switching when I was first starting out, but I've used software task switching ever since.
Hardware task switching is a little more "bullet-proof", especially for things like the double fault exception handler (however, it's better not to have any double faults to start with). It's also possible to have any number of tasks with hardware task switching - just modify the TSS descriptors in the GDT during the task switches (see the sketch below).
[@pcmattman: Yes, I have done this and it does work]
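For example, here's a rough sketch of the GDT trick (this isn't code from my OS - "gdt" and the slot number are made up, and in practice you'd want at least two slots because the slot currently loaded in TR still belongs to the outgoing task):

Code:
#include <stdint.h>

extern uint64_t gdt[];     /* the kernel's GDT, mapped somewhere writable */
#define NEXT_TSS_SLOT 5    /* assumed: a GDT slot reused for the next task's TSS */

/* Build a 32-bit TSS descriptor in place. The access byte 0x89 means
   "present, DPL=0, available 32-bit TSS" (rewriting the descriptor also
   clears any stale busy bit). */
static void set_tss_descriptor(unsigned slot, uint32_t base, uint32_t limit)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);              /* limit bits 15:0  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;       /* base bits 23:0   */
    d |= (uint64_t)0x89 << 40;                    /* access byte      */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;   /* limit bits 19:16 */
    d |= (uint64_t)(base >> 24) << 56;            /* base bits 31:24  */
    gdt[slot] = d;
}

Point the reused slot at the next task's TSS with this, then far-jump to the corresponding selector to make the CPU do the actual task switch.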
Someone will probably mention that hardware task switching isn't portable (and won't even work in long mode). I wouldn't worry about this though - regardless of how you do task switches it won't be portable, as different CPUs have different registers that need to be saved/reloaded.
The main problem with hardware task switching is that it's slow. For me, task switches are done by kernel code and the kernel's segment registers are constants, which means the task switch itself never needs to save and reload any segment registers (and the CPU never needs to do the privilege checks and GDT lookups caused by segment register reloads). There's also additional stuff I do during task switches (like accounting for the time used by the task) which means that most of the general registers get saved and reloaded on the stack anyway, so the task switch itself doesn't need to save and load them again. This means that for me, almost everything that hardware task switching would do is unnecessary and can be skipped.
The only things hardware task switching does that are actually useful (for me) are switching ESP and setting the TS flag - but hardware task switching costs 300 or more cycles, and I can do these things myself much, much faster. Something like the sketch below is all it takes.
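To illustrate, a minimal version of that (GCC, 32-bit, cdecl; the names are mine and this is only a sketch of the general technique, not code from my OS) looks something like:

Code:
typedef struct thread {
    unsigned long esp;   /* saved kernel stack pointer - must be first */
    /* ... priority, state, time accounting, etc. ... */
} thread_t;

/* Set CR0.TS so the first FPU/SSE instruction after the switch raises a
   "device not available" exception, letting FPU state be saved/restored
   lazily. */
static inline void set_ts(void)
{
    unsigned long cr0;
    __asm__ volatile("mov %%cr0, %0" : "=r"(cr0));
    __asm__ volatile("mov %0, %%cr0" : : "r"(cr0 | 8));
}

/* switch_task(old, new): save the callee-saved registers on the old task's
   stack, swap ESP, restore the new task's registers, and return into the
   new task. Written as raw asm so the compiler's prologue/epilogue can't
   interfere with the stack swap. */
__asm__(
    ".globl switch_task       \n"
    "switch_task:             \n"
    "    pushl %ebx           \n"
    "    pushl %esi           \n"
    "    pushl %edi           \n"
    "    pushl %ebp           \n"
    "    movl 20(%esp), %eax  \n"   /* eax = old        */
    "    movl %esp, (%eax)    \n"   /* old->esp = ESP   */
    "    movl 24(%esp), %eax  \n"   /* eax = new        */
    "    movl (%eax), %esp    \n"   /* ESP = new->esp   */
    "    popl %ebp            \n"
    "    popl %edi            \n"
    "    popl %esi            \n"
    "    popl %ebx            \n"
    "    ret                  \n"
);
void switch_task(thread_t *old, thread_t *new);

That's about a dozen instructions, with no segment register reloads and no mandatory TSS accesses.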
Of course there's more involved in a task switch than that: working out the time used by the previous task, setting up the CPU's debugging and performance monitoring state (so both can be done "per task"), programming the scheduler's "one shot" timer (sketched below), etc.; but hardware task switching doesn't do any of this anyway.
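Arming that one-shot timer can be as cheap as a couple of writes - e.g. with the local APIC timer (a sketch; "apic" is a made-up name, and it assumes the timer's divide configuration was set during boot):

Code:
#include <stdint.h>

extern volatile uint32_t *apic;   /* memory-mapped local APIC registers */

#define APIC_LVT_TIMER  (0x320 / 4)   /* LVT timer register      */
#define APIC_TIMER_INIT (0x380 / 4)   /* initial count register  */

/* One-shot mode is LVT timer bits 18:17 = 00; writing the initial count
   starts the countdown, and the timer IRQ fires once when it reaches 0. */
static void arm_timeslice(uint32_t ticks, uint8_t vector)
{
    apic[APIC_LVT_TIMER]  = vector;   /* unmasked, one-shot  */
    apic[APIC_TIMER_INIT] = ticks;    /* start counting down */
}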
If (for a specific CPU under specific circumstances) there are 100 cycles of "needed anyway" code, hardware task switching costs 400 cycles, and software task switching costs 50 cycles, then it ends up being 500 cycles vs. 150 cycles per task switch.
Someone might try to point out that the performance difference is negligible - an example of this is our own OSdev wiki (see the "Performance Considerations" part of the wiki's context switching page). This is horrendously wrong...
Basically it assumes that a task runs for all the time it's given (e.g. 10 ms) before being interrupted by the timer IRQ (pipeline flush and switching from CPL=3 to CPL=0), and that the kernel spends ages deciding which task to run next before doing the task switch and returning to the new task (switching from CPL=0 to CPL=3). These assumptions are all wrong:

- Often tasks don't consume all the time they're given; they block waiting for I/O, or are preempted by a higher priority task instead.
- The kernel may not need to spend any time deciding which task to run next; e.g. if you're preempted by a higher priority task, switch immediately to that higher priority task instead of deciding anything (see the sketch below).
- The CPU is often already running at CPL=0 when a task switch is needed, so there's no need to include the cost of doing "CPL=3 -> CPL=0 -> CPL=3" switches and the pipeline flush. This is the case if the task switch was caused by a kernel function (e.g. "sleep()" or "select()") or if the task switch is the result of an IRQ (e.g. a hard disk IRQ that unblocks a task that was waiting for data from the hard drive).
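For instance, the "preempted by a higher priority task" case needs no decision-making at all. A sketch (using the thread_t and switch_task() from above, plus an assumed ready queue and priority field):

Code:
extern thread_t *current_thread;

/* Called (already at CPL=0) when an IRQ handler or kernel function makes
   a blocked task ready to run again. */
void unblock(thread_t *task)
{
    ready_queue_insert(task);                        /* assumed helper */
    if (task->priority > current_thread->priority) {
        /* No scheduling decision, no CPL changes, no pipeline flush -
           switch straight to the higher priority task. */
        thread_t *prev = current_thread;
        current_thread = task;
        switch_task(prev, task);
    }
}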
The type of OS you're doing also makes a difference. Consider a micro-kernel that uses message passing, where a lot of task switches are caused by sending a message to a higher priority task, or by tasks blocking until they receive a message.
For example, the CPU might be running a low priority task when the user presses a key, and the kernel's IRQ handler sends a message to the "keyboard driver" task, causing an immediate task switch. The "keyboard driver" task sends a message to the "GUI task" and blocks, causing a task switch to the "GUI task". The GUI takes a look and sends a message to the active application and blocks, causing a task switch to the active application. The active application does a little processing and sends an "update my video" message to the GUI and blocks. The GUI sends a message to the video driver and blocks, the video driver updates the video and blocks, and the scheduler switches back to the low priority task that was running before the user pressed the key. In this case the overhead of the task switches can make a huge difference to performance - e.g. the entire sequence might take 10000 cycles with software task switching, or 12100 cycles with hardware task switching (21% slower) - that's six task switches, each costing roughly 350 extra cycles.
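Each of those "send a message and block" steps could be a tiny kernel path along these lines (again just a sketch with made-up names - message_t, the mailbox, and the state field aren't from any real OS here):

Code:
enum thread_state { READY, RUNNING, WAITING_FOR_MESSAGE };

/* Deliver a message and block the sender until its next message arrives;
   if the receiver was blocked waiting for a message, switch to it directly. */
void send_and_block(thread_t *dest, message_t *msg)
{
    mailbox_push(&dest->mailbox, msg);              /* assumed helper */
    current_thread->state = WAITING_FOR_MESSAGE;
    if (dest->state == WAITING_FOR_MESSAGE) {
        dest->state = RUNNING;
        thread_t *prev = current_thread;
        current_thread = dest;
        switch_task(prev, dest);   /* one hop in the chain above */
    } else {
        schedule();                /* assumed: pick another ready task */
    }
}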
Note: I made all the numbers above up - in practice the effect on overall performance depends on far too many things (OS design, scheduler design, the exact CPU, the sequence of events, etc); but even "negligibly faster" is better than "negligibly slower" in all possible cases.
Cheers,
Brendan