Hi,
Octacone wrote:This is what I've learned:
*Each task should be given a time slice aka quantum, higher priority tasks get smaller quantum while lower priority ones get bigger.
*Interrupts should only preempt tasks if they run out of their quantum aka don't block on time.
Most interrupts (e.g. disk controller's IRQ, network card's IRQ, USB controller's IRQ, etc) should (possibly indirectly, after a sequence of things) cause the currently running task to be preempted (without waiting for the currently running task's time slice to end) whenever they cause a higher priority task to unblock.
For example (for a monolithic kernel); maybe you get an IRQ from the USB controller; so the USB controller driver does some work and sends some data to a USB flash driver, then the USB flash driver does some work and sends some data to some file system code, then the file system code does some work and sends some data to the VFS layer, then the VFS layer does some work and realises that 4 different tasks were all waiting for that data (e.g. because those tasks called "read()" or maybe because of page faults in a memory mapped file area) and tells the scheduler to unblock those 4 tasks. Then, after all these separate pieces have each done a little work (that began with an IRQ), the scheduler realises that one of the tasks that was unblocked has a higher priority than the currently running task and does an immediate task switch.
The timer interrupt is a little special because it's more direct (there's no "driver calls driver that calls driver that..." involved); and because it handles 2 different causes of task switches - waking up sleeping tasks (e.g. if a task calls "sleep(2);" then the task will block, and then 2 seconds later the timer interrupt will realise that 2 seconds have passed and unblock the task again) and checking for "end of time slice".
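For a very rough idea of what the timer IRQ handler ends up doing, something like this (a sketch only - "ticks", "current_task", "sleeping_list", the field names and the helper functions are all assumptions, EOI handling is omitted, and it uses the "unblock_task()" and "Schedule()" functions described further down):
Code: Select all
void timer_irq_handler(void) {
    ticks++;

    // Cause 1: wake up any sleeping tasks whose wake time has arrived
    task_info * task = sleeping_list;
    while(task != NULL) {
        task_info * next = task->next;          // remember "next" before unlinking
        if(task->wake_time <= ticks) {
            remove_from_sleeping_list(task);    // hypothetical helper
            unblock_task(task);                 // may preempt the currently running task
        }
        task = next;
    }

    // Cause 2: check for "end of time slice"
    if(current_task->time_slice_remaining > 0) {
        current_task->time_slice_remaining--;
    } else {
        Schedule();                             // time slice used up; pick another task
    }
}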
Octacone wrote:*Most of the task switches don't happen because of the timer but because the task called System_Call_Block_For_This_Thread_And_Put_It_To_Sleep_And_Switch_To_Another_Thread_While_Waiting();
I'd say that (for crude guesses - it depends too much on what kinds of programs the user feels like running, the scheduler's algorithms, etc):
- 50% of task switches are caused by the currently running task blocking (either because it called a kernel API function like "sleep()" or "read()" or...; or because of a page fault where the page fault handler had to make the task wait for data or memory)
- 25% of task switches are caused by a higher priority task unblocking and preempting the currently running task
- 10% of task switches are caused by a task using its entire time slice
- 8% of task switches are caused by tasks being terminated (or crashing)
- 4% of task switches are caused by tasks being spawned
- 3% of task switches are caused by miscellaneous stuff (e.g. tasks reducing their own priority)
Note that I'd recommend having 2 general purpose functions in your scheduler - one that other kernel code can use for blocking the currently running task, like "block_task(int reason);"; and another that other kernel code can use for unblocking a task, like "unblock_task(task_info * task);".
For "block_task(int reason);" the scheduler would update the currently running task's state (set the reason why the task is blocked) and remove the task from whichever list of list of ready to run tasks it was on, and then call "Schedule()" to choose another task to give CPU time to.
For "unblock_task(task_info * task);"; the scheduler would update the requested task's state (e.g. set the reason why the task is blocked to "none") and put the task back on a list of tasks that are ready to run; then check if the unblocked task should preempt or not, and if it should preempt the scheduler would call "goto_task()" directly (skipping the overhead of choosing which task to switch to because it already knows).
Also don't forget that a lot of the time (especially for micro-kernels - not quite so much for monolithic kernels) task switches are caused by tasks waiting to receive data from other tasks, and then unblocking when some other task sends data to them; where no devices (and no IRQs) are involved at all. For a simple example, one task might call "printf()" causing it to write data to STDOUT, and then a shell or a terminal emulator might be unblocked because it was waiting to receive that data, and because the shell or terminal emulator is likely to be higher priority it will probably preempt; so you end up with "printf() causes a task switch".
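For example, the kernel's pipe (or terminal) write code might end up looking a little like this (purely illustrative - the names and the "single waiting reader" simplification are made up):
Code: Select all
void pipe_write(pipe_info * pipe, const void * data, size_t size) {
    append_to_pipe_buffer(pipe, data, size);    // hypothetical helper
    if(pipe->waiting_reader != NULL) {
        task_info * reader = pipe->waiting_reader;
        pipe->waiting_reader = NULL;
        unblock_task(reader);                   // may preempt the writer immediately;
                                                // this is how "printf() causes a task switch"
    }
}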
Octacone wrote:What I thought I knew but after reading all of this don't seem to know:
*What is a proper way to initialize multitasking? What should my Initialize(); function contain? To just make a fake stack frame (like I'm doing atm) and do an iretd while popping all the necessary stuff.
I like to think of it like this: When the first instruction in the boot loader is executed the OS is already running a task. The problem is that this initial task has none of the meta-data that the kernel wants tasks to have (e.g. where that meta-data might be some kind of "task info block" containing the task's name, how much CPU time the task has used, which virtual address space the task uses, what state the task is currently in, etc; plus maybe an entry in a list of running tasks that the scheduler uses to determine which task gets CPU time next). When the kernel initialises multi-tasking what it's really doing is creating the missing meta-data for a task that already exists.
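In code, "initialising multi-tasking" might be little more than this (a sketch only - the structure fields, constants and "kmalloc()" are assumptions):
Code: Select all
void Initialize_Multitasking(void) {
    // Create the missing meta-data for the task that has been running since boot
    task_info * task = kmalloc(sizeof(task_info));

    task->name = "kernel initial task";
    task->state = TASK_STATE_RUNNING;
    task->priority = PRIORITY_NORMAL;
    task->address_space = kernel_address_space;     // the address space already in CR3
    task->cpu_time_used = 0;

    // No stack or registers need to be set up here - this task is already running on
    // the existing boot stack, and its registers get saved at its first task switch
    current_task = task;
}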
Octacone wrote:*What should I set my initial EBP to? When creating a brand new task.
If you're using a language and calling convention that uses "stack frames", and if you didn't tell the compiler to omit stack frames to improve performance, and if you have some kind of kernel debugger that relies on EBP to find stack frames (and don't have a more modern debugger that doesn't rely on EBP to find stack frames); then I'd be tempted to set EBP to zero so that it's easier for the debugger to know when a "back trace" can't trace back further (to before the task existed). For all other cases EBP is just a normal register and you can do whatever you like with it.
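For example, when building the initial kernel stack for a brand new task you might do something like this (a sketch only; it assumes a software task switch routine that loads the new ESP, pops EBP, EDI, ESI, EBX and then does "ret", and the names are made up):
Code: Select all
uint32_t * stack_top = (uint32_t *)((uint8_t *)new_task->kernel_stack + KERNEL_STACK_SIZE);

*--stack_top = (uint32_t)entry_point;   // consumed by the final "ret" in the switch routine
*--stack_top = 0;                       // initial EBX
*--stack_top = 0;                       // initial ESI
*--stack_top = 0;                       // initial EDI
*--stack_top = 0;                       // initial EBP = 0, so back-traces know where to stop
new_task->esp = (uint32_t)stack_top;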
Octacone wrote:*How am I supposed to save the registers without using multitasking_registers_t* pointer returned by my timer handler? C to Assembly is a one way street? Inline assembly to save them maybe?
I'd write a function in pure assembly (not in inline assembly, where the compiler assumes it has complete control of the stack) to do the low-level task switch; similar to the "goto_task:" example code I showed before. For C code to call your function you'll probably need to tell the compiler that the function is "extern" or something; and you will need to write your assembly to match the calling conventions that your C compiler expects.
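As a rough idea only (32-bit x86, cdecl, no FPU/SSE state, no address space switch; and it assumes the saved ESP is the first field of your "task_info" structure - adjust everything to suit your own kernel), the C side could be just "extern void goto_task(task_info * task);" and the assembly could look like:
Code: Select all
;NASM syntax
global goto_task
extern current_task

goto_task:
    push ebx                  ;Save the callee-saved registers that cdecl requires
    push esi
    push edi
    push ebp

    mov edi,[current_task]    ;edi = address of the old task's "task_info"
    mov [edi],esp             ;Save the old task's ESP (assumed to be the first field)

    mov esi,[esp+20]          ;esi = "task" argument (4 pushes + return address deep)
    mov [current_task],esi    ;The new task is now the current task
    mov esp,[esi]             ;Load the new task's saved ESP

    pop ebp                   ;Restore the new task's callee-saved registers
    pop edi
    pop esi
    pop ebx
    ret                       ;"Return" into the new task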
Octacone wrote:*Should there be a standalone kernel thread? I don't see its purpose, nor how to create it.
You already have one kernel thread (the initial task that has existed since boot); but (assuming it's a monolithic kernel again) eventually you're probably going to want more kernel threads to do various things (e.g. update the DNS cache in the background, figure out which pages to send to swap or prefetch, etc).
Note that there are 2 different ways of handling idle CPUs. For some kernels, if there are no tasks that can be run, the scheduler waits (e.g. does HLT until a task is unblocked) and during this time no task is running at all. For other kernels there's a special "idle task" (that uses the lowest priority possible and never blocks) to make sure that there's always at least one task that the scheduler can give CPU time to.
The latter is easier because you don't need to handle the "no task running" case anywhere; but the former is a little more powerful (especially when you start doing power management and need to do things like change CPU speed).
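If you go with the "idle task" approach, the idle task itself can be trivial (a sketch, assuming a kernel task that runs with interrupts enabled):
Code: Select all
// Runs at the lowest possible priority and never blocks, so the scheduler
// always has at least one task it can give CPU time to
void idle_task(void) {
    for(;;) {
        __asm__ __volatile__("hlt");    // wait for the next IRQ instead of spinning
    }
}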
Octacone wrote:*How is my task supposed to enable IRQs? Within Switch_To_Thread(); function?
No; you'd disable IRQs (before the call) and enable IRQs again (after the call returns) in all the pieces of code that call the "Switch_To_Thread()" or "Schedule()" functions. The "Schedule()" and "Switch_To_Thread()" functions themselves wouldn't touch interrupts at all.
Part of the reason for this is to avoid race conditions. For example, you might have many pieces of code scattered throughout the kernel that do some work and then ask the scheduler to do a task switch, where you do not want an IRQ happening in between the work and the task switch; so you disable IRQs, then do the work, then do the task switch, to make sure an IRQ can't interrupt between the work and the task switch.
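For example (a sketch only - the names and helpers are made up), the kernel's "sleep()" handler might look like:
Code: Select all
void kernel_sleep(uint32_t milliseconds) {
    Hardware::Disable_Interrupts();     // no IRQ can sneak in between the work and the switch

    // The "work": put the task where the timer IRQ handler will find it later
    current_task->wake_time = ticks + milliseconds_to_ticks(milliseconds);  // hypothetical helper
    add_to_sleeping_list(current_task);                                     // hypothetical helper

    block_task(TASK_STATE_SLEEPING);    // switch to some other task

    Hardware::Enable_Interrupts();      // executed when this task is given CPU time again
}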
Octacone wrote:*Again I can't understand how is your idea supposed to work. There is no way to return to Hardware::Enable_Interrupts(); since threads don't return anywhere. One you switch context there is not way to get back to any kernel code (except if an IRQ occurs but that will just take just to Hardware::Disable_Interrupts();) after that point. I know I might be wrong, again I don't understand this at all.
Imagine if "Switch_To_Thread()" did literally nothing, like this:
Code: Select all
void Switch_To_Thread(thread_info * thread) {
    // Do nothing at all
}
And you have a piece of code somewhere that does this:
Code: Select all
Hardware::Disable_Interrupts();
Switch_To_Thread(thread);
Hardware::Enable_Interrupts();
Obviously in this case the "Hardware::Enable_Interrupts();" would be executed, because "Switch_To_Thread(thread);" doesn't do anything at all.
Now, what if your task calls "Switch_To_Thread(thread);" and it did a task switch and then 123 other tasks were given CPU time; but eventually your task gets CPU time again and "Switch_To_Thread(thread);" returns as if nothing happened? This is almost exactly the same as before - the only difference is that (as far as you can tell) "Switch_To_Thread(thread);" took a long time to do nothing. It still returns like any other function would; and "Hardware::Enable_Interrupts();" is still executed.
Octacone wrote:*Circular linked list doesn't have an end, how am I supposed to check for a null pointer in order to determine when to switch to the next lower priority queue?
For a circular linked list, if there's anything on the list you have a pointer to something on the list; and if the list is empty there's nothing to point to, so the pointer you keep for that list is NULL.
This was shown in the example code I posted previously:
Code: Select all
// WARNING: Caller is responsible for making sure IRQs are disabled before calling this!
void Schedule(void) {
    thread_info * thread;

    if(high_priority_queue != NULL) {
        thread = high_priority_queue;
        high_priority_queue = thread->next;
    } else if(medium_priority_queue != NULL) {
        thread = medium_priority_queue;
        medium_priority_queue = thread->next;
    } else if(normal_priority_queue != NULL) {
        thread = normal_priority_queue;
        normal_priority_queue = thread->next;
    } else if(idle_priority_queue != NULL) {
        thread = idle_priority_queue;
        idle_priority_queue = thread->next;
    } else {
        // No tasks are ready to run at all!
        return;
    }
    Switch_To_Thread(thread);
}
Cheers,
Brendan