Page 1 of 2

Help with very persistent scheduler bug

Posted: Fri Apr 28, 2017 1:38 pm
by prasoc
Hello
I have issues with my scheduler. It switches the first few times until it gets back to the start, then the registers get trashed and stop working. I've tried a multitude of different ways to capture the registers and save them, but nothing I try seems to be working, so I need a little help with getting this final issue resolved. I will give a quick run-down of the code,

My scheduler is linked to my PIT, and each timer tick runs the following code:

Code: Select all

static void timer_irq0(struct registers * r) {
    timer_tenths++;

    if(timer_tenths > 100) {
        Scheduler::next(r);
        timer_tenths = 0;
    }
}
This is call from within the "isr_handler" function, based on Brendan's code:

Code: Select all

extern "C" struct registers * isr_handler(registers_t* registers) {
	if ((registers->isr_num >= IRQ0) && (registers->isr_num <= IRQ15)) { // IRQ
		if (registers->isr_num >= IRQ8) { // IRQ originated from the slave PIC
			outportb(PIC2_COMMAND, PIC_EOI);
		}
		outportb(PIC1_COMMAND, PIC_EOI);
	}

	if (isr_handlers[registers->isr_num] != 0) {
		isr_handlers[registers->isr_num](registers);
	}

	return registers;
}
So the registers get pulled through and passed to my isr_handler function. This then calls the corresponding handler (which, for my PIT handler, alters the registers), and then finally returns the registers which then restores the state of the system.

The Scheduler then runs this function:

Code: Select all

void Scheduler::next(registers* r) {
	if(!has_initialised) {
		memcpy(&base_state, r, sizeof(registers));
		has_initialised = true;
		return;
	}

	if(!thread_list[task_idx]->ran) {
		// if this is the first time the thread has been scheduled,
		// we need to initialize it's state register beforehand

		thread_list[task_idx]->state_reg.eflags = base_state.eflags;
		thread_list[task_idx]->state_reg.cs = base_state.cs;
		thread_list[task_idx]->state_reg.ss = base_state.ss;

		thread_list[task_idx]->ran = true;
	}


	int lastThread = task_idx - 1;

	if(task_idx == 0)
		lastThread = thread_list.size()-1;
	
	// save the previous threads state
	if(thread_list[lastThread]->ran) {
		memcpy(&(thread_list[lastThread]->state_reg), r, sizeof(registers));
	}

	// set the registers from the current thread's saved state
	memcpy(r, &(thread_list[task_idx]->state_reg), sizeof(registers));

	task_idx++;

	// Loop around 0 -> 1 -> 2 -> 0 -> ...
	if(task_idx >= thread_list.size()) {
		task_idx = 0;
	}
}
which handles a LOT of the inner details in one place; this is where I have been trying lots of different options. Each task is an element of a std::vector<Thread*>, and has the following schema:

Code: Select all

class Thread {
public:
	size_t thread_id;
	registers state_reg;
	
	uintptr_t entry_ptr;
	
	bool ran;

	Thread() : ran(false) {}
};
as you can see, it's very minimal (I've cut it right down to debug the class). "entry_ptr" is the memory address of the function that is ran upon thread execution.

Finally, to start a thread I run this function:

Code: Select all

void start_thread(void_fn entry) {
	Thread* thread = new Thread();

	thread->thread_id = next_tid;

	thread->entry_ptr = reinterpret_cast<uintptr_t>(entry);

	thread->state_reg.eax = 0;
	thread->state_reg.ebx = 0;
	thread->state_reg.ecx = 0;
	thread->state_reg.edx = 0;

	thread->state_reg.esi = 0;
	thread->state_reg.edi = 0;

	thread_list.push_back(thread);

	next_tid++;
}
I have been banging my head against the wall trying to figure this out, it *almost* works, but only runs through ONCE!

Re: Help with very persistent scheduler bug

Posted: Fri Apr 28, 2017 1:59 pm
by sleephacker

Code: Select all

   // save the previous threads state
   if(thread_list[lastThread]->ran) {
      memcpy(&(thread_list[lastThread]->state_reg), r, sizeof(registers));
   }
When the first ever thread is created, is Thread.ran set to true? Because if not, it's registers won't be saved when the thread after that is scheduled

Re: Help with very persistent scheduler bug

Posted: Fri Apr 28, 2017 3:26 pm
by prasoc
sleephacker wrote:

Code: Select all

   // save the previous threads state
   if(thread_list[lastThread]->ran) {
      memcpy(&(thread_list[lastThread]->state_reg), r, sizeof(registers));
   }
When the first ever thread is created, is Thread.ran set to true? Because if not, it's registers won't be saved when the thread after that is scheduled
Not at the beginning. In the constructor of each "Thread" object, "ran" (a member variable) is set to false, which is set to true once the thread runs at least once, this code runs inside the "scheduler::next" function to initialise the state:

Code: Select all

	if(!thread_list[task_idx]->ran) {
		// if this is the first time the thread has been scheduled,
		// we need to initialize it's state register beforehand
		memcpy(&(thread_list[task_idx]->state_reg), &base_state, sizeof(registers));

		thread_list[task_idx]->state_reg.eflags = base_state.eflags;
		thread_list[task_idx]->state_reg.cs = base_state.cs;
		thread_list[task_idx]->state_reg.ss = base_state.ss;

		thread_list[task_idx]->state_reg.eip = (uint32_t)thread_list[task_idx]->entry_ptr;
		thread_list[task_idx]->ran = true;
	}
this is done because I have to grab certain variables that have to be initialised run-time (eflags, cs, ss)

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 12:55 am
by Brendan
Hi,
prasoc wrote:I have issues with my scheduler.
Let's talk about task switches for a little while.

Initially you can get by with just saving/restoring general purpose registers and nothing else, and it seems small. When you add support for user-space you'd going to have to check if CR3 needs to be changed and change it. When you add support for MMX/FPU/SSE/AVX you'll have to worry about saving and loading all that state (possibly using a more complex "lazy state saving" thing). When you want to keep track of how much CPU time each thread or process has used you're going to have to do some kind of "elapsed = current_time - last_time;" thing in task switches. When you add support for debugging you'll need to save/restore the debug registers. If you add support for "per task performance monitoring" you'll need to reconfigure the CPU's performance monitoring stuff during task switches too. When you add support for thread local storage you'll have to diddle with a segment register (e.g. change "FS base"). If/when you add support for multi-CPU you'll probably need an actual lock so that a different CPU doesn't mess everything up while you're in the middle of task switching. Of course for general purpose registers, for C (and C++) calling conventions the compiler guarantees that certain registers will be preserved, and this means code that does a task switch only needs to save/restore the registers that calling convention guarantees will be preserved (in addition to all the other stuff).

Now let's talk about IRQs. For an OS under load you might (eventually) have network card/s, disk controller/s, USB controller/s, etc all going flat out all producing thousands of IRQs per second. The IRQ handlers need to be fast. For C (and C++) calling conventions the compiler guarantees that certain registers will be preserved, and this means that the IRQ handler's "assembly stub" does not need to save/restore registers that calling convention guarantees will be preserved.

If you compare IRQs and task switches, you'll find that there almost nothing an IRQ handler saves/restores that needs to be saved/restored during task switches, and for the large amount of things that a task switch will do eventually none of it makes sense for an IRQ handler.

Now let's talk about schedulers. Almost all task switches happen because a kernel API function was called. It's things like "read()" and "write()" and "getMessage()" and "sendMessage()" and "sleep()"; where the task has to block until something happens and the scheduler has to find some other task to switch to because the current task got blocked. For example, maybe one task reads some bytes from a pipe (causing it to block because there's no bytes to read yet, causing a task switch), then another task writes some bytes to "stdout" (causing the first task to be unblocked, and causing another task) - for this example, there isn't any IRQs involved at all. For a typical OS under normal load, for the majority of task switches there are no IRQs involved.

Essentially; you're going to end up with IRQ handlers (that need to be fast) that are bogged down by a large pile of completely unnecessary bloat for task switching for no sane reason at all. The higher level (C or C++) code for an IRQ handler has no reason to want any of the registers; and they can and should be like "static void timer_irq0(void)".

Now let's talk about debugging schedulers. They seem simple at first, but the task switch code itself gets complex (as mentioned above), and the code that chooses which task to switch to gets complex too (multiple scheduling policies, task priorities, preemption rules, multi-CPU support and locking and "per CPU" queues/data, etc; followed by power management when there's nothing for a CPU to do); but it's worse than that. It's worse because there's lots of places for race conditions throughout all of the scheduler's code, and this means you can expect "heisenbugs" - bugs that only happen under extremely rare conditions, bugs that disappear as soon as you try to debug them, etc. Complex code (that goes beyond the compiler and debugger's capabilities) combined with a high risk of race conditions is a recipe for months of pain. When something goes wrong you need to be able split the complexity into easily tested pieces - you do not want a big mangled mess of "IRQs plus choosing task plus doing task switch" where you can't (e.g.) test the "do task switch" code in isolation.

Now let's talk about optimisation. A lot of optimisation involves avoiding unnecessary work. For example, rather than doing a bunch of work to figure out why your current scheduler is buggy (and then deleting the scheduler because it's bad for all of the reasons above), you could optimise your time by not bothering to figure out why your current scheduler is buggy. ;)


Cheers,

Brendan

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 1:46 am
by evoex
First of all, Brendan: while I completely agree with what you wrote on a technical level, I don't think he should change anything he's doing right now. While I can't speak for the OP, personally I do it similar to him. Reason is that my first priorities in an OS are making a PoC and writing readable code. If I get stuck on the details like optimization I will never progress and never finish. Also, optimization only becomes important when you get more implemented than just a small OS will often contain.
So I would say; keep going the way you are and gradually improve your OS into something you are happy with. That may be a tiny OS, that may be a lot more.

Anyway, back to the issue, I'm not completely sure about what could cause the bug as you describe it, but I can see something that would cause a similar bug. While executing a thread you don't keep track of the thread you are actually executing, but the one past that, and then assume it's the previous thread that is being executed (i.e. lastThread). However, when the thread index is "0" then lastThread is the last thread in the list. If you somehow created a new thread in the meantime, then you will write to that thread instead of the one that's actually executing.
I suggest you keep track of the *current* thread and only advance the index if you start executing the next one.

This could trash registers of some threads, but not all of them, so it doesn't quite match your bug description.

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 2:11 am
by Brendan
Hi,
evoex wrote:First of all, Brendan: while I completely agree with what you wrote on a technical level, I don't think he should change anything he's doing right now. While I can't speak for the OP, personally I do it similar to him. Reason is that my first priorities in an OS are making a PoC and writing readable code. If I get stuck on the details like optimization I will never progress and never finish. Also, optimization only becomes important when you get more implemented than just a small OS will often contain.
So I would say; keep going the way you are and gradually improve your OS into something you are happy with. That may be a tiny OS, that may be a lot more.
You misunderstand me. It's not about optimisation and there is nothing wrong with "simple". The issue is more like "simple (and well designed and easy to extend or improve later)" vs. "simple (and poorly designed and hard to extend or improve later)".

It's a little more than that though - it's also about breaking the "Let's build a scheduler directly into a timer IRQ handler" nonsense (which is about as bad as "Let's build a shell directly into a keyboard IRQ handler" or "Let's build a web browser directly into a network card IRQ handler").


Cheers,

Brendan

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 5:30 am
by prasoc
great replies, a lot of food for thought here. I understand that my implementation is shockingly bad at the moment, but I am learning as I'm going. This means that the code WILL be scrapped in the future (eg. when I code myself into a box) but then I will start anew and code it better. this "OS" is being programmed to understand the ideas and concepts - maybe in the future I can get something that works better :)

In the future when I will have to deal with lots of interrupts, then I can scrap my current implementation and trim the functions down but I am currently at step 0 (trying to get a very basic, limited version working). I don't think I will be able to optimise until I can grasp this, unfortunately.

it's very frustrating for me, I'm *sure* that the ideas that go into my scheduler are correct, but it only partially works :/
Anyway, back to the issue, I'm not completely sure about what could cause the bug as you describe it, but I can see something that would cause a similar bug. While executing a thread you don't keep track of the thread you are actually executing, but the one past that, and then assume it's the previous thread that is being executed (i.e. lastThread). However, when the thread index is "0" then lastThread is the last thread in the list. If you somehow created a new thread in the meantime, then you will write to that thread instead of the one that's actually executing.
I suggest you keep track of the *current* thread and only advance the index if you start executing the next one.
"If you somehow created a new thread in the meantime" currently, 3 threads are loaded at run-time (before the scheduler starts functioning), this is something I will have to worry about very soon. Practically here, I will try and change my Thread class to be a linked list and see if that has any more success...

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 7:17 am
by prasoc
OK so I trimmed my scheduler function down to this:

Code: Select all

void Scheduler::next(registers * r) {
	if(!thread_running->ran) {
		// if this is the first time the thread has been scheduled,
		// we need to initialize it's state register beforehand

		thread_running->state_reg.eflags = r->eflags;
		thread_running->state_reg.esp = r->esp;
		thread_running->state_reg.isr_num = 0;

		thread_running->state_reg.eip = (uint32_t)thread_running->entry_ptr;
	
		thread_running->ran = true;
	} 
	else {
		// save the previous threads state
		memcpy(&(thread_running->state_reg), r, sizeof(registers));
	}

	thread_running = thread_running->next;

	if(thread_running->ran) {
		bcprintf("\nthread: %s (%d)", thread_running->title, thread_running->thread_id);

		// set the registers from the current thread's saved state
		memcpy(r, &(thread_running->state_reg), sizeof(registers));	
	}
}
which is (hopefully) much easier to grok. The "thread_running" is just a pointer to Thread, and each thread has "Thread* prev; Thread* next;" making it a linked list - this simplifies the function a LOT!

The issue remains :( but at least there's less code that could go wrong now. Also, for further clarification here: the function "bcprintf" writes to the Bochs console

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 7:52 am
by evoex
@Brendan: Fair enough; and yes, the code can be improved on.

@prasoc:
Now you're initializing the thread registers only after it has run. That is, "!thread_running->ran" is wrong at that point as it ran, it being the thread that ran last.
Assuming that's not it (and I don't think so), I don't think the bug is in the code you're showing us. I may be missing something, but I'm generally quite good at spotting issues, and I didn't spot it so far.

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 7:57 am
by sleephacker
prasoc wrote:

Code: Select all

	if(!thread_running->ran) {
		// if this is the first time the thread has been scheduled,
		// we need to initialize it's state register beforehand

		thread_running->state_reg.eflags = r->eflags;
		thread_running->state_reg.esp = r->esp;
		thread_running->state_reg.isr_num = 0;

		thread_running->state_reg.eip = (uint32_t)thread_running->entry_ptr;
	
		thread_running->ran = true;
	} 
	else {
		// save the previous threads state
		memcpy(&(thread_running->state_reg), r, sizeof(registers));
	}
Where does 'r->esp' come from? I assume you copy it from the registers of whatever code was interrupted (because that's what 'r' is supposed to be if I understand it all correctly), which is fine for segment registers and eflags (not the most elegant way to initialise those, but it should work), but not for esp. If all the threads are initialised with the same (or very close) stack pointer, then they will all run fine the first time because they'll just overwrite whatever is stored below esp. But then when they are run again and try to pop what was previously there (which also happens when returning from an interrupt) their stack has been overwritten.

In addition to what @evoex said I'd reccomend just initialising all registers (including segments/flags/eip/esp) in your Thread constructor, makes it easier to find, and also removes time consuming if-statements from your scheduler function.

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 11:23 am
by prasoc
Hmm OK sleephacker, I've cleaned up the initialisation routine, so none of it is done in the ::next() function, and you're correct about the stack pointer being overwritten - I've tried to initialise the thread stack within the constructor of the thread, and then use that memory address (+ the stack size) as the esp pointer. It actually works a little better now, it runs roughly twice with no crashes! I'm getting there :)

One thing is that my stack pointer isn't being updated properly, maybe this is the last thing that needs fixing? It keeps overwriting my pointer! This is what is causing my issue, it seems.

Here is the scheduler code (now cut down immensely)

Code: Select all

void Scheduler::next(registers * r) {
	// save the previous threads state
	if(thread_running->ran) {
		memcpy(&(thread_running->state_reg), r, sizeof(registers));
	}

	// Lets move on with task switching!
	thread_running->ran = true;
	thread_running = thread_running->next;

	bcprintf("\nthread: %s (%d)", thread_running->title, thread_running->thread_id);
	// set the registers from the current thread's saved state
	memcpy(r, &(thread_running->state_reg), sizeof(registers));	
	
}
here is the output from creating the threads:

Code: Select all

thread stack pointer: 0xE0001168
thread stack pointer: 0xE00052F7
thread stack pointer: 0xE0009486
but then they get overwritten when the threads run!

Code: Select all

Old stack pointer (saved): 0xE00052F7
New stack pointer (saved): 0xC010ED3C
For some reason, the stack pointer doesn't "take" when changed

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 11:56 am
by Brendan
Hi,
prasoc wrote:

Code: Select all

// set the registers from the current thread's saved state
	memcpy(r, &(thread_running->state_reg), sizeof(registers));
This would overwrite the currently running thread's "old state" (from when the IRQ occurred) with the "old state" for task you're switching to; which means that the current thread's "old state" is now trashed.

Note that one day you're going to want to do something vaguely like:

Code: Select all

    ask_VFS_to_read_from_file();
    block_thread(reason);                // Wait for VFS and file system and whatever to read
                                         //  the file and unblock this thread
    status = process_results_from_VFS();
    return status;                       // Tell the thread that called this kernel API function what happened
For this (or anything vaguely similar) to actually work the task switching code has save and restore the thread's "current state" (the values in registers now) and not save/restore the thread's "old state" (that is obsolete by the time the task switch happens).


Cheers,

Brendan

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 12:22 pm
by prasoc
This would overwrite the currently running thread's "old state" (from when the IRQ occurred) with the "old state" for task you're switching to; which means that the current thread's "old state" is now trashed.
A few lines above I save the currently running threads old state, like this:

Code: Select all

memcpy(&(thread_running->state_reg), r, sizeof(registers));
(at this time, thread_running corresponds to the "old" thread before thread->next is called)

Is there an issue with those 2 memcpy lines? do they not work how I think they are working?

edit:
I've added the following line at the end of the scheduler function,

Code: Select all

set_stack_ptr(r->esp);
to "force" esp to take the right value. Here is the ASM (nasm syntax):

Code: Select all

global set_stack_ptr
set_stack_ptr:
	mov eax, [esp + 4]
	mov ebx, esp
	mov esp, eax

	mov ecx, 16
loop_stack:
	mov edx, [ebx + 4*ecx]
	push edx
	dec ecx
loop_bottom:
	cmp ecx, 1
	jne loop_stack

	ret
it also copies the "old" stack into the new thread. Does the idea of what I'm trying to achieve make sense? I'm still learning :)

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 1:00 pm
by Brendan
Hi,
prasoc wrote:
This would overwrite the currently running thread's "old state" (from when the IRQ occurred) with the "old state" for task you're switching to; which means that the current thread's "old state" is now trashed.
A few lines above I save the currently running threads old state, like this:

Code: Select all

memcpy(&(thread_running->state_reg), r, sizeof(registers));
(at this time, thread_running corresponds to the "old" thread before thread->next is called)

Is there an issue with those 2 memcpy lines? do they not work how I think they are working?
You're right - it's far more hideous than I first imagined.

Let's walk through it; assuming that a newly spawned task just finished its first timeslice:

Code: Select all

   // save the previous threads state
   if(thread_running->ran) {
      memcpy(&(thread_running->state_reg), r, sizeof(registers));
   }
Here, "if(thread_running->ran)" is false because it's the thread first time slice, so the thread's old state is never saved.

Code: Select all

   // Lets move on with task switching!
   thread_running->ran = true;
   thread_running = thread_running->next;
At this point the CPU is using the previous thread's stack, but "thread_running" points to a completely unrelated next thread(!).

Code: Select all

   bcprintf("\nthread: %s (%d)", thread_running->title, thread_running->thread_id);
   // set the registers from the current thread's saved state
   memcpy(r, &(thread_running->state_reg), sizeof(registers)); 
This copies the next thread's state to the previous thread's kernel stack(!); and trashes the previous thread's state that wasn't saved above (because it was the thread's first time slice).

After this, the function returns to its caller (which probably does more stuff, then returns to its caller which probably does more stuff, then returns to its caller that does more stuff, then ..... until we finally reach the original IRET). Any/all code between the end of the task switch code and finally reaching the original IRET runs with previous thread's stack but "thread_running" pointing to a completely unrelated next thread(!). This means that it's impossible to do much on the return path, impossible to support any kind of nested IRQs, impossible to do 5 task switches (triggered by 5 different IRQs) while in kernel space before returning to user-space, etc.


Cheers,

Brendan

Re: Help with very persistent scheduler bug

Posted: Sat Apr 29, 2017 3:28 pm
by prasoc
Well the idea of the "run" variable in each thread is that when the scheduler starts (loops around the first time), it doesn't actually SAVE the registers until it's ran at least once - otherwise you would overwrite the thread's state with nonsense.

I took the "save previous state" lines out so that each time the thread gets switched to, it starts the thread again from the original state_reg, and it works! Unfortunately I get a stack overflow pretty quickly, but it seems that my saving routine doesn't work as expected