Problems with execution rings, scheduling & multitasking
Posted: Fri Dec 29, 2023 6:09 am
Greetings! My multitasking and scheduling system is sort of broken...
Way it works:
- Simple round-robin scheduler (no priorities): it picks the next task from an array if that task is deemed ready to run.
- A task runs either in kernel mode or in userland, controlled by the kernel_task boolean inside the Task structure. That struct also holds the page directory, stack pointer, etc.
- Switching occurs when an IRQ0 hits. On switching, it simply switches stack pointers and page directories.
- (since I have my testing shell registered as a kernel task) tasks[0] is always controlled by the kernel directly... Getting rid of this didn't help resolve the issues though.
It's not completely broken... Lemme explain:
- Kernel tasks: If all tasks are kernel tasks, switching between them endlessly works fine and exactly as expected. Note that they all have different page directories that do have user-modifiable flags.
- User tasks: Remember that the first task is always a kernel task. As soon as at least one user task is added, things sort of break. The switches go as follows:
-- From 0 - 0: no issues
-- user task 1 gets added
-- From 0 - 1: no issues
-- From 1 - 0: no issues
-- From 0 - 1: HERE, qemu crashes. It shows a page fault at an address like 0xc010783d, which objdump places inside switch_context (in task.asm), specifically while popping some miscellaneous registers. After some digging, I found the cause: the kernel stack pointer (ESP) had been set to a garbage value, 0xffffff50. I also noticed that ESP takes on 0xffffff50 immediately after the first 0 - 1 context switch, so this is probably not caused by the scheduling itself.
I honestly have no idea why this weird behavior takes place, considering kernel tasks work absolutely fine with no problems whatsoever. I've tried a lot of stuff, but it would be nice if someone more experienced could perhaps help... Thanks!
(If anyone wants to actually test the code: to activate the userspace ring, set create_task's third argument to true in elf.c before compiling.)
Images (posted to imgur because this forum's uploading was buggy for me): https://imgur.com/a/shXrGWQ
task.asm: https://github.com/malwarepad/cavOS/blo ... g/task.asm
schedule.c: https://github.com/malwarepad/cavOS/blo ... schedule.c