Context not switching (only running one process)
-
- Posts: 6
- Joined: Mon Jan 13, 2025 9:56 am
Context not switching (only running one process)
Hi y'all, I've been working on an OS in C++ for the last few months. I'm working on multi-tasking right now and am having trouble successfully switching context to another process and having all processes run simultaneously. The result is Process 1 runs infinitely, and Process 2 is never switched-to.
Two things I'm running into:
1. Only one task is running at a time.
2. Interrupts (like the keyboard) aren't working, which may explain why the timer interrupt isn't firing either.
As far as I understand it, I am initializing a process with a function pointer to the 'task' and allocating a stack for that task to use. When switching contexts, the current CPU registers and segments are saved into the current process and the state of the next process is loaded into those same registers. Then at the end you jump to the instruction pointer (eip) location.
https://gist.github.com/thomascswalker/ ... 544583962f
The meat of this is in scheduling.cpp, the add and schedule functions, and in scheduling.asm.
Thanks for any help!
-
- Member
- Posts: 5754
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Context not switching (only running one process)
thomaswalker wrote: ↑Sun Mar 16, 2025 5:12 pm
"When switching contexts, the current CPU registers and segments are saved into the current process and the state of the next process is loaded into those same registers."
Context switching only happens inside your kernel, and your kernel always uses the same segments, so you shouldn't waste time saving or restoring them. User context is saved when an interrupt occurs and restored when returning from that interrupt, not when switching tasks.
You also aren't saving the current CPU state correctly; one of the first things you do in your switchContext function is overwrite registers that must be preserved.
thomaswalker wrote: ↑Sun Mar 16, 2025 5:12 pm
"Then at the end you jump to the instruction pointer (eip) location."
No, at the end you return. When you initialize a new task context, you pre-fill the new stack with whatever data is necessary for your context switch to return to the correct location the first time the new task runs. A context switch involves switching stacks, and the return address is on the stack.
The wiki has a pretty good guide on multitasking.
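To make the "pre-fill the new stack" idea concrete, here is a minimal sketch in C++. It assumes a switch routine that ends with `pop ebp; pop edi; pop esi; pop ebx; ret` (the order used later in this thread), and the helper name `prepareInitialStack` is hypothetical. The stack is modelled as an array of 32-bit words so the layout can be checked on any host.

```cpp
#include <cstdint>

// Hypothetical helper: builds the first stack frame for a task that has never run.
// Assumed switch epilogue: pop ebp; pop edi; pop esi; pop ebx; ret
// `stackTop` points one past the highest usable 32-bit slot of the new stack.
inline std::uint32_t* prepareInitialStack(std::uint32_t* stackTop, std::uint32_t entry)
{
    std::uint32_t* sp = stackTop;
    *--sp = entry; // return address consumed by the final `ret`
    *--sp = 0;     // initial EBX
    *--sp = 0;     // initial ESI
    *--sp = 0;     // initial EDI
    *--sp = 0;     // initial EBP (popped first, so it sits lowest)
    return sp;     // value to store as the task's saved stack pointer
}
```

The returned pointer is what the switch routine would load into ESP the first time the task runs. Nothing sits above the entry address in this sketch, so the task function must never return.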
-
- Posts: 6
- Joined: Mon Jan 13, 2025 9:56 am
Re: Context not switching (only running one process)
Thanks for the direction! I did go through this at one point and didn't have any luck, perhaps I didn't implement it correctly. I trimmed down my existing ASM code:
Code: Select all
section .text
global switchContext
switchContext:
; Move the pointers to prev and next from the stack into
; registers edi and esi, respectively.
mov edi, [esp + 4] ; EDI ← pointer to CPUState "prev"
mov esi, [esp + 8] ; ESI ← pointer to CPUState "next"
; Save the current EBP and ESP into the prev CPUState.
mov [edi + 24], ebp ; prev->ebp <- EBP
mov [edi + 28], esp ; prev->esp <- ESP
; Load the next EBP and ESP from the next CPUState.
mov ebp, [esi + 24] ; EBP <- next->ebp
mov esp, [esi + 28] ; ESP <- next->esp
; Restore the flags from the next CPUState.
push dword [esi + 64] ; Push flags onto the stack
popfd ; Pop flags into EFLAGS
push dword [esi + 56] ; Push next->eip onto the stack
sti ; Enable interrupts
ret ; Return to the next instruction
Same problem. I know that I'm moving `prev` and `next` into edi/esi, are those registers you shouldn't overwrite? My understanding is they're just general purpose registers which, in this case, don't need to be preserved.
The tutorial you linked only passed in one argument and had the other state as a global variable. Would you recommend that method?
-
- Posts: 6
- Joined: Mon Jan 13, 2025 9:56 am
Re: Context not switching (only running one process)
For example, in Brendan's tutorial in step 1, where are current_task_TCB, TCB, and TSS defined? Are they defined in C as:
Code: Select all
extern thread_control_block* current_task_TCB;
extern thread_control_block* TCB;
extern ??? TSS;
The examples are great but there seems to be little context for some of the variables/structs outside of just "define this" which then makes it confusing.
-
- Member
- Posts: 5754
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Context not switching (only running one process)
thomaswalker wrote: ↑Mon Mar 17, 2025 8:31 am
"I know that I'm moving `prev` and `next` into edi/esi, are those registers you shouldn't overwrite? My understanding is they're just general purpose registers which, in this case, don't need to be preserved."
Your compiler follows the System V i386 psABI, which states that EBX, ESP, EBP, ESI, and EDI must be preserved across function calls.
thomaswalker wrote: ↑Mon Mar 17, 2025 8:31 am
"The tutorial you linked only passed in one argument and had the other state as a global variable. Would you recommend that method?"
No. The example code uses global variables because that makes the example assembly code easier to read. In a real OS, you wouldn't use global variables for this because it would make multiprocessing impossible.
Personally, I'd simplify the stack-switch function to just switching stacks. That way it needs two arguments: the new stack pointer and a place to store the old stack pointer. The caller can deal with everything else.
Speaking of which, the caller needs to set up everything for the new task before it calls the stack-switch function. Perhaps this line of code needs to move.
thomaswalker wrote: ↑Mon Mar 17, 2025 9:00 am
"For example, in Brendan's tutorial in step 1, where are current_task_TCB, TCB, and TSS defined?"
They're not defined anywhere because the example code is only illustrative, but if the example code were part of a functional OS, "current_task_TCB" and "TSS" could indeed be defined in C.
"TCB" isn't a variable, it's the assembly equivalent of the thread_control_block data type. The example code uses it to reference thread_control_block members by name.
thomaswalker wrote: ↑Mon Mar 17, 2025 9:00 am
"The examples are great but there seems to be little context for some of the variables/structs outside of just "define this" which then makes it confusing."
The tutorial assumes you're already pretty familiar with OS concepts and low-level development and just need help bridging the gap between the two. If you've been learning as you go, that might be why you feel like there's no context.
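As an illustration of what "the assembly equivalent of the data type" means in practice, here is a hedged C++ sketch of a struct that mirrors an assembly `struc` field for field. The field names and offsets are taken from the PCB layout posted further down this thread, not from Brendan's tutorial, and the `next` pointer is stored as a 32-bit integer only so that the offsets can be checked on a 64-bit host.

```cpp
#include <cstddef>
#include <cstdint>

// C++ mirror of an assembly `struc`: field order and sizes must match exactly,
// because the assembly code addresses members by fixed byte offsets.
struct PCB
{
    std::uint32_t pid;            // offset 0
    std::uint32_t state;          // offset 4
    std::uint32_t programCounter; // offset 8
    std::uint32_t stackPointer;   // offset 12
    std::uint32_t stackTop;       // offset 16
    std::uint32_t flags;          // offset 20
    std::uint32_t next;           // offset 24 (a 32-bit pointer on the target)
};

// Compile-time checks that the C++ layout agrees with the assembly offsets.
static_assert(offsetof(PCB, stackPointer) == 12, "must match PCB.STACK_POINTER");
static_assert(offsetof(PCB, next) == 24, "must match PCB.NEXT");
static_assert(sizeof(PCB) == 28, "seven 32-bit fields");
```

Checks like these turn a silent mismatch between the C++ and assembly definitions into a build error when someone reorders or resizes a field.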
-
- Member
- Posts: 829
- Joined: Fri Aug 26, 2016 1:41 pm
- Libera.chat IRC: mpetch
Re: Context not switching (only running one process)
Octo's done a great job of explaining. I had never used Brendan's Wiki task switching code (or even looked at the page previously) until someone was trying to get it working last year. While I would move anything not stack switching related out of the assembly code (to make things easier) I chose to try a literal interpretation of the Wiki explanation. The result was some code that you may find helpful as a *starting point*. The project I helped get basic task switching working for was IlanVinograd's OS_32bit found here: https://github.com/IlanVinograd/OS_32Bit . What you may wish to look at is the task switching assembly file https://github.com/IlanVinograd/OS_32Bi ... tasksw.asm which has additional comments and structures used. This code doesn't change CR3 but you can easily migrate that from the tutorial into this example.
The developer of this OS uses PCB (Process Control Blocks) but you may be inclined to use TCB (Thread Control Block) which IMHO makes more sense given that processes usually have one or more threads and is more in line with Brendan's tutorial.
You will also want to look at the process control block PCB header file here: https://github.com/IlanVinograd/OS_32Bi ... udes/PCB.h and if you are going to be task switching user mode code you will also want to look at https://github.com/IlanVinograd/OS_32Bi ... /gdt.h#L24 . There is a very basic scheduler in https://github.com/IlanVinograd/OS_32Bi ... cheduler.c . Code in the kernel main to set things up can be found here: https://github.com/IlanVinograd/OS_32Bi ... rnel.c#L31 as well as basic PCB functions here: https://github.com/IlanVinograd/OS_32Bi ... rces/PCB.c .
Note the code in question has evolved since I first helped out but the general ideas and how they relate to Brendan's Wiki are similar. There may be bugs I'm not aware of. Showing a concrete example someone is using might provide better insights for your own understanding of Brendan's task switching Wiki so you can develop your own.
Of special note is something Octo mentioned which is very important:
"No. The example code uses global variables because that makes the example assembly code easier to read. In a real OS, you wouldn't use global variables for this because it would make multiprocessing impossible."
The code above is a starting point for a single-processor OS but is far from ideal or usable when moving to SMP (multiple processor support).
-
- Posts: 6
- Joined: Mon Jan 13, 2025 9:56 am
Re: Context not switching (only running one process)
Thank you both for your guidance! I did try to follow IlanVinograd's OS and those links you provided Michael, but am still running into issues. I tried to visualize what the stack looked like and how everything works... Does this look accurate?
Code: Select all
extern current
global switchTask
struc PCB
.PID resd 1 ; Offset: 0
.STATE resd 1 ; Offset: 4
.PROGRAM_COUNTER resd 1 ; Offset: 8
.STACK_POINTER resd 1 ; Offset: 12
.STACK_TOP resd 1 ; Offset: 16
.FLAGS resd 1 ; Offset: 20
.NEXT resd 1 ; Offset: 24
endstruc
section .text
; void switchTask(Task* next)
switchTask:
; __________Stack___________
; |------------------------|
; | Return Address (EIP) | <- Stack top [0]
; |------------------------| ESP
; | Task* next | +4
; |------------------------|
; Push volatile registers (per the 32-bit System V ABI)
push ebx
push esi
push edi
push ebp
; __________Stack___________
; |------------------------|
; | EBP | -16 <- ESP
; |------------------------|
; | EDI | -12
; |------------------------|
; | ESI | -8
; |------------------------|
; | EBX | -4
; |------------------------|
; | Return Address (EIP) | <- Stack top [0]
; |------------------------|
; | Task* next | +4
; |------------------------|
; Save current task's state
mov edi, [current]
; EDI now contains the pointer to [Task* current]
; |-----|
; | EDI | <== [Task* current]
; |-----|
; Set [current->PCB.STACK_POINTER] to the value in ESP
; Task* current->PCB.STACK_POINTER <== ESP
mov [edi + PCB.STACK_POINTER], esp
; Load next task's state
mov esi, [esp + 4]
; ESI now contains the pointer to [Task* next]
; __________Stack___________
; |------------------------|
; | Return Address (EIP) | 0
; |-----| |------------------------|
; | ESI | <== | Task* next | +4
; |-----| |------------------------|
; Set [EXTERN Task* current] to the value in ESI
; Task* current <== ESI (Task* next)
mov [current], esi
mov esp, [esi + PCB.STACK_TOP]
; |-----| |-----|
; | ESP | <== | ESI |
; |-----| |-----|
; Restore non-volatile registers (per the 32-bit System V ABI)
pop ebp
pop edi
pop esi
pop ebx
; __________Stack___________
; |------------------------|
; | Return Address (EIP) | <- Stack top [0]
; |------------------------|
ret
-
- Member
- Posts: 829
- Joined: Fri Aug 26, 2016 1:41 pm
- Libera.chat IRC: mpetch
Re: Context not switching (only running one process)
I didn't look thoroughly but you seem to have a bit of a misunderstanding as to what happens to ESP when you push things on the stack. On each `push` ESP is decremented by 4. When `switchTask` is entered the ESP does point at the return address and ESP+4 points at the next task pointer, but each push decrements ESP by 4. By the time you get to Loading next task's pointer in ESI the `*next` is no longer at ESP+4, it is at ESP+20. I could have made that more clear in Vinograd's code comments when I did this `mov esi,[esp+(4+1)*4]` . The 4 represents the 4 pushes, the 1 is the return address so (4+1)*4=20 is the offset from ESP to the first parameter on the stack.
thomaswalker wrote: ↑Wed Mar 19, 2025 11:42 am
Code: Select all
; void switchTask(Task* next)
switchTask:
; At this point ESP+0 points at the return address and ESP+4 points at `Task *next`
; Push volatile registers (per the 32-bit System V ABI)
push ebx
; At this point ESP+4 points at the return address and ESP+8 points at `Task *next`
push esi
; At this point ESP+8 points at the return address and ESP+12 points at `Task *next`
push edi
; At this point ESP+12 points at the return address and ESP+16 points at `Task *next`
push ebp
; At this point ESP+16 points at the return address and ESP+20 points at `Task *next`
; Save current task's state
mov edi, [current]
; Set [current->PCB.STACK_POINTER] to the value in ESP
; Task* current->PCB.STACK_POINTER <== ESP
mov [edi + PCB.STACK_POINTER], esp
; Load next task's state
mov esi, [esp + 20] ;<-------------------- Change ESP+4 to ESP+20
[snip]
You will see I have commented that you should be using `mov esi, [esp + 20]` instead of `mov esi, [esp + 4]`.
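The offset arithmetic described here can be written down as a small compile-time helper. This is only a sketch: the function name `argOffsetFromEsp` is hypothetical, and it assumes 32-bit cdecl-style stack arguments with every push being 4 bytes.

```cpp
#include <cstdint>

// Byte offset from ESP to a stack argument inside a 32-bit function,
// after `pushes` 4-byte registers have been pushed since entry.
// argIndex 0 is the first argument; the +1 accounts for the return address.
constexpr std::uint32_t argOffsetFromEsp(std::uint32_t pushes, std::uint32_t argIndex)
{
    return (pushes + 1 + argIndex) * 4;
}

static_assert(argOffsetFromEsp(0, 0) == 4, "on entry the first argument is at ESP+4");
static_assert(argOffsetFromEsp(4, 0) == 20, "after four pushes it is at ESP+20");
```

With four pushes and one return address, this gives (4+1)*4 = 20, the same value as the `[esp+(4+1)*4]` form mentioned above.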
-
- Member
- Posts: 829
- Joined: Fri Aug 26, 2016 1:41 pm
- Libera.chat IRC: mpetch
Re: Context not switching (only running one process)
Upon having some extra time to look at this you have:
Code: Select all
[snip]
; Set [current->PCB.STACK_POINTER] to the value in ESP
; Task* current->PCB.STACK_POINTER <== ESP
mov [edi + PCB.STACK_POINTER], esp
; Load next task's state
mov esi, [esp + 4] ; <---------- Needs to be ESP+20
; ESI now contains the pointer to [Task* next]
; Set [EXTERN Task* current] to the value in ESI
; Task* current <== ESI (Task* next)
mov [current], esi
mov esp, [esi + PCB.STACK_TOP] ; <--------- should be PCB.STACK_POINTER??
It is unclear what you will be using PCB.STACK_TOP for but that seems wrong for this code. I believe you want `mov esp, [esi + PCB.STACK_TOP]` to be `mov esp, [esi + PCB.STACK_POINTER]` just as you did with `mov [edi + PCB.STACK_POINTER], esp` earlier.
ESP always points to the current top of the stack.
-
- Posts: 6
- Joined: Mon Jan 13, 2025 9:56 am
Re: Context not switching (only running one process)
Thank you so much! I finally got this to work. Something to note is that I was calling 'schedule()' from my Timer callback, which seemed to be messing up the actual scheduling. Now I'm only calling `schedule()` from the `while` loop in each task, as well as the infinite `while` loop at the end of my `main` function.
Code: Select all
void process1()
{
while (1)
{
printf("Process 1 is running: %d\n", i++);
System::schedule();
}
}
void process2()
{
while (1)
{
printf("Process 2 is running: %d\n", j++);
System::schedule();
}
}
extern void kmain(...)
{
...
while (1)
{
System::schedule();
}
}
-
- Member
- Posts: 829
- Joined: Fri Aug 26, 2016 1:41 pm
- Libera.chat IRC: mpetch
Re: Context not switching (only running one process)
You should be able to call schedule() from within the interrupt handler, so something must not be quite right. By calling schedule() only from within tasks, you have made a cooperative multitasking system (not a pre-emptive one).
If you created a Github repo with all your code/build files/Makefiles etc I could take a look. One thing that can go wrong in an interrupt handler is if you don't do the EOI in the timer interrupt before calling the scheduler. I'm not sure you've implemented a basic lock/unlock mechanism that can be used when calling the scheduler directly from a task - In a pre-emptive environment you'd need something like that.
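The ordering and locking points above can be sketched as a small host-side simulation in C++. Everything here is a hypothetical stand-in: a real kernel would write the EOI to the interrupt controller and actually switch stacks, and the lock shown (a depth counter that postpones switches) is just one possible design, not necessarily what any particular OS uses.

```cpp
#include <string>
#include <vector>

// Hypothetical scheduler model that records what would happen, in order.
struct Scheduler
{
    int lockDepth = 0;            // > 0 means task switches are postponed
    bool switchPending = false;   // a switch was requested while locked
    std::vector<std::string> log; // trace of events for inspection

    void lock() { ++lockDepth; }

    void unlock()
    {
        // Perform the postponed switch once the outermost lock is released.
        if (--lockDepth == 0 && switchPending)
        {
            switchPending = false;
            schedule();
        }
    }

    void schedule()
    {
        if (lockDepth > 0)
        {
            switchPending = true;
            log.push_back("postponed");
            return;
        }
        log.push_back("switch"); // a real kernel would switch stacks here
    }
};

// Stand-in for acknowledging the interrupt at the interrupt controller.
inline void sendEoi(Scheduler& s) { s.log.push_back("eoi"); }

inline void timerInterrupt(Scheduler& s)
{
    sendEoi(s);   // acknowledge first...
    s.schedule(); // ...because this may not come back until the task is resumed
}
```

The EOI goes first because the scheduler may switch to another task and not return to this handler for a long time. Until the interrupt is acknowledged, the controller holds back further timer interrupts, so the newly selected task would never be pre-empted.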
-
- Member
- Posts: 829
- Joined: Fri Aug 26, 2016 1:41 pm
- Libera.chat IRC: mpetch
Re: Context not switching (only running one process)
I should have looked through the GitHub repos referenced in your Gists. I see PenguinOS seems to have the code in question. Right off the bat I see that, although your code is mostly your own, some of it is roughly based on code from the James Molloy tutorial (or on other code derived from that tutorial). The OSDev Wiki has errata for that tutorial, which include a bug related to possible corruption of the stack by interrupt handlers. See: https://wiki.osdev.org/James_Molloy%27s ... pted_state . I have written about this interrupt stub/handler issue on Stack Overflow, with a way to fix it: https://stackoverflow.com/a/56486184/3857942 . The bug is related to passing the stack frame (CPUState) by value on the stack instead of by reference (pointer). You should be able to apply that kind of change to your interrupt stubs and handlers.
Whether that was causing the grief or not I don't know but I do know that this has caused issues for many people doing OSDev based on that tutorial. If there is any other code from that tutorial you may have used I'd look through the errata I've linked to earlier.
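The difference between the two handler signatures can be shown with a short C++ sketch. The `CPUState` fields and both handler names are hypothetical and greatly reduced; the sketch only illustrates who owns the memory the handler works on.

```cpp
#include <cstdint>

// Hypothetical, reduced subset of the registers an interrupt stub saves.
struct CPUState
{
    std::uint32_t eax;
    std::uint32_t eip;
};

// By value: the parameter is the callee's own object, so the compiler is free
// to modify or reuse that storage while the handler runs.
inline void handlerByValue(CPUState state) { state.eax = 0; }

// By pointer: the handler reads and writes the saved frame explicitly, and
// only the pointer itself is the callee's to clobber.
inline void handlerByPointer(CPUState* state) { state->eax = 42; }
```

In hosted C++ the by-value call works on a copy, so the caller's object is untouched. An assembly stub makes no copy: the registers it pushed are themselves the "by value" argument, so anything the compiler does to its parameter storage lands in the state that is later restored. Passing a pointer to the saved frame avoids that.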
-
- Member
- Posts: 829
- Joined: Fri Aug 26, 2016 1:41 pm
- Libera.chat IRC: mpetch
Re: Context not switching (only running one process)
I had some time this evening to make a pull request at https://github.com/thomascswalker/PenguinOS/pull/31 with fixes that may allow you to get pre-emptive multitasking working. The code of course doesn't work with SMP. The pull request adds scheduler lock/unlock functions; adds a yield function; fixes the interrupt stubs that previously passed CPUState by value and not by reference, per my earlier comments; adds a processStartup function that gets executed before a new process runs for the first time (currently it just unlocks the scheduler); changes where EOI handling is done so that an interrupt handler can send an EOI when it needs to and not necessarily at the end of the handler; and makes the timer interrupt handler send the EOI before calling the scheduler (not after).
-
- Posts: 6
- Joined: Mon Jan 13, 2025 9:56 am
Re: Context not switching (only running one process)
Thank you so much Michael! Your explanations helped a lot, as well as your PR. On to the next goal haha