Now, looking at Brendan's tutorial https://wiki.osdev.org/Brendan%27s_Mult ... g_Tutorial, his switch_to_task example:
;C declaration:
; void switch_to_task(thread_control_block *next_thread);
;
;WARNING: Caller is expected to disable IRQs before calling, and enable IRQs again after function returns
switch_to_task:
;Save previous task's state
;Notes:
; For cdecl; EAX, ECX, and EDX are already saved by the caller and don't need to be saved again
; EIP is already saved on the stack by the caller's "CALL" instruction
; The task isn't able to change CR3 so it doesn't need to be saved
; Segment registers are constants (while running kernel code) so they don't need to be saved
push ebx
push esi
push edi
push ebp
mov edi,[current_task_TCB] ;edi = address of the previous task's "thread control block"
mov [edi+TCB.ESP],esp ;Save ESP for previous task's kernel stack in the thread's TCB
;Load next task's state
mov esi,[esp+(4+1)*4] ;esi = address of the next task's "thread control block" (parameter passed on stack)
mov [current_task_TCB],esi ;Current task's TCB is the next task TCB
mov esp,[esi+TCB.ESP] ;Load ESP for next task's kernel stack from the thread's TCB
mov eax,[esi+TCB.CR3] ;eax = address of page directory for next task
mov ebx,[esi+TCB.ESP0] ;ebx = address for the top of the next task's kernel stack
mov [TSS.ESP0],ebx ;Adjust the ESP0 field in the TSS (used by CPU for CPL=3 -> CPL=0 privilege level changes)
mov ecx,cr3 ;ecx = previous task's virtual address space
cmp eax,ecx ;Does the virtual address space need to be changed?
je .doneVAS ; no, virtual address space is the same, so don't reload it and cause TLB flushes
mov cr3,eax ; yes, load the next task's virtual address space
.doneVAS:
pop ebp
pop edi
pop esi
pop ebx
ret ;Load next task's EIP from its kernel stack
I've searched this forum, and some people suggest just setting ESP0 to the top of the stack and never touching it again. I tried both approaches. When I change ESP0 dynamically (setting it in the TSS each time before switching to the task), I get random page faults that I cannot trace. It's probably stack corruption, because EIP was pointing at the pop gs in the syscall handler when the page fault happened.
When I never touch the ESP0 value, it works perfectly fine. Here's a screenshot (ESP0 is set only once at the beginning, to the top of the stack with room for the IRET and schedule() frames, so top - 40 bytes):
As you can see the userland thread successfully returns to the syscall in which it was interrupted.
And here's me updating the ESP0 in the TSS before switching to the task each time:
As you can see, it's a page fault at a completely random address, and both the faulting address and the address of the faulting instruction are different each time...
(That happens after the scheduler tries to switch from ring3 task to any other)
So my question is: what should I set ESP0 to? The current kernel stack top for the thread? And when do I update it? Brendan's tutorial doesn't show any code that ever updates ESP0.
Would really appreciate a detailed response on how to handle this, my brain is melting at this point...
UPD: after thinking about it more, I think I get it now. ESP0 should always point to the top of the thread's kernel stack (the position where the stack is empty). The CPU only fetches ESP0 from the TSS on a ring 3 -> ring 0 transition, and at that moment the kernel stack is 100% empty, so it doesn't make sense for ESP0 to point anywhere else. If the stack is not empty, we're already in ring 0, so ESP0 is never fetched from the TSS. Is this correct?
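To make sure I'm describing the same thing, here is a minimal C sketch of that rule as I understand it. It is a hosted simulation, not real kernel code: KSTACK_SIZE, the struct layouts, tcb_init and load_esp0 are names I made up for illustration, and the only parts taken from the tutorial are the TCB.ESP / TCB.ESP0 fields and the copy into TSS.ESP0.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_SIZE 4096u /* hypothetical per-thread kernel stack size */

/* Only the field that matters here; a real 32-bit TSS has many more. */
struct tss {
    uint32_t esp0;
};

/* Hypothetical TCB mirroring the TCB.ESP / TCB.ESP0 fields used in the asm above. */
struct tcb {
    uint32_t esp;  /* saved kernel ESP, changes on every task switch */
    uint32_t esp0; /* top of this thread's kernel stack, fixed at creation */
};

/* The stack grows down, so the "empty" position is base + size. */
static uint32_t kernel_stack_top(uint32_t stack_base)
{
    return stack_base + KSTACK_SIZE;
}

/* Set ESP0 once, when the thread is created. A real kernel would also
   push an initial frame below the top and store that address in esp. */
static void tcb_init(struct tcb *t, uint32_t stack_base)
{
    t->esp0 = kernel_stack_top(stack_base);
    t->esp = t->esp0;
}

/* C equivalent of "mov ebx,[esi+TCB.ESP0]" / "mov [TSS.ESP0],ebx":
   on every switch the TSS gets the NEXT thread's stack top,
   never that thread's current (saved) ESP. */
static void load_esp0(struct tss *tss, const struct tcb *next)
{
    assert(next->esp0 != 0); /* sanity check: the TCB was initialised */
    tss->esp0 = next->esp0;
}
```

So the per-thread value never changes after creation, but the TSS field is rewritten on every switch because each thread has its own kernel stack. If that is right, my random page faults probably came from writing something other than the stack top into the TSS.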