[Solved] Strategy to set up a new process

Post by **codyd51** » Tue Jun 26, 2018 1:00 pm

Hello!

I'm reimplementing task-switching in my OS, using the discussion in this post as a guide: viewtopic.php?f=1&t=30601.

The way it works is, `context_switch` is periodically called from the PIT interrupt. The common IRQ handler saves and restores registers, so as long as `context_switch` leaves the machine with a valid mid-interrupt kernel stack, things should work fine.

The `context_switch` function looks like so:

Code:

context_switch:
    ; EBX, ESI, EDI and EBP are callee-saved and can be used
	mov ebx, [esp+4] ; get new task's esp from arg to ebx
	push ecx
	push edx
	push eax
	call read_eip ; the jmp call will return to here
_context_entry:
	cmp esi, 0xdeadbeef ; if esi contains this magic value, then the task switch has completed and we should return to caller
	je _context_switch_ret

	push esp

	mov esp, ebx ; load new process's esp
	pop ebx ; pop eip into ebx
	pop eax ; pop eax into eax
	pop edx ; pop edx into edx
	pop ecx ; pop ecx into ecx
	
	mov esi, 0xdeadbeef ; magic value to detect when the new task has begun executing
	jmp ebx

_context_switch_ret:
	ret

My question is, how should I set up a new process so that when returning from the interrupt, things will execute normally? As I understand it, when a new task runs for the first time, this will happen:

* load esp, pop address of _context_entry into eip, and zero into eax, edx, and ecx.
* ret gets executed, which will pop the stack frame.

This means that I must set up the initial stack so that, after the ret, control returns to the function which called context_switch(), which will return in turn, and finally control will return to the interrupt handler, which will pop machine state and iret.

According to this, it seems the task construction must set up several stack frames, and must also know the address of the line after the context_switch() call. My intuition tells me this can't be the correct approach.

Could someone provide some insight on the correct way to set up a new task so it will return normally from the interrupt handler the first time it runs?

This is the current relevant code for constructing a new task:

Code:

    uint32_t stack_size = 0x2000;
    char* stack = kmalloc(stack_size);
    uint32_t* stack_top = (uint32_t*)(stack + stack_size - 0x4); // point to top of malloc'd stack
    *(stack_top--) = 0xdeadd00d; //ecx
    *(stack_top--) = 0xcafed00d; //edx
    *(stack_top--) = 0xcafebabe; //eax
    *(stack_top) = & _context_entry; //eip
    initial_register_state.esp = (uint32_t)stack_top;
    initial_register_state.ebp = (uint32_t)stack_top;

    new_task->register_state = initial_register_state;

Thanks a lot!

Post by **Brendan** » Tue Jun 26, 2018 9:56 pm

Hi,

codyd51 wrote:The `context_switch` function looks like so:

Code:

context_switch:
    ; EBX, ESI, EDI and EBP are callee-saved and can be used
	mov ebx, [esp+4] ; get new task's esp from arg to ebx
	push ecx
	push edx
	push eax
	call read_eip ; the jmp call will return to here
_context_entry:
	cmp esi, 0xdeadbeef ; if esi contains this magic value, then the task switch has completed and we should return to caller
	je _context_switch_ret

	push esp

	mov esp, ebx ; load new process's esp
	pop ebx ; pop eip into ebx
	pop eax ; pop eax into eax
	pop edx ; pop edx into edx
	pop ecx ; pop ecx into ecx
	
	mov esi, 0xdeadbeef ; magic value to detect when the new task has begun executing
	jmp ebx

_context_switch_ret:
	ret

The value for EIP is already saved on the thread's stack (by the call) and restored by the RET at the end, so there's no need to touch it. For example, the context switch code could/should be simplified to:

Code:

context_switch:

;Save the old task's state

	;Save registers that are supposed to be callee-saved

	push eax
	push ebx
	push esi
	push edi
	push ebp

	mov esi,[current_task]  ;Get address of old task's "thread control block" from global variable
	mov [esi+TCB.esp],esp   ;Save old task's ESP in its "thread control block"

;Load the new task's state

	mov edi,[esp+(5+1)*4]   ;Get address of new task's "thread control block" from arg (skip 5 pushed registers + return address)
	mov [current_task],edi  ;Set address of new task's "thread control block" in global variable for later
	mov esp,[edi+TCB.esp]   ;Load new ESP from new task's "thread control block"

	;Restore registers that were callee-saved

	pop ebp
	pop edi
	pop esi
	pop ebx
	pop eax

	;Return to the new task's EIP (that was stored on its stack)

	ret

codyd51 wrote:My question is, how should I set up a new process so that when returning from the interrupt, things will execute normally? As I understand it, when a new task runs for the first time, this will happen:

* load esp, pop address of _context_entry into eip, and zero into eax, edx, and ecx.
* ret gets executed, which will pop the stack frame.

Yes. More specifically, when creating a new task (its "thread control block" and its kernel stack), make it match what your "context_switch:" routine expects (e.g. with things on its kernel stack in the correct order for them to be popped off the stack).

codyd51 wrote:This means that I must set up the initial stack so that, after the ret, control returns to the function which called context_switch(), which will return in turn, and finally control will return to the interrupt handler, which will pop machine state and iret.

According to this, it seems the task construction must set up several stack frames, and must also know the address of the line after the context_switch() call. My intuition tells me this can't be the correct approach.

No. The interrupt (and its IRET) has nothing to do with any of this; and if there was originally an IRQ (often there is not) that asked the scheduler to do a task switch then its IRET will happen when the task that was interrupted gets CPU time again and won't happen when a completely different task is running.

When creating a new task you put values on its stack to match your "context_switch:" routine; and one of those values will be the EIP for the new task to start running. This "new task's EIP" can/should point to some kind of thread initialisation function. For some cases (spawning a new process) the "new task's EIP" might point to code that creates an address space (and then an ELF loader or something), for some cases (spawning a new user-space thread in an existing process) it might be code that sets up thread local storage before "returning" to a (provided) address in user-space, and for some cases (spawning a new kernel thread) it might be any number of custom "kernel task initialisation" routines. Note that for some of these purposes it's nice to put a generic extra parameter on the new task's stack that the task's initial function might want (e.g. so you can do "new_thread_init(void *user_space_EIP);" or "new_kernel_task_do_something_with_foo(void *foo)").

codyd51 wrote:Could someone provide some insight on the correct way to set up a new task so it will return normally from the interrupt handler the first time it runs?

This is the current relevant code for constructing a new task:

For my example "context_switch:" routine (above) the code for creating a new task might look a little bit more like:

Code:

int create_task(void *startup_function, void *extra_parameter) {
    uint32_t stack_size = 0x2000;
    char* stack = kmalloc(stack_size);
    if(stack == NULL) return E_NO_MEMORY;
    uint32_t* stack_top = (uint32_t*)(stack + stack_size - 0x4); // point to top of malloc'd stack

    // Values are stored in the reverse of the order in which context_switch pops them
    *(stack_top--) = (uint32_t)extra_parameter; //Extra generic parameter for task's startup function
    *(stack_top--) = (uint32_t)startup_function; //Address of task's startup function (consumed by the final RET)
    *(stack_top--) = 0; //eax
    *(stack_top--) = 0; //ebx
    *(stack_top--) = 0; //esi
    *(stack_top--) = 0; //edi
    *(stack_top) = 0; //ebp (popped first)

    initial_register_state.esp = (uint32_t)stack_top;
    return OK;
}

Note: I usually get the syntax for function pointers in C wrong (and they make code harder to read anyway), so I've been lazy and used "void * startup_function" instead of a function pointer. For real code you should use a function pointer (e.g. maybe like "int create_task( *(void startup_function(void *)), void *extra_parameter) {"?).
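As a rough host-side check of the stack layout above (a sketch only, not part of the original posts), a plain array can stand in for the `kmalloc`'d kernel stack and the tail of `context_switch` can be modelled as five pops followed by a `ret`. The names `STACK_WORDS`, `build_initial_stack` and `simulate_switch_tail` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16  /* hypothetical size; a real kernel stack is far larger */

/* Fill `stack` the way create_task does and return the index that plays
 * the role of the initial ESP (the lowest word written). Values go in from
 * the highest address downward, in the reverse of the pop order. */
static size_t build_initial_stack(uint32_t stack[STACK_WORDS],
                                  uint32_t startup_function,
                                  uint32_t extra_parameter)
{
    size_t sp = STACK_WORDS;          /* one past the top: an empty stack */
    stack[--sp] = extra_parameter;    /* left on the stack for the startup code */
    stack[--sp] = startup_function;   /* consumed by the final RET */
    stack[--sp] = 0;                  /* eax */
    stack[--sp] = 0;                  /* ebx */
    stack[--sp] = 0;                  /* esi */
    stack[--sp] = 0;                  /* edi */
    stack[--sp] = 0;                  /* ebp, popped first */
    return sp;
}

/* Model the tail of context_switch: five POPs, then RET.
 * Returns the value RET would load into EIP and advances *sp past it. */
static uint32_t simulate_switch_tail(const uint32_t stack[STACK_WORDS],
                                     size_t *sp)
{
    assert(*sp + 6 <= STACK_WORDS);   /* five registers plus the return address */
    *sp += 5;                         /* pop ebp, edi, esi, ebx, eax */
    return stack[(*sp)++];            /* ret */
}
```

After the simulated `ret`, the stack pointer rests on `extra_parameter`. Startup code written in assembly can simply pop it; a startup function written in C under cdecl would instead read that slot as its return address, so in that case a placeholder return address would probably be needed between the two entries.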

Cheers,

Brendan

Post by **nullplan** » Wed Jun 27, 2018 12:58 am

Brendan wrote:Note: I usually get the syntax for function pointers in C wrong (and they make code harder to read anyway), so I've been lazy and used "void * startup_function" instead of a function pointer. For real code you should use a function pointer (e.g. maybe like "int create_task( *(void startup_function(void *)), void *extra_parameter) {"?).

Well, then better yourself! To write a function pointer in C is simple: You first write a normal function declaration (e.g. "void startup_function(void*)"), and then you put parentheses around the name and an asterisk in front of it, e.g. "void (*startup_function)(void*)". That said, if you find yourself writing "int (*signal(int, int(*)(int)))(int);", then maybe it would have been better to write

Code:

typedef int sighand_t(int);
sighand_t* signal(int, sighand_t*);

Oh yeah, and you can typedef function types and spare yourself from using the function pointer syntax ever.
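Applied to the task-creation case discussed earlier, the typedef approach might look like the sketch below. The names `task_entry_t`, `task_desc`, `make_task`, `run_task` and `bump` are hypothetical and only illustrate the syntax; they are not from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* A function *type*: takes a void* argument and returns nothing. */
typedef void task_entry_t(void *);

/* Hypothetical record describing a task to start. */
struct task_desc {
    task_entry_t *entry;   /* same type as: void (*entry)(void *) */
    void *arg;
};

/* The parameter is a pointer to the typedef'd function type, so the
 * raw function-pointer syntax never appears in the signature. */
static struct task_desc make_task(task_entry_t *entry, void *arg)
{
    struct task_desc t = { entry, arg };
    return t;
}

static void run_task(const struct task_desc *t)
{
    t->entry(t->arg);      /* calling through the pointer needs no extra syntax */
}

/* Example entry point: increments the int that arg points to. */
static void bump(void *arg)
{
    int *counter = arg;
    ++*counter;
}
```

A `task_entry_t *` and a `void (*)(void *)` are the same type, so values can be assigned between the two spellings freely.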

Post by **codyd51** » Wed Jun 27, 2018 8:03 am

Thanks for the great answer, Brendan! I understand much better now. I was under the impression the new task needed to return from the interrupt handler, which according to your answer seems not to be the case.

Brendan wrote: No. The interrupt (and its IRET) has nothing to do with any of this; and if there was originally an IRQ (often there is not) that asked the scheduler to do a task switch then its IRET will happen when the task that was interrupted gets CPU time again and won't happen when a completely different task is running.

Just so I understand, this means that I must send the end-of-interrupt signal to the PIC before invoking any custom interrupt callbacks, is that correct? The way I have it now, the EOI signal is sent after invoking any callbacks, so IRQs would be blocked until the original task runs again and finishes up its common interrupt handler, which would never happen since more interrupts wouldn't be sent. I'll change my interrupt stub logic so EOI is sent before invoking callbacks, and try implementing as per your comments.

Thanks so much!

Post by **Brendan** » Wed Jun 27, 2018 8:28 am

Hi,

codyd51 wrote:
Brendan wrote:No. The interrupt (and its IRET) has nothing to do with any of this; and if there was originally an IRQ (often there is not) that asked the scheduler to do a task switch then its IRET will happen when the task that was interrupted gets CPU time again and won't happen when a completely different task is running.
Just so I understand, this means that I must send the end-of-interrupt signal to the PIC before invoking any custom interrupt callbacks, is that correct? The way I have it now, the EOI signal is sent after invoking any callbacks, so IRQs would be blocked until the original task runs again and finishes up its common interrupt handler, which would never happen since more interrupts wouldn't be sent. I'll change my interrupt stub logic so EOI is sent before invoking callbacks, and try implementing as per your comments.

Typically an IRQ handler would do whatever it does (take care of the reason why the device sent an interrupt - e.g. fetching a byte from the PS/2 controller, starting the next disk IO command in its queue, ...), then send the EOI to the PIC (or IO APIC), then do anything involving the scheduler (e.g. wake up a task that was waiting for keyboard or disk or network), then it'd do the IRET. Of course in between "do anything involving the scheduler" and "do the IRET" the scheduler might let 1234 other/different tasks run, but that won't matter because the device can still send another IRQ (which would just interrupt a different task).
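The ordering described above could be sketched roughly as follows. This is a host-side model rather than real driver code: `outb` is stubbed so that it only records the write, and the event log, `service_device`, `scheduler_wake_waiters` and `keyboard_irq_handler` are hypothetical names. Only the master PIC command port (0x20) and the EOI command value (0x20) come from the legacy 8259 PIC convention.

```c
#include <assert.h>
#include <stdint.h>

#define PIC1_COMMAND 0x20   /* master PIC command port */
#define PIC_EOI      0x20   /* end-of-interrupt command */

enum event { EV_DEVICE = 1, EV_EOI, EV_SCHEDULER };

static enum event event_log[8];
static int event_count;

static void log_event(enum event e)
{
    assert(event_count < 8);
    event_log[event_count++] = e;
}

/* Stub for the port-write primitive; a kernel would execute OUT here. */
static void outb(uint16_t port, uint8_t value)
{
    if (port == PIC1_COMMAND && value == PIC_EOI)
        log_event(EV_EOI);
}

/* Stand-in for the device-specific work, e.g. reading a PS/2 data byte. */
static void service_device(void) { log_event(EV_DEVICE); }

/* Stand-in for waking a waiting task; in a kernel this may switch tasks. */
static void scheduler_wake_waiters(void) { log_event(EV_SCHEDULER); }

/* Body of an IRQ handler for a device on the master PIC. The assembly
 * stub that wraps it performs the IRET after this function returns. */
static void keyboard_irq_handler(void)
{
    service_device();               /* 1. deal with the cause of the interrupt */
    outb(PIC1_COMMAND, PIC_EOI);    /* 2. acknowledge, so further IRQs can arrive */
    scheduler_wake_waiters();       /* 3. only now involve the scheduler */
}
```

Because the EOI is sent before the scheduler is involved, it does not matter how long it takes for the interrupted task to reach its `iret`.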

Note: For multi-CPU I find it very convenient to have two kinds of spinlocks - one that causes task switches to be postponed (for that CPU), and another that causes task switches to be postponed and also disables IRQs (for that CPU). For both, you'd increment a "task switches disabled" counter when the lock is acquired, and if anything asks the scheduler to do a task switch while this counter is non-zero you'd set a "task switch was postponed" flag and wouldn't do the task switch at that time. Then, when the lock is released you'd decrement the "task switches disabled" counter and if it becomes zero you'd check if the "task switch was postponed" flag was set, and if it was you'd do the task switch that was postponed. This same "task switch postponed" logic can be used within IRQ handlers to postpone any task switches until the end of the IRQ handler just by incrementing the same "task switches disabled" counter at the start of the IRQ handler and doing the same "decrement and test for zero" at the end of the IRQ handler. In general, this means that there are fewer things you have to worry about within the IRQ handler and elsewhere (e.g. an IRQ handler can call a function that calls a function that calls a function that asks the scheduler to do a task switch; and you won't need to keep track of what may call what to avoid problems). In my case (micro-kernel), most IRQ handlers just call a "sendMessage()" function (to send a message to the device driver/s in user-space), and this "task switch postponed" logic means that the IRQ handler/s don't need to care if "sendMessage()" does or doesn't cause a higher priority task (that was waiting for a message) to unblock and preempt the currently running (lower priority) task.
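A minimal single-CPU sketch of the counter-and-flag logic in the note above, under the assumption that the state would be per-CPU in a real kernel. All names are hypothetical, the actual spinlock and the IRQ-disabling variant are left out, and `do_task_switch` just counts calls in place of a real context switch.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-CPU state in a real kernel; a single CPU is modelled here. */
static int  task_switches_disabled;   /* nesting counter */
static bool task_switch_postponed;    /* a switch was requested while disabled */
static int  switches_performed;       /* stands in for the real context switch */

static void do_task_switch(void) { switches_performed++; }

/* Called by anything that wants the scheduler to switch tasks. */
static void request_task_switch(void)
{
    if (task_switches_disabled > 0) {
        task_switch_postponed = true;  /* remember the request for later */
        return;
    }
    do_task_switch();
}

/* Acquire side: used when taking a lock and at the start of an IRQ handler. */
static void postpone_task_switches(void)
{
    task_switches_disabled++;
}

/* Release side: once the counter reaches zero, run any postponed switch. */
static void allow_task_switches(void)
{
    assert(task_switches_disabled > 0);
    if (--task_switches_disabled == 0 && task_switch_postponed) {
        task_switch_postponed = false;
        do_task_switch();
    }
}
```

Several requests made while the counter is non-zero collapse into a single switch on release, which is what lets nested callers ask for a switch without knowing who else holds a lock.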

Cheers,

Brendan

Post by **codyd51** » Wed Jun 27, 2018 3:23 pm

Brendan wrote: Typically an IRQ handler would do whatever it does (take care of the reason why the device sent an interrupt - e.g. fetching a byte from the PS/2 controller, starting the next disk IO command in its queue, ...), then send the EOI to the PIC (or IO APIC), then do anything involving the scheduler (e.g. wake up a task that was waiting for keyboard or disk or network), then it'd do the IRET. Of course in between "do anything involving the scheduler" and "do the IRET" the scheduler might let 1234 other/different tasks run, but that won't matter because the device can still send another IRQ (which would just interrupt a different task).

Mind explodes

Brilliant! I had been under the incorrect assumption that an iret was necessary to keep the interrupt system working, but this was clearly wrong.

The new context switch logic works! I'm very excited, I've been banging my head against this for a few days. Thanks a lot for explaining all of that, and have a great day!

Marking this as solved.

OSDev.org
