
implementing threads

Posted: Sat May 22, 2021 4:31 pm
by newosdeveloper2021
I'm having an issue implementing threads and basic scheduling. I'm looking for simple code to look at (not to paste), just wondering how they work.

Re: implementing threads

Posted: Sun May 23, 2021 12:58 am
by neon
Hi,

For perhaps the simplest scheduling scheme, let us assume a round-robin approach. We can also assume preemptive multitasking, as that is probably the goal here: we would use a timer to count how long a thread has executed and decide whether to preempt it.

A basic round-robin scheduler... don't overcomplicate this one. It is quite literally just a linked list of threads. That's all. Implement a Schedule function that just selects the next item in the list. That is, if you have a linked list, how would you arrange it so that every time you call Schedule it returns the next item in the list (wrapping back to the first when it runs out)?
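For example, with a circular list it is only a couple of lines. This is just a sketch; struct thread and the names here are made up for illustration:

[code]
/* A circular singly linked run queue; "current" is the running thread. */
struct thread {
    struct thread *next;            /* next thread in the circular run queue */
    /* ... execution context, parent process, etc. ... */
};

static struct thread *current;      /* thread currently running on this CPU */

/* Round robin: just step to the next entry. Because the list is circular,
 * it wraps back to the first thread automatically. */
struct thread *Schedule(void)
{
    current = current->next;
    return current;
}
[/code]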

To create a thread... Threads must store an execution context (register state). Threads have an associated Process that has an Address Space. You would have a function like CreateThread(process, entry_point, props) that would clear the execution context of the thread and set its context EIP/RIP to entry_point. The thread would be created for "process" and added to the Scheduler. Thread->process should refer to the parent Process object.
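In rough C, CreateThread could look something like the sketch below (32-bit protected mode assumed). kmalloc, AllocateKernelStack and SchedulerAdd are hypothetical helpers, and a real thread object would also carry an ID, state, kernel stack and so on:

[code]
#include <stdint.h>
#include <string.h>

struct process;                      /* owns the address space; opaque here */

struct registers {                   /* saved execution context */
    uint32_t eip, esp, eflags;
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp;
};

struct thread {
    struct registers  context;       /* register state to restore when scheduled */
    struct process   *process;       /* back-pointer to the parent process */
    struct thread    *next;          /* scheduler run-queue linkage */
};

/* Hypothetical helpers assumed to exist elsewhere in the kernel. */
extern void    *kmalloc(size_t size);
extern uint32_t AllocateKernelStack(void);
extern void     SchedulerAdd(struct thread *t);

struct thread *CreateThread(struct process *process, void (*entry_point)(void))
{
    struct thread *t = kmalloc(sizeof(*t));
    memset(&t->context, 0, sizeof(t->context));   /* clear the execution context */
    t->context.eip = (uint32_t)entry_point;       /* thread starts executing here */
    t->context.esp = AllocateKernelStack();       /* give it a stack to run on */
    t->process = process;                         /* created for "process" */
    SchedulerAdd(t);                              /* hand it to the scheduler */
    return t;
}
[/code]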

To switch to a thread... we have to switch to the thread before we can execute it. We call the Scheduler to tell us what thread to switch to. If the thread->process address space is different, we will need to switch to the new address space.

To execute a thread... we need to restore the thread's execution context (register state). This is typically done in assembly language: we PUSH the items we need back onto the thread's stack and then use IRET to run it. IRET is used because it supports switching the CPU to user mode as well as kernel mode.

It can be helpful to experiment with the IRET instruction and consider how we can set whatever the return CS:IP is for it.
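For instance, a classic experiment is building an IRET frame by hand to jump to a user-mode entry point. A sketch for 32-bit protected mode follows; the struct just documents the frame layout, the selectors 0x1B/0x23 are example user code/data selectors for a typical flat GDT, and the data segment registers would also need to be loaded with the user data selector beforehand:

[code]
#include <stdint.h>

/* Fields in the order the CPU pops them during IRET; SS:ESP are only
 * popped when returning to a lower privilege level. */
struct iret_frame {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;
    uint32_t ss;
};

/* Build the frame on the current stack and IRET into user mode. */
static inline void enter_user_mode(uint32_t entry, uint32_t user_stack)
{
    asm volatile(
        "pushl $0x23\n"        /* SS : user data selector (example value) */
        "pushl %0\n"           /* ESP: user-mode stack                    */
        "pushl $0x202\n"       /* EFLAGS: IF set so interrupts stay on    */
        "pushl $0x1B\n"        /* CS : user code selector (example value) */
        "pushl %1\n"           /* EIP: where the thread starts            */
        "iret\n"               /* pop everything and begin executing      */
        :
        : "r"(user_stack), "r"(entry)
        : "memory");
}
[/code]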

To preempt a thread... As long as the timer interrupt is not masked, it will always invoke the timer ISR. The system needs to track how long each thread has been running. When the thread's time slice is used up, the ISR needs to do the following: (1) save the execution context of the current thread, (2) call Schedule to get the next thread, (3) execute this thread.
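Put together, the C side of that ISR might look roughly like the sketch below. It assumes the assembly entry stub has already pushed the register state into a struct registers on the kernel stack and passes a pointer here; Schedule, SwitchAddressSpace, lapic_eoi and the quantum length are illustrative names, not from any real kernel:

[code]
#include <stdint.h>

struct registers {                         /* simplified; matches what the stub pushes */
    uint32_t eip, esp, eflags;
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp;
};

struct process;
struct thread {
    struct registers context;
    struct process  *process;
};

extern struct thread *current;             /* thread running on this CPU */
extern struct thread *Schedule(void);
extern void SwitchAddressSpace(struct process *p);
extern void lapic_eoi(void);

#define QUANTUM_TICKS 10                   /* example time slice, in timer ticks */
static unsigned ticks_this_slice;

void TimerTick(struct registers *frame)
{
    lapic_eoi();                           /* acknowledge the interrupt first */

    if (++ticks_this_slice < QUANTUM_TICKS)
        return;                            /* quantum not used up yet */
    ticks_this_slice = 0;

    current->context = *frame;             /* (1) save the interrupted thread's context */
    struct thread *next = Schedule();      /* (2) pick the next thread to run */
    if (next->process != current->process)
        SwitchAddressSpace(next->process); /* only reload the address space if it changed */
    current = next;
    *frame = current->context;             /* (3) the stub restores this frame and IRETs */
}
[/code]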

Semaphores and mutual exclusion / critical sections... be absolutely sure you have a good understanding of these.
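As one small example, the run queue above needs protecting once more than one path can touch it. This is a minimal spinlock sketch using the GCC/Clang atomic built-ins; on a single-core kernel you would typically also disable interrupts around the critical section:

[code]
typedef struct {
    volatile unsigned char locked;      /* 0 = free, 1 = held */
} spinlock_t;

static inline void spin_lock(spinlock_t *l)
{
    /* Atomically set the flag; keep spinning while someone else holds it. */
    while (__atomic_test_and_set(&l->locked, __ATOMIC_ACQUIRE))
        ;
}

static inline void spin_unlock(spinlock_t *l)
{
    __atomic_clear(&l->locked, __ATOMIC_RELEASE);
}
[/code]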

Putting everything together... It is helpful to keep in mind that the kernel is shared memory: it is effectively code added to any thread that executes it. So when a thread is preempted by the timer ISR -- we are still in that thread, just executing kernel code. The timer ISR therefore already has part of the execution context of the current thread on the stack, pushed by the CPU. The ISR saves the current context to the thread. The ISR calls an operating system Tick function that may call Schedule. If the next thread is in a new process, we switch to the new address space. When the ISR returns, it restores whatever context is in the current thread (which was selected by Schedule) and issues IRET.

--
Due to the number of pieces involved in multithreading, it can be helpful to pinpoint specific areas you are not sure about or would like more clarification or samples on.

Re: implementing threads

Posted: Sun May 23, 2021 7:52 am
by newosdeveloper2021
Since CPUs support a limited number of actual, physical threads, is there a way to interface with the CPU to run actual threads? And, as is typical (I guess), those threads are at various priority levels running code for other threads.

EDIT: I mean all the physical cores can be part of the scheduler

Re: implementing threads

Posted: Sun May 23, 2021 8:35 am
by rdos
neon wrote: To create a thread... Threads must store an execution context (register state). Threads have an associated Process that has an Address Space. You would have a function like CreateThread(process, entry_point, props) that would clear the execution context of the thread and set its context EIP/RIP to entry_point. The thread would be created for "process" and added to the Scheduler. Thread->process should refer to the parent Process object.
I think you over-complicate things by passing a process context to thread creation. Any normal use of thread creation would create the thread in the current process. When you create a process, you also create the first thread, but this should be implicit in the process creation. Typically, initial thread creation for a process will not pass an entry point (at least not a user-space entry point); rather, this is resolved as the application program is loaded, and the executable header indicates the entry point.
neon wrote: To switch to a thread... we have to switch to the thread before we can execute it. We call the Scheduler to tell us what thread to switch to. If the thread->process address space is different, we will need to switch to the new address space.
Switching processes should be as simple as reloading CR3, and the CR3 value could be saved in the thread control block so the task switcher doesn't have to know about processes.
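In outline (just a sketch, not my actual code; the field names are invented here):

[code]
#include <stdint.h>

struct thread {
    uint32_t cr3;        /* physical address of this thread's page directory */
    /* ... saved registers, stacks, and other thread control block state ... */
};

static inline void load_address_space(struct thread *t)
{
    uint32_t old;
    asm volatile("mov %%cr3, %0" : "=r"(old));
    if (old != t->cr3)                    /* avoid a needless TLB flush */
        asm volatile("mov %0, %%cr3" :: "r"(t->cr3) : "memory");
}
[/code]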
neon wrote: To execute a thread... we need to restore the threads execution context (register state.) This is typically done in assembly language: we PUSH the items we need back on the thread-local stack and then use IRET to run it. IRET is used as it supports switching the CPU context to user mode as well as kernel mode.
I think saving register state on the stack is a bad approach. It should be saved in the thread control block. That makes it much easier to dump thread state (and change it). There is also FPU state, which also should be saved in the thread control block when needed.

My design has a per-core scheduler stack. As part of saving thread state, the scheduler stack is loaded, and then when a new thread is switched to, the new thread stack is loaded. When a thread's state is saved, the return address is specified (and saved on the thread stack), and when the thread is loaded again, the last operation is to ret back to the return address. The return point can be anywhere, and there can be additional state on the thread kernel stack. This makes it easier to block threads in various contexts since there is no need for an iret frame to exist.
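To give the flavour of it (again only a sketch, not my actual code): the callee-saved registers go into the thread control block, the kernel stack pointer is saved with the return address still on that stack, and resuming a thread is a plain ret. 32-bit, cdecl, GNU assembler syntax:

[code]
#include <stdint.h>

struct tcb {
    uint32_t ebx, esi, edi, ebp;   /* callee-saved registers */
    uint32_t esp;                  /* kernel stack pointer; points at the saved return address */
};

void switch_to(struct tcb *from, struct tcb *to);

asm(
    ".globl switch_to\n"
    "switch_to:\n"
    "    movl 4(%esp), %eax\n"     /* 'from' thread's control block              */
    "    movl %ebx,  0(%eax)\n"
    "    movl %esi,  4(%eax)\n"
    "    movl %edi,  8(%eax)\n"
    "    movl %ebp, 12(%eax)\n"
    "    movl %esp, 16(%eax)\n"    /* esp currently points at our return address */
    "    movl 8(%esp), %eax\n"     /* 'to' thread's control block                */
    "    movl  0(%eax), %ebx\n"
    "    movl  4(%eax), %esi\n"
    "    movl  8(%eax), %edi\n"
    "    movl 12(%eax), %ebp\n"
    "    movl 16(%eax), %esp\n"    /* switch onto the new thread's kernel stack  */
    "    ret\n"                    /* resume at whatever return point it saved   */
);
[/code]

Since a blocked thread simply resumes wherever it called switch_to from, there is no iret frame involved in the switch itself.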
neon wrote: It can be helpful to experiment with the IRET instruction and consider how we can set whatever the return CS:IP is for it.
The scheduler should not rely on iret frames.

Rather, the preemption IRQ should call the scheduler to switch threads, and when the same thread is resumed, it will execute the iret part of the normal IRQ handler. Note, however, that the IRQ must be acknowledged (EOI) prior to calling the scheduler.

Re: implementing threads

Posted: Sun May 23, 2021 8:54 am
by neon
Hi,
is there a way to interface with the CPU to run actual threads?
Of course. But I do highly recommend having the framework for multithreading in place first. There are hardware tasks -- and in fact you must always have one for use with user mode -- however, hardware multitasking is very limited, so it is typically recommended to implement multithreading in software.

However... what you are talking about is a separate but related topic -- multiprocessing -- which is described by the MP standard. For purposes of discussion, a "processor" here is a "logical processor", i.e. a processor core: you can have one physical processor with 4 cores, or 4 logical processors. Continuing: the hardware selects a bootstrap processor (BSP) and application processors (APs). Each processor has a local APIC (LAPIC), and the processors can interface with each other via Inter-Processor Interrupts (IPIs) sent between their LAPICs; external device interrupts are routed through the I/O APIC (IOAPIC). Each processor has a unique LAPIC ID which is used to uniquely identify it. During startup, the BSP can send an IPI to the other processors to wake them. The APs start in real mode just like the BSP, so we would typically have each of them run stub code that configures the AP into 32-bit protected or 64-bit long mode and then jumps into the kernel main loop so it can execute threads. I believe the Wiki has articles on MP support that may be helpful for samples.

In summary... you are asking about MP support here, which is separate from multithreading. MP support does require APIC support (which in turn requires locating the LAPIC and IOAPIC, typically via the ACPI MADT or the MP tables). You would have the BSP send a wake-up IPI to each of the APs, which in turn run stub code to initialize themselves and then fall into the kernel main loop. The Scheduler can then assign threads to a processor based on load (or some other metric such as priority or affinity mask) using the processor's ID.
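As a rough idea of the wake-up part, the classic INIT-SIPI-SIPI sequence via the LAPIC's ICR looks something like the sketch below. The register offsets and 0xFEE00000 base are the usual defaults, the trampoline is assumed to already be copied to physical 0x8000, udelay is a hypothetical busy-wait helper, and details such as polling the ICR delivery status (and the INIT de-assert that older processors want) are glossed over -- check the Wiki and the SDM for the full sequence:

[code]
#include <stdint.h>

#define LAPIC_BASE   0xFEE00000u     /* default memory-mapped LAPIC base */
#define LAPIC_ICR_LO 0x300           /* Interrupt Command Register, low dword */
#define LAPIC_ICR_HI 0x310           /* Interrupt Command Register, high dword */

extern void udelay(unsigned microseconds);   /* hypothetical busy-wait */

static inline void lapic_write(uint32_t reg, uint32_t val)
{
    *(volatile uint32_t *)(LAPIC_BASE + reg) = val;
}

static void wake_ap(uint8_t apic_id)
{
    /* INIT IPI: put the AP into a known startup state. */
    lapic_write(LAPIC_ICR_HI, (uint32_t)apic_id << 24);
    lapic_write(LAPIC_ICR_LO, 0x00004500);      /* INIT, assert */
    udelay(10000);                              /* ~10 ms */

    /* STARTUP IPI, sent twice per the MP spec. Vector 0x08 means the AP
     * begins executing the trampoline at physical 0x8000 (vector * 4096). */
    for (int i = 0; i < 2; i++) {
        lapic_write(LAPIC_ICR_HI, (uint32_t)apic_id << 24);
        lapic_write(LAPIC_ICR_LO, 0x00004608);  /* SIPI, vector 0x08 */
        udelay(200);                            /* ~200 us */
    }
}
[/code]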

However... don't confuse MP with multithreading. For your sanity, I do highly recommend setting up multithreading first before attempting to add MP support, but keep concurrent programming practices in mind throughout.
I think you over-complicate stuff
No worries, please feel free to correct anything that you see: I wrote it very late last night and kept rewriting it, thinking about how best to describe everything, as I wanted to keep things precise and as clear as possible. My response isn't based on any particular design: I wanted to focus on the theory rather than design, as I believe that is the most important thing for understanding multithreading. I already have working sample code for threading on my site, so I didn't see any benefit in rewriting it here.
I think saving register state on the stack is a bad approach
It is indeed a bad idea: the quote there discusses restoring the thread's execution context, not storing it. Sorry if this was confusing -- I tried to make it as simple as possible but ironically made some of it sound more complicated, ha -- basically I agree with everything you said.