How is multithreading done at the OS level?
-
- Posts: 17
- Joined: Thu Aug 08, 2019 8:21 am
How is multithreading done at the OS level?
Sorry if this is covered before, I wasn't able to find a link to it.
When doing multithreading in C, we take advantage of the pthread library and call functions like:
pthread_create
pthread_join
etc.
How is this implemented in the OS itself? Given a system with multiple cores,
how does the OS select a specific CPU core to run instructions on?
When doing multithreading in C, we take advantage of the pthread library and call functions like:
pthread_create
pthread_join
etc.
How is this implemented in the OS itself? Given a system with multiple cores,
how does the OS select a specific CPU core to run instructions on?
Re: How is multithreading done at the OS level?
Wrong levels of abstraction. To the OS scheduler, there are only ever tasks. Threads are tasks that share a single address space, but this is a mere detail. In fact, most schedulers don't really know, nor do they care to, whether the task they are scheduling in is single-threaded or multi-threaded. It is simply not necessary to know.
As for how the scheduler picks the CPU to run on, there are several factors to weigh against each other. Each CPU can only run a single task at any given time, so it might be prudent to move a task from a busy CPU to one that is less busy, but it is usually advantageous to limit CPU migrations. But, there is no cost associated with executing a new task on another CPU. However, I have to make clear that these days, most of the time, no task is runnable, and if a task becomes runnable it does not remain so for very long. So more than one CPU is almost never used. Beyond this advice I would suggest reading up on multi-CPU scheduling techniques, because that is what I will do when I get there.
As for how the scheduler picks the CPU to run on, there are several factors to weigh against each other. Each CPU can only run a single task at any given time, so it might be prudent to move a task from a busy CPU to one that is less busy, but it is usually advantageous to limit CPU migrations. But, there is no cost associated with executing a new task on another CPU. However, I have to make clear that these days, most of the time, no task is runnable, and if a task becomes runnable it does not remain so for very long. So more than one CPU is almost never used. Beyond this advice I would suggest reading up on multi-CPU scheduling techniques, because that is what I will do when I get there.
Carpe diem!
-
- Posts: 17
- Joined: Thu Aug 08, 2019 8:21 am
Re: How is multithreading done at the OS level?
Thanks but that didn't answer my question. I'm interested to know how a specific cpu is picked to run instructions on.nullplan wrote:Wrong levels of abstraction. To the OS scheduler, there are only ever tasks. Threads are tasks that share a single address space, but this is a mere detail. In fact, most schedulers don't really know, nor do they care to, whether the task they are scheduling in is single-threaded or multi-threaded. It is simply not necessary to know.
As for how the scheduler picks the CPU to run on, there are several factors to weigh against each other. Each CPU can only run a single task at any given time, so it might be prudent to move a task from a busy CPU to one that is less busy, but it is usually advantageous to limit CPU migrations. But, there is no cost associated with executing a new task on another CPU. However, I have to make clear that these days, most of the time, no task is runnable, and if a task becomes runnable it does not remain so for very long. So more than one CPU is almost never used. Beyond this advice I would suggest reading up on multi-CPU scheduling techniques, because that is what I will do when I get there.
Let's say i'm writing my own "hello world" kernel using this tutorial: https://wiki.osdev.org/Bare_Bones
Assuming the machine has 1 cpu, all the assembly code will just run on that one CPU. What if there are more CPUs and I decide run this all on another CPU, how do I accomplish this?
Are there specific instructions to do so? or specific procedures?
Re: How is multithreading done at the OS level?
The most useful mental model I have of an OS in a multi-CPU system is that all CPUs are executing the same operating system independently of each other. There are specific instructions you can use to start the other CPUs and make them run whatever code you want, but after that you just have them execute the scheduler, and that will somehow load a task into the CPU when there actually is one.
Getting a bit more concrete, the procedure to start secondary CPUs in an x86 based system is described in Intel's Software Developers' Manual. Once you know the local APIC IDs of an enabled CPU, you can send it an INIT IPI, followed by a brief wait and a startup IPI, followed by a long wait and another startup IPI if the CPU failed to respond. That will start the other CPU, but it will run in 16-bit protected mode. Indeed, the initial address will be xx00:0000, where xx is equal to the vector number sent in the startup IPI. This means that the initial code for the AP has to be loaded below the 1MB mark. In my case, the trampoline code will transition to 32-bit protected mode, followed by 64-bit long mode (the page tables must have been taken care of by the BSP), and then it will jump into the entry point I have specified. That entry point will initialize GDT and IDT to their final versions, free the trampoline memory, and execute the scheduler. At this point, the scheduler will execute the idle task, since the CPU has only just announced that it is usable, so there will be no task in its queue. Once a task has been created that actually could run on the other CPU, the scheduler notices the unused CPU lying around and sends it an IPI after queuing up the task. The IPI wakes the AP from its slumber and makes it execute the new task.
I really can't get much more concrete, because at this point this is all highly OS-specific. If you don't know how to send an IPI, I suggest reading up on the APIC. An IPI is vectored like an external interrupt, so you just install your handler in the appropriate IDT slot.
Getting a bit more concrete, the procedure to start secondary CPUs in an x86 based system is described in Intel's Software Developers' Manual. Once you know the local APIC IDs of an enabled CPU, you can send it an INIT IPI, followed by a brief wait and a startup IPI, followed by a long wait and another startup IPI if the CPU failed to respond. That will start the other CPU, but it will run in 16-bit protected mode. Indeed, the initial address will be xx00:0000, where xx is equal to the vector number sent in the startup IPI. This means that the initial code for the AP has to be loaded below the 1MB mark. In my case, the trampoline code will transition to 32-bit protected mode, followed by 64-bit long mode (the page tables must have been taken care of by the BSP), and then it will jump into the entry point I have specified. That entry point will initialize GDT and IDT to their final versions, free the trampoline memory, and execute the scheduler. At this point, the scheduler will execute the idle task, since the CPU has only just announced that it is usable, so there will be no task in its queue. Once a task has been created that actually could run on the other CPU, the scheduler notices the unused CPU lying around and sends it an IPI after queuing up the task. The IPI wakes the AP from its slumber and makes it execute the new task.
I really can't get much more concrete, because at this point this is all highly OS-specific. If you don't know how to send an IPI, I suggest reading up on the APIC. An IPI is vectored like an external interrupt, so you just install your handler in the appropriate IDT slot.
Carpe diem!
Re: How is multithreading done at the OS level?
Are "legacy-free" systems different? It'll be interesting if they're not. (Now I'm wondering about running different cores in different modes, but that's another topic.)nullplan wrote:That will start the other CPU, but it will run in 16-bit protected mode.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
Re: How is multithreading done at the OS level?
No, they are not. But nullplan probably meant 16-bit real mode unless I am missing something.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Re: How is multithreading done at the OS level?
Maybe try to look it the other way around: all cpu is running in parallel, and you don't pick a cpu to run a task on, rather you pick a task on each and every cpu. When you move a task from one cpu core to another, that's a special function called task migration. So all cpus are running the same scheduler, and they all have their own task queue. If one queue is empty, and another is full, then you'll need to do task migration, moving some tasks from the full queue to the empty one. After that both cpus will have a scheduler queue with tasks to pick from.nightcrawler wrote:I'm interested to know how a specific cpu is picked to run instructions on.
Cheers,
bzt
Re: How is multithreading done at the OS level?
YES, yes, that is what I meant. Geez, what a bad mistake to make.Korona wrote:No, they are not. But nullplan probably meant 16-bit real mode unless I am missing something.
"Legacy-free" typically refers to the selection of hardware on the mainboard. At least, I have never heard it used otherwise in the context of PC hardware. As such, a legacy-free system may be lacking a traditional BIOS, and it may lack a PIT and a PIC pair, and an ISA DMA unit (etc. I can do this all day), but it will still boot up in 16-bit real mode. It will be quickly made to transition into another mode, but that's where it will start, as long as Intel don't release new CPUs that lack these modes. In that case we will all have to rewrite our trampolines, but that's a story for another time.eekee wrote:Are "legacy-free" systems different?
Well, the different CPUs are entirely independent of each other, other than sharing physical memory, so if you want to do such a thing, knock yourself out. I don't know why you would want to, but you probably could.eekee wrote:Now I'm wondering about running different cores in different modes, but that's another topic.
This is why I'm not a teacher: you explain this so much better than me.bzt wrote:Maybe try to look it the other way around: all cpu is running in parallel, and you don't pick a cpu to run a task on, rather you pick a task on each and every cpu.
Carpe diem!
Re: How is multithreading done at the OS level?
Ah, thanks.Korona wrote:No, they are not. But nullplan probably meant 16-bit real mode unless I am missing something.
I see. I guess I will have to write that code one day. Not for a long time yet.nullplan wrote:"Legacy-free" typically refers to the selection of hardware on the mainboard. At least, I have never heard it used otherwise in the context of PC hardware. As such, a legacy-free system may be lacking a traditional BIOS, and it may lack a PIT and a PIC pair, and an ISA DMA unit (etc. I can do this all day), but it will still boot up in 16-bit real mode. It will be quickly made to transition into another mode, but that's where it will start, as long as Intel don't release new CPUs that lack these modes. In that case we will all have to rewrite our trampolines, but that's a story for another time.eekee wrote:Are "legacy-free" systems different?
Ultimate dosemu. Actually, this might seriously be useful for upgrading some ancient DOS-based process control system.nullplan wrote:Well, the different CPUs are entirely independent of each other, other than sharing physical memory, so if you want to do such a thing, knock yourself out. I don't know why you would want to, but you probably could.eekee wrote:Now I'm wondering about running different cores in different modes, but that's another topic.
That way of looking at it makes it much clearer to me too.bzt wrote:Maybe try to look it the other way around: all cpu is running in parallel, and you don't pick a cpu to run a task on, rather you pick a task on each and every cpu.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie