Why SMP?
Re: Why SMP?
I did much of my OS and scheduler before there were any (reasonable) SMP machines to buy, and it is a lot of hassle to convert a single-core OS to a multicore one. I eventually succeeded, but it was a lot of work, and I truly would have failed without extensive use of SVN. At one point I had hundreds of commits over several months that made the SMP core crash regularly, and I had no idea what I had done wrong. I eventually solved it with merging.
The only really significant difference between an SMP core and a single-core OS is that in an SMP core you must use spinlocks instead of cli/sti for synchronization. So, if you put spinlocks in your code from the start instead of sti/cli, then the move to SMP will be much easier.
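A minimal sketch of what such a spinlock might look like, assuming C11 atomics are available in the kernel's toolchain. The names `spinlock_t`, `spin_lock`, `spin_trylock`, `spin_unlock` and `counter_increment` are made up for illustration and are not taken from any particular OS. A real kernel would usually also disable interrupts on the local core while holding a lock that an interrupt handler may take.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set spinlock; all names are illustrative only. */
typedef struct {
    atomic_flag locked;
} spinlock_t;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

static inline void spin_lock(spinlock_t *l)
{
    /* test_and_set returns the previous value: spin while it was already set. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire)) {
        /* busy-wait; a real kernel would put a CPU "pause" hint here */
    }
}

static inline bool spin_trylock(spinlock_t *l)
{
    /* Succeeds only if the flag was clear before this call. */
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

static inline void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* A critical section that a single-core kernel might bracket with cli/sti. */
static spinlock_t counter_lock = SPINLOCK_INIT;
static long shared_counter;

static long counter_increment(void)
{
    spin_lock(&counter_lock);
    long v = ++shared_counter;
    spin_unlock(&counter_lock);
    return v;
}
```

On a single core the lock is never contended, so code written this way behaves the same as before and is ready for the day a second core starts running it.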
And I think the main motivation for making your OS able to run on multiple cores is that it is a really cool feature to have code run truly in parallel.
I would also like to one day create some kind of monitor / hard real-time extension that would run on a dedicated core. It could be used if I ever get those fast AD converters that can run at sample frequencies of up to several hundred MHz, and so would need to be read regularly in real time.
- Schol-R-LEA
- Member
- Posts: 1925
- Joined: Fri Oct 27, 2006 9:42 am
- Location: Athens, GA, USA
Re: Why SMP?
rdos wrote: The only really significant difference between an SMP core and a single-core OS is that in an SMP core you must use spinlocks instead of cli/sti for synchronization. So, if you put spinlocks in your code from the start instead of sti/cli, then the move to SMP will be much easier.
Fair point. There are also some 'lockless' synchronization models which work for multicore/multiprocessor systems, but they all have various requirements or limitations, and none of them are as straightforward as spinlocks IIUC. They would require a much more radical departure when starting from a lock-based single-core system than spinlocks would.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
-
- Member
- Posts: 368
- Joined: Sun Sep 23, 2007 4:52 am
Re: Why SMP?
Adding more cores is like adding more carriages to a train. The train doesn't go faster, in fact it could even go a bit slower, but it can carry twice the amount of cargo in nearly the same time.
Of course, it is required that the cargo can be divided into parts that don't touch each other, because it goes into separate carriages.
Re: Why SMP?
Lock-free data structures are not a general replacement for locks. You can implement certain abstract data types in a lock-free way but lock-free operations do not compose (i.e., the composition is not atomic anymore).
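To illustrate with a hedged example (the two counters and all function names are invented for this sketch): each operation below is atomic on its own, but a transfer built from two of them exposes an intermediate state that a lock around both steps would have hidden.

```c
#include <assert.h>
#include <stdatomic.h>

/* Two independent lock-free counters (hypothetical example data). */
static atomic_int account_a = 100;
static atomic_int account_b = 0;

/* Each half of the transfer is a single atomic read-modify-write... */
static void withdraw_a(int n) { atomic_fetch_sub(&account_a, n); }
static void deposit_b(int n)  { atomic_fetch_add(&account_b, n); }

/* ...but the composition is two separate atomic steps, not one. */
static void transfer_a_to_b(int n)
{
    withdraw_a(n);
    /* Another core reading both counters here sees a total that is n short. */
    deposit_b(n);
}

/* What an observer on another core would compute. */
static int observed_total(void)
{
    return atomic_load(&account_a) + atomic_load(&account_b);
}
```

Calling `observed_total()` between the two halves gives a total that never existed from the caller's point of view, which is the sense in which the composition is no longer atomic.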
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Re: Why SMP?
I would recommend the use of lock-free physical memory allocation. It's not easy, but since page faults that potentially will need physical memory allocation can happen almost everywhere (unless you use extensive protective methods), this will make that code a lot easier in multicore systems.
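One possible shape for such an allocator, offered only as a sketch and not as a description of how any existing kernel does it: a bitmap with one bit per page frame, claimed with compare-and-swap so that no lock is ever held. `FRAME_COUNT`, `frame_alloc` and `frame_free` are hypothetical names, and the tiny frame count is chosen just to keep the example small.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define FRAME_COUNT   256u   /* hypothetical number of page frames */
#define BITS_PER_WORD 64u
#define WORD_COUNT    (FRAME_COUNT / BITS_PER_WORD)
#define NO_FRAME      UINT32_MAX

/* One bit per frame: 0 = free, 1 = allocated. */
static _Atomic uint64_t frame_bitmap[WORD_COUNT];

/* Returns a frame index, or NO_FRAME if every frame is taken. */
static uint32_t frame_alloc(void)
{
    for (uint32_t w = 0; w < WORD_COUNT; w++) {
        uint64_t old = atomic_load(&frame_bitmap[w]);
        while (old != UINT64_MAX) {
            /* Find the lowest clear bit; one exists because old is not all ones. */
            uint32_t bit = 0;
            while (old & (UINT64_C(1) << bit))
                bit++;
            uint64_t desired = old | (UINT64_C(1) << bit);
            /* On failure, old is reloaded with the current value and we retry. */
            if (atomic_compare_exchange_weak(&frame_bitmap[w], &old, desired))
                return w * BITS_PER_WORD + bit;
        }
    }
    return NO_FRAME;
}

static void frame_free(uint32_t frame)
{
    assert(frame < FRAME_COUNT);
    uint64_t mask = UINT64_C(1) << (frame % BITS_PER_WORD);
    atomic_fetch_and(&frame_bitmap[frame / BITS_PER_WORD], ~mask);
}
```

Because no lock is held, a page fault handler that interrupts another allocation on the same core cannot deadlock against it; at worst one of the two retries its compare-and-swap.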
Re: Why SMP?
rdos wrote: I would recommend the use of lock-free physical memory allocation. It's not easy, but since page faults that potentially will need physical memory allocation can happen almost everywhere (unless you use extensive protective methods), this will make that code a lot easier in multicore systems.
What do you mean "everywhere"? As in kernel? That's easy, don't let the kernel use demand paged memory.
As for lock-free physical memory allocation, that one I agree with. Though I wouldn't use lockless in the general sense, but rather divide physical memory for each core and let each core do its own PMEM allocation. If one runs out, rebalance, which should be rare, and it's OK to take a tiny hit when that happens instead of a tiny hit with every allocation.
Re: Why SMP?
rdos wrote: I would recommend the use of lock-free physical memory allocation. It's not easy, but since page faults that potentially will need physical memory allocation can happen almost everywhere (unless you use extensive protective methods), this will make that code a lot easier in multicore systems.
LtG wrote: What do you mean "everywhere"? As in kernel? That's easy, don't let the kernel use demand paged memory.
That means you need to validate all user data that is passed to the kernel for presence, which results in poor performance. I rely on the page fault handler in those situations instead.
LtG wrote: As for lock-free physical memory allocation, that one I agree with. Though I wouldn't use lockless in the general sense, but rather divide physical memory for each core and let each core do its own PMEM allocation. If one runs out, rebalance, which should be rare, and it's OK to take a tiny hit when that happens instead of a tiny hit with every allocation.
My HRT data acquisition tool might allocate 100GB on a 128GB machine.
Re: Why SMP?
Interesting thread. I shall have to brush up on spinlocks.
LtG wrote: ...divide physical memory for each core and let each core do its own PMEM allocation. If one runs out, rebalance, which should be rare, and it's OK to take a tiny hit when that happens instead of a tiny hit with every allocation.
What happens when threads/processes want to share memory? Do other cores get to look into the memory owned by the core where the shared region was allocated?
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
Re: Why SMP?
rdos wrote: That means you need to validate all user data that is passed to the kernel for presence, which results in poor performance. I rely on the page fault handler in those situations instead.
Do you mean a syscall where the argument is a struct pointing to ten different places in memory, so that you would have to validate presence to ensure a page fault is not triggered in kernel land?
If so, then that's not an issue for me, as I don't do complex kernel operations. For those that need such structs to be passed to the kernel, that would be a problem.
LtG wrote: As for lock-free physical memory allocation, that one I agree with. Though I wouldn't use lockless in the general sense, but rather divide physical memory for each core and let each core do its own PMEM allocation. If one runs out, rebalance, which should be rare, and it's OK to take a tiny hit when that happens instead of a tiny hit with every allocation.
rdos wrote: My HRT data acquisition tool might allocate 100GB on a 128GB machine.
Shouldn't be a problem =)
Re: Why SMP?
LtG wrote: ...divide physical memory for each core and let each core do its own PMEM allocation. If one runs out, rebalance, which should be rare, and it's OK to take a tiny hit when that happens instead of a tiny hit with every allocation.
eekee wrote: What happens when threads/processes want to share memory? Do other cores get to look into the memory owned by the core where the shared region was allocated?
My proposal doesn't change any of that. I'm talking about ownership of a region of memory while it's owned by the kernel, that is, when it's free (or disk cache, which is pretty much the same thing).
So when PMEM is allocated, each core can simply consult its own free PMEM list; this is the happy path. Only when a core runs out of its own free PMEM list does it need to do synchronization, so it can "steal" free PMEM from other cores. It doesn't need to do full rebalancing either; it can just steal from one core.
I go a bit further, I treat each core separately, each having their own "kernel", instead of one kernel being called in all cores. The free PMEM list is essentially just a consequence of that.
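The per-core free list with stealing described above can be modelled roughly as follows. This is a single-threaded sketch of the policy only: `NCORES`, `FRAMES_PER_CORE`, `struct core_pool` and all function names are hypothetical, and the cross-core synchronization that a real kernel needs on the steal path is deliberately left out and marked in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NCORES          4   /* hypothetical core count */
#define FRAMES_PER_CORE 8   /* hypothetical initial share per core */
#define NO_FRAME        UINT32_MAX

/* Each core owns a private stack of free frame numbers. */
struct core_pool {
    uint32_t frames[NCORES * FRAMES_PER_CORE];
    size_t count;
};

static struct core_pool pools[NCORES];

static void pools_init(void)
{
    uint32_t next = 0;
    for (int c = 0; c < NCORES; c++) {
        pools[c].count = 0;
        for (int i = 0; i < FRAMES_PER_CORE; i++)
            pools[c].frames[pools[c].count++] = next++;
    }
}

/* Slow path: take half (rounded up) of the richest other core's free frames.
 * A real kernel must synchronize with the victim core here (for example
 * with a lock or an inter-processor message); this model omits that. */
static int steal_frames(int thief)
{
    int victim = -1;
    for (int c = 0; c < NCORES; c++)
        if (c != thief && (victim < 0 || pools[c].count > pools[victim].count))
            victim = c;
    if (victim < 0 || pools[victim].count == 0)
        return 0;
    size_t take = (pools[victim].count + 1) / 2;
    for (size_t i = 0; i < take; i++)
        pools[thief].frames[pools[thief].count++] =
            pools[victim].frames[--pools[victim].count];
    return 1;
}

/* Happy path: pop from the calling core's own list, no cross-core traffic. */
static uint32_t pmem_alloc(int core)
{
    if (pools[core].count == 0 && !steal_frames(core))
        return NO_FRAME;
    return pools[core].frames[--pools[core].count];
}

static void pmem_free(int core, uint32_t frame)
{
    pools[core].frames[pools[core].count++] = frame;
}
```

The cost of stealing is paid only by the core that ran dry, and only when it runs dry, which is the trade-off argued for above.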
Re: Why SMP?
LtG wrote: Do you mean a syscall where the argument is a struct pointing to ten different places in memory, so that you would have to validate presence to ensure a page fault is not triggered in kernel land? If so, then that's not an issue for me, as I don't do complex kernel operations. For those that need such structs to be passed to the kernel, that would be a problem.
No, I've forbidden both structs and enums in syscalls.
I also forbid the implementation of ioctl.
Still, if you pass a large string or data buffer to something like a filesystem driver it will need to copy it to or from some kernel buffer, and that operation could trigger demand paging. Or if you pass audio-data to the audio driver. Of course, if you want to pass it directly to a buffer ring of some device, then you will need to get the physical pages anyway, and so validation will be relatively cheap.
Re: Why SMP?
rdos wrote: The only really significant difference between an SMP core and a single-core OS is that in an SMP core you must use spinlocks instead of cli/sti for synchronization. So, if you put spinlocks in your code from the start instead of sti/cli, then the move to SMP will be much easier.
Exactly. Plus, one of the bugs that is easier to catch on a single CPU with spinlocks is when ring 3 does a syscall that accesses a buffer shared with a device driver, and all of a sudden an interrupt comes in that "exits" and does "sti" to do the heavy task of populating the shared buffer with events, and then waits until the syscall releases the lock.
On multiple CPUs this could simply manifest itself as slowdowns, which are harder and harder to catch on modern super-fast CPUs. (A single x64 core is something like 5-10 times faster than in 2001, I think.)