gerryg400 wrote: BTW2, the reason Rdos could simply search and replace the cli instructions with spinlocks in his OS was because his OS was (probably) already re-entrant and supported multiple interrupts etc. already. Thus it needed fewer changes to be SMP supporting. Is that correct Rdos ?
Yes, RDOS was designed for multitasking in the kernel from the beginning, and already had server threads supporting various hardware devices. Another design philosophy that helped was to keep ISRs to a minimum and delegate as much of the work as possible to server threads. That's why SMP mostly affected the synchronization between ISRs and the server threads. There was also a suitable synchronization interface with critical sections and signals, used throughout the kernel and its device drivers, that could be ported to SMP with no change in the visible interface. The hardest issues were actually self-modifying code (syscalls) and providing page-fault handlers for applications that could handle SMP.
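A minimal sketch of the "small ISR, server thread does the work" split described above. This is not RDOS code: the ring buffer, the names isr_push and server_drain, and RING_SIZE are all assumptions made for illustration, and RDOS itself is described as using critical sections and signals rather than this kind of lock-free queue.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 16u   /* hypothetical capacity */
static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0, "RING_SIZE must be a power of two");

typedef struct {
    uint8_t     data[RING_SIZE];
    atomic_uint head;   /* written only by the ISR */
    atomic_uint tail;   /* written only by the server thread */
} isr_ring_t;

void ring_init(isr_ring_t *r) {
    atomic_init(&r->head, 0u);
    atomic_init(&r->tail, 0u);
}

/* ISR side: store the byte and return, with no processing here.
 * Returns false if the ring is full and the byte was dropped. */
bool isr_push(isr_ring_t *r, uint8_t byte) {
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SIZE)
        return false;
    r->data[head % RING_SIZE] = byte;
    /* Publish the slot; a real ISR would now signal the server thread. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/* Server thread side: copy up to max queued bytes into out and
 * return how many were taken. The real work happens after this. */
unsigned server_drain(isr_ring_t *r, uint8_t *out, unsigned max) {
    unsigned count = 0;
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    while (count < max &&
           tail != atomic_load_explicit(&r->head, memory_order_acquire)) {
        out[count++] = r->data[tail % RING_SIZE];
        tail++;
        /* Free the slot so the ISR can reuse it. */
        atomic_store_explicit(&r->tail, tail, memory_order_release);
    }
    return count;
}
```

With one producer (the ISR) and one consumer (the server thread) the ring needs no lock, as long as only one core runs that ISR at a time. That is one reason this kind of split tends to carry over to SMP with few changes.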
Reentrant and interruptable kernel theory question
Re: Reentrant and interruptable kernel theory question
Brendan wrote: The remaining 80% is replacing code that works on multi-CPU but has "exponential suckage" (poor scalability). This means researching lockless algorithms for IPC, splitting "one lock for physical memory management" into "thousands of locks", finding ways to minimise the work done in critical sections, finding ways to minimise IPIs (for TLB shootdown, etc), figuring out how to avoid a "global tick" while keeping each CPU's time in sync, finding things that idle CPUs can do to improve performance in future (when the system is under load), CPU load balancing (and a whole pile of IO prioritisation if you don't have it already), cache management (reducing false sharing), etc.
Yes, but for a gradual move, these issues are not critical, but can be implemented as needed. What is critical (and pretty hard) is to switch the OS to multicore and actually make it work without crashing too much. Until you have passed that threshold, your OS would be unstable on SMP. Even if you delete your kernel, and start from scratch, you will have a state where your OS is either unusable or unstable. The current state of RDOS is that it is stable on single core, but slightly unstable on multicore. That is no problem since the commercial application running on RDOS is running on a single core PC. I decided to provide different locking schemes for single core and multicore, so that at boot time the scheduler selects the correct one. After all, the more complex locking of multicore should not slow down single core machines. On single core, the locking is essentially cli/sti.
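A rough sketch of what selecting the locking scheme at boot could look like, assuming a table of function pointers that is filled in once the core count is known. None of this is taken from RDOS: klock_t, select_lock_scheme and the fake_cli/fake_sti stand-ins are hypothetical, and the stand-ins only flip a flag so the example can run in user space.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the privileged cli/sti instructions.
 * A real kernel would execute the instructions (and would normally
 * save and restore the previous interrupt state, per CPU). */
static bool interrupts_enabled = true;
static void fake_cli(void) { interrupts_enabled = false; }
static void fake_sti(void) { interrupts_enabled = true; }

typedef struct {
    atomic_flag busy;   /* only used by the multicore scheme */
} klock_t;
#define KLOCK_INIT { ATOMIC_FLAG_INIT }

typedef struct {
    void (*acquire)(klock_t *);
    void (*release)(klock_t *);
} lock_ops_t;

/* Single core: masking interrupts is enough, nothing else can run. */
static void up_acquire(klock_t *l) { (void)l; fake_cli(); }
static void up_release(klock_t *l) { (void)l; fake_sti(); }

/* Multicore: mask local interrupts, then spin until other cores let go. */
static void smp_acquire(klock_t *l) {
    fake_cli();
    while (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire))
        ;   /* spin */
}
static void smp_release(klock_t *l) {
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
    fake_sti();
}

static const lock_ops_t up_ops  = { up_acquire,  up_release  };
static const lock_ops_t smp_ops = { smp_acquire, smp_release };
static const lock_ops_t *lock_ops = &up_ops;

/* Called once at boot, after the number of usable cores is known. */
void select_lock_scheme(int core_count) {
    assert(core_count >= 1);
    lock_ops = (core_count > 1) ? &smp_ops : &up_ops;
}

void klock_acquire(klock_t *l) { lock_ops->acquire(l); }
void klock_release(klock_t *l) { lock_ops->release(l); }
bool klock_interrupts_enabled(void) { return interrupts_enabled; }
```

Callers only ever see klock_acquire and klock_release, so the single core path pays for one indirect call plus cli/sti and never touches the spin flag.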
Re: Reentrant and interruptable kernel theory question
Hi,
rdos wrote: Yes, but for a gradual move, these issues are not critical, but can be implemented as needed. What is critical (and pretty hard) is to switch the OS to multicore and actually make it work without crashing too much. Until you have passed that threshold, your OS would be unstable on SMP. Even if you delete your kernel, and start from scratch, you will have a state where your OS is either unusable or unstable. The current state of RDOS is that it is stable on single core, but slightly unstable on multicore.
I've been trying to find something in an article I saw somewhere... and I've found it.
The paper is called "How to NOT write kernel driver" (and is written by one of Red Hat's kernel hackers, Arjan van de Ven). The specific piece I was thinking of is at the start of Chapter 4 (SMP):
Despite popular belief, SMP safety is not something you “weld” into your code as a hindsight. SMP safety is something you need to take into account right from the start. Making a (largish) piece of code SMP safe in hindsight leads to all kinds of lock ordering nightmares, makes you wish there were recursive locks and generally results in a suboptimal solution. While I could give numerous drivers as examples of how to not do it, the main kernel with the Big Kernel Lock (BKL) is the best example of this. The BKL was put in to make the kernel work on SMP, in hindsight, and well, it still results in nightmares with dozens and dozens of races. It is taking 5 years so far to fix all the core subsystems to have proper locking of their own.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Reentrant and interruptable kernel theory question
If you are starting to write new code, I agree that you should consider things like that from the beginning. However, the situation looks different when you already have a big pile of code and your options are using a BKL style approach or rewriting everything from scratch. I think there's a reason Linux did the BKL thing. Going through that kind of pain certainly sucks, but the cost and pain of rewriting everything from scratch would have been much higher. If you have existing code, in most cases you want an incremental approach.
Re: Reentrant and interruptable kernel theory question
Hi,
Kevin wrote: If you are starting to write new code, I agree that you should consider things like that from the beginning. However, the situation looks different when you already have a big pile of code and your options are using a BKL style approach or rewriting everything from scratch. I think there's a reason Linux did the BKL thing. Going through that kind of pain certainly sucks, but the cost and pain of rewriting everything from scratch would have been much higher. If you have existing code, in most cases you want an incremental approach.
I personally think that Linux did the whole BKL thing because they didn't have the benefit of hindsight. However, even with hindsight it may not have been the best approach for Linux, as its developers are cooperating but independent volunteers and not people that can be controlled by a benevolent dictator. Basically the chance of forks and the chance of people (volunteers/developers and end-users) moving to different projects instead (e.g. FreeBSD) may have justified the increased pain of retrofitting SMP to existing code.
None of this really applies to hobbyist developers like ourselves though - you don't need to worry about losing volunteers or market share when you have no volunteers and no market share to lose.
In addition, for hobbyist developers like ourselves, it's rarely a case of "restart to add SMP support alone". More often it'd be "restart to add SMP support, and improve lots of other things at the same time" or even just "restart to improve lots of other things at the same time (and add SMP too)".
Cheers,
Brendan
Re: Reentrant and interruptable kernel theory question
I can imagine that any type of kernel that is maintained by "ordinary programmers / users" will suffer from SMP related problems, because so many of them are used to writing sequential application code with no locks. If the people who write device drivers are unaware of how to do proper locking, and how different things interact, they will write buggy device drivers. In that respect having a kernel that is designed for SMP won't help, as every part of the system needs to be designed with SMP in mind. The kernel itself can only hand out a working scheduler and synchronization primitives; it cannot enforce the correct usage of these in specific drivers.
I'm also amazed that Linux still has a BKL, and that it cannot handle floating-point usage in the kernel.
Re: Reentrant and interruptable kernel theory question
Hi,
rdos wrote: I'm also amazed that Linux still has a BKL, and that it cannot handle floating-point usage in the kernel.
I'm not too sure when that paper was written. It says "It is taking 5 years so far to fix all the core subsystems to have proper locking of their own.", and if I remember right SMP (and the big kernel lock) was added to Linux in around 1995. That means this paper would've been written in around the year 2000, but the legalese at the end shows an IBM copyright statement from 2001. Therefore I'm going to assume this paper was written in late 2001.
The big kernel lock was finally removed in Linux version 2.6.39, in May this year. Basically, Linux was only about 5 years old when they added the big kernel lock (1991 to 1995), and it took about 15 years to get rid of the big kernel lock (1995 to 2011). Of course they did work on a few other things during the last 15 years too.
Cheers,
Brendan
Re: Reentrant and interruptable kernel theory question
Brendan wrote: None of this really applies to hobbyist developers like ourselves though - you don't need to worry about losing volunteers or market share when you have no volunteers and no market share to lose.
On the contrary: Something like Linux would be hurt if all volunteers left, but it would survive because there are companies interested in it.
Our hobby OSes only exist because there is at least one volunteer. There are a couple of projects that are done by a team rather than a single person, but these teams are usually relatively small. And this means that a small hobby project cannot afford to lose any volunteer.
Brendan wrote: In addition, for hobbyist developers like ourselves, it's rarely a case of "restart to add SMP support alone". More often it'd be "restart to add SMP support, and improve lots of other things at the same time" or even just "restart to improve lots of other things at the same time (and add SMP too)".
Yeah, been there, done that (not about SMP, but the lots of other things part). I wouldn't do it again.
While you're rewriting stuff from scratch, development comes to a halt. You're only working on getting things running that already worked on the old version. This, plus the fact that it's a really big rewrite with no end in sight, makes developers lose their motivation. You cannot present anything new, so you lose testers. The longer it takes until you're back to "normal" development, the lower motivation gets and the slower progress becomes. Which makes it take even longer.
There hasn't been a tyndur release for almost two years now, guess why.
Re: Reentrant and interruptable kernel theory question
Well, basically the situation for me is that I suddenly (and unexpectedly) have a fair amount of time on my hands, so I'm dusting off toy OS code I wrote about 10 years ago to play with. I decided that it was truly terrible, so I started again from scratch and decided that the right way to do it would be to design for MP from the ground up. I'm still learning about ACPI and the like (have a LOT of reading still to do), so I'm still not sure whether an ISR can be running on two cores simultaneously and that sort of thing.
To be honest, I'm still monkeying around with the memory manager until I find a scheme I'm happy with. Right now I'm using a bitmap to track memory allocations and parsing the paging tables to find the physical address for a page, but this doesn't include reference counting and the like, so I need to rethink that some more.
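For illustration, one possible way to bolt reference counting onto a bitmap allocator is a small per-frame counter array next to the bitmap. This is only a sketch under assumptions: NUM_FRAMES, frame_alloc, frame_ref and frame_unref are made-up names, the frame count is tiny, and there is no locking at all, so on MP every call would need to run under a lock (or be reworked with atomics).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_FRAMES 64u              /* hypothetical: 64 physical frames */
#define NO_FRAME   ((size_t)-1)     /* returned when memory is exhausted */

static uint8_t  frame_bitmap[NUM_FRAMES / 8];  /* 1 bit per frame, 1 = in use */
static uint16_t frame_refs[NUM_FRAMES];        /* mappings sharing each frame */

static int  bit_test(size_t f)  { return (frame_bitmap[f / 8] >> (f % 8)) & 1; }
static void bit_set(size_t f)   { frame_bitmap[f / 8] |= (uint8_t)(1u << (f % 8)); }
static void bit_clear(size_t f) { frame_bitmap[f / 8] &= (uint8_t)~(1u << (f % 8)); }

/* Allocate the lowest free frame with one reference, or NO_FRAME. */
size_t frame_alloc(void) {
    for (size_t f = 0; f < NUM_FRAMES; f++) {
        if (!bit_test(f)) {
            bit_set(f);
            frame_refs[f] = 1;
            return f;
        }
    }
    return NO_FRAME;
}

/* Another mapping now shares this frame (e.g. copy-on-write). */
void frame_ref(size_t f) {
    assert(f < NUM_FRAMES && bit_test(f));
    frame_refs[f]++;
}

/* Drop one reference; the frame is freed when the last one goes.
 * Returns the number of references that remain. */
unsigned frame_unref(size_t f) {
    assert(f < NUM_FRAMES && bit_test(f) && frame_refs[f] > 0);
    if (--frame_refs[f] == 0)
        bit_clear(f);
    return frame_refs[f];
}
```

The bitmap stays the fast "is it free" structure, while the counter array answers "can I free it yet" when a page is unmapped.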
Any and all advice is greatly appreciated, as this is very much a 100% new topic for me.
Thanks!
Mike