Rewrite from Scratch
Re: Rewrite from Scratch
It is quite interesting that I have forgotten how much work I have actually done. I have felt that "Oh, there is not much code. Just do this, this, and this. Then you have to take care of that and that. Oh, and don't forget to do that because it is required for this." To put it shortly: I sometimes underestimate the amount of work that is needed to get even this far. Nevertheless, I have not got very far at all. This is just the beginning and not even worth mentioning because everyone here has done it already.
Re: Rewrite from Scratch
Hi,
This is the "in place, mark and sweep" approach. For example, for the "mark" phase you could put a big "NOT CHECKED YET" comment at the top of every file of source code and add an (empty) "UNCHECKED.TXT" file in every directory. Then, for the "sweep" phase you'd go through each file updating it and remove the "NOT CHECKED YET" comment when it's checked (and delete the "UNCHECKED.TXT" file in the directory when all files in that directory have been checked). When all of the "UNCHECKED.TXT" files have been removed you're finished.Antti wrote:It looks like that I will not do this rewriting from scratch. However, I am probably going to do a very simple thing that has almost the same benefit you mentioned:Brendan wrote:The main benefit of starting again is that it's easy to keep track of what has been updated and what hasn't; without having "can't update <foo> because of <bar>" problems that you'd have with an "in place, mark and sweep" approach.
Of course it is not needed to have a new folder at all. I could have some tag for files that are checked. The details are not very important here. The main point is to systematically check all the source code files.Code: Select all
1: Create an empty folder 2: Take a source code file from the current code base 3: jnotfound 7f 4: Check the source file very carefully 5: Copy it to the new folder 6: jmp 2b 7: ret
Now imagine that your OS design has changed. Assume that the existing design has a big kernel lock and all the locking needs to be changed, nothing was designed for scalability and all the data structures need to be changed, and there are no NUMA optimisations or page colouring and all the physical memory management needs to change; and you want to change all that at the same time to avoid updating everything multiple times. You mark all the files (as above), then you try to start with the physical memory manager and realise you can't, because changing the interface would break the virtual memory manager and it'd require changes to locking and the scheduler. Then you decide you'll start with the virtual memory manager and realise it relies too much on the physical memory manager and scheduler. Then you decide you'll start with the scheduler and realise that too many different things depend on it. You go around and around in circles trying to find a place where you can actually start doing something, but you don't get much more than small/trivial changes done, because everything relies on everything else and the OS would be in a broken state where you can't test any of your changes. These are the "can't update <foo> because of <bar>" problems where it's a lot easier to start from scratch, where there are no dependencies between old and new code getting in the way.
Now (from my perspective) everything you've said so far points towards "mark and sweep", with 2 exceptions. The exceptions are:
Antti wrote: I could handle race conditions (the biggest problem) everywhere from the very beginning.

Antti wrote: The global lock is used and it is very inefficient because it means that the kernel code cannot be interrupted.

Let's look at something that seems very simple. Assume you've got a global variable to keep track of the current number of free physical pages, and some code like "temp = totalFreePages; if(temp > 10) { alloc_a_page_for_disk_cache(); }". With a big kernel lock this is fine. Imagine if you redesign the physical memory manager for scalability, so that several CPUs can allocate/free pages at the same time without lock contention. Now (possibly without realising it) all the code that relied on "totalFreePages" is potentially broken - it might read the current number of free pages, then other CPUs might allocate 10 pages before it can do anything with the value it read. Basically, the "totalFreePages" variable becomes a hint rather than something accurate that can be relied on.
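To make the failure mode concrete, here is a minimal C rendering of the pseudocode above; the names follow the post's example and are purely illustrative:

Code: Select all

/* Illustrative only - "totalFreePages" and "alloc_a_page_for_disk_cache"
 * come from the example in the post, not from any real kernel. */
extern void alloc_a_page_for_disk_cache(void);

unsigned long totalFreePages;            /* maintained by the physical memory manager */

void maybe_grow_disk_cache(void)         /* hypothetical caller */
{
    unsigned long temp = totalFreePages; /* snapshot of the counter */

    /* Under a big kernel lock nothing changes between the read above and
     * the allocation below. Without it, other CPUs may allocate pages in
     * between, so "temp" is only a hint by the time it is used. */
    if (temp > 10)
        alloc_a_page_for_disk_cache();
}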
Now think about how that "totalFreePages" variable actually works. For multiple CPUs, every time any CPU allocates or frees a page you have to atomically update the variable - e.g. something like "lock add [freePages],something". Every time a CPU modifies this variable it has to tell all other CPUs to invalidate the old/stale cache line in their caches, and all the other CPUs get a cache miss when they try to access the variable again (and have to fetch the new/modified version of the cache line). This is a cache thrashing disaster.
How do you fix the cache thrashing disaster?
Let's assume that to make the physical memory manager scale properly (so that several CPUs can allocate/free pages at the same time without lock contention) you implemented a set of 1024 free page stacks where each free page stack has its own lock. To fix the "cache thrashing disaster" caused by tracking the current number of free pages, you could have a different "number of free pages for this free page stack" variable for each free page stack instead of one global variable, and make sure that the "freePages" variable is in the same cache line as the free page stack's lock (so that when a CPU acquires the lock it fetches the cache line and there won't be additional cache miss/thrashing caused by updating the "freePages" variable). Now you've broken all the code that used the old "totalFreePages" global variable (if it wasn't already broken when you changed it from something reliable into a mere hint). To fix that you might implement a "getTotalFreePages()" function that calculates the sum of all the "free pages for this stack" variables. You read the first free page stack's "free pages" value, but that value may change while you're reading the second free page stack's "free pages" value, and so on (until you've calculated a sum of potentially stale values).
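A minimal C sketch of that arrangement, assuming invented type and helper names (only the per-stack "freePages" counter, the 1024-stack count and getTotalFreePages() come from the post):

Code: Select all

#include <stdint.h>

#define NUM_FREE_STACKS  1024
#define CACHE_LINE_SIZE  64

/* Each free page stack keeps its lock and its own page count in one cache
 * line, so updating the count adds no extra cache miss once the lock's
 * line has been fetched. */
struct free_page_stack {
    uint32_t  lock;        /* per-stack spinlock (0 = free)          */
    uint32_t  freePages;   /* pages currently on this stack          */
    uintptr_t top;         /* physical address of the top free page  */
} __attribute__((aligned(CACHE_LINE_SIZE)));

static struct free_page_stack free_stacks[NUM_FREE_STACKS];

/* Sums the per-stack counters. Any counter may change while the others
 * are being read, so the result is only a hint, as described above. */
uint64_t getTotalFreePages(void)
{
    uint64_t total = 0;
    for (int i = 0; i < NUM_FREE_STACKS; i++)
        total += __atomic_load_n(&free_stacks[i].freePages, __ATOMIC_RELAXED);
    return total;
}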
If something as simple as a "totalFreePages" variable can be such a mess, can you imagine what it'll be like to update something complex like a scheduler?
Back to your original problem. If you want your OS to be good when there's many CPUs (e.g. rather than only having decent performance for a low number of CPUs), then you need to develop scalable algorithms. If you don't have experience trying to make things scale, then rewriting from scratch would be a bad idea - you'd gain experience, knowledge and confidence writing the new version, and maybe in 6 months time you'd be wondering if you should use the experience and knowledge you gained to make a better "version n+1" from scratch. The other thing is that once you start seriously looking at "many CPUs" there's a natural tendency to start considering NUMA optimisations, and NUMA optimisations can cause radical changes in memory management and scheduling (e.g. "per NUMA domain" data structures, etc to ensure things you need to access are "close" to the CPU/s you access them from). The alternative is to use your existing OS as a way to gain experience designing/writing scalable algorithms, and start again from scratch after you've gained the experience/knowledge (to avoid a second rewrite).
I don't know what your OS design is currently like, or what you're hoping it will become in future. However, I would seriously consider "rewrite from scratch, but not yet".
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Rewrite from Scratch
Antti, before you consider giving up your BKL and moving towards fine-grained-kernel-locking and a pre-emptable, re-entrant, concurrent SMP kernel you should consider how much more difficult and complex this type of kernel is than a simple BKL hobby kernel and whether you are really interested in the headache and heartache.
I'd be willing to suggest perhaps 2x in size and more than 10x in complexity. Does anyone else have an estimate based on their work ?
If a trainstation is where trains stop, what is a workstation ?
Re: Rewrite from Scratch
Brendan wrote: Assume that the existing design has a big kernel lock and all the locking needs to be changed, nothing was designed for scalability and all the data structures need to be changed, and there's no NUMA optimisations or page colouring and all the physical memory management needs to change; and you want to change all that at the same time to avoid updating everything multiple times.

You don't. You want to go in small incremental steps, so that when (not if) you introduce bugs in the new implementation you can easily track them down as soon as you notice them and have a reproducer. Of course, this only works for people using version control, but anyone who has a non-trivial project not using version control doesn't deserve otherwise.
I have yet to see a problem in practice where you can't go in small incremental steps.
Brendan wrote: With a big kernel lock this is fine. Imagine if you redesign the physical memory manager for scalability, so that several CPUs can allocate/free pages at the same time without lock contention. Now (possibly without realising it) all the code that relied on "totalFreePages" is potentially broken - it might read the current number of free pages, then other CPUs might allocate 10 pages before it can do anything with the value it read. Basically, the "totalFreePages" variable becomes a hint rather than something accurate that can be relied on.

Now you're implying that Antti is stupid enough to remove his BKL without having a proper replacement, which I find rather offensive. Of course, even if you know that you need some synchronisation, everyone makes mistakes occasionally - this is why you go in small incremental steps, so that narrowing down problems becomes easy.
When you remove a BKL, you don't just delete all of the lines that lock/unlock the BKL, but you rewrite your code incrementally to use other means of synchronisation while pushing the lock/unlock of the BKL further down in the call chain. Incrementally, in small steps, as always.
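As a rough before/after sketch of the push-down Kevin describes (every function and lock name here is hypothetical):

Code: Select all

/* Hypothetical names throughout - this only illustrates the shape of the change. */
extern void bkl_acquire(void);
extern void bkl_release(void);
extern void update_accounting(void);
extern void touch_shared_lists(void);   /* after the change, takes the BKL internally */

typedef int spinlock_t;
extern void spinlock_acquire(spinlock_t *l);
extern void spinlock_release(spinlock_t *l);
static spinlock_t accounting_lock;

/* Before: the whole path runs under the big kernel lock. */
void sys_do_something_old(void)
{
    bkl_acquire();
    update_accounting();      /* only touches accounting data          */
    touch_shared_lists();     /* genuinely needs kernel-wide exclusion */
    bkl_release();
}

/* One incremental step later: the BKL has moved down into the one helper
 * that still relies on it; the accounting work now has its own lock. */
void sys_do_something_new(void)
{
    spinlock_acquire(&accounting_lock);
    update_accounting();
    spinlock_release(&accounting_lock);

    touch_shared_lists();
}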
Re: Rewrite from Scratch
gerryg400 wrote: Antti, before you consider giving up your BKL and moving towards fine-grained-kernel-locking and a pre-emptable, re-entrant, concurrent SMP kernel you should consider how much more difficult and complex this type of kernel is than a simple BKL hobby kernel and whether you are really interested in the headache and heartache.

I'd be willing to suggest perhaps 2x in size and more than 10x in complexity. Does anyone else have an estimate based on their work?

I don't think there is a lot more complexity in making a pre-emptable SMP kernel. The issue is that you must understand the concepts and use them throughout your code. You shouldn't hand-code synchronization primitives everywhere you use them; rather, implement some generic types like spinlocks, mutexes and possibly some event-related type as well. Then the only difference really is the extra calls to these synchronization types, which would typically be a single line of code.
OTOH, the scheduler itself is a lot more complex, but not necessarily other code.
I wouldn't bother with optimizing fine-grained locking to start with. This could be changed later. Just give each resource its own lock, rather than using a global lock.
Re: Rewrite from Scratch
Kevin wrote: You don't. You want to go in small incremental steps, so that when (not if) you introduce bugs in the new implementation you can easily track them down as soon as you notice them and have a reproducer. Of course, this only works for people using version control, but anyone who has a non-trivial project not using version control doesn't deserve otherwise. I have yet to see a problem in practice where you can't go in small incremental steps.

I agree. The biggest issue with major rewrites is localizing bugs where they occur. In order to do that, you need both version control and incremental changes. Otherwise you end up with the whole kernel as the possible origin of each and every bug.
Re: Rewrite from Scratch
Brendan wrote: Now imagine that your OS design has changed. Assume that the existing design has a big kernel lock and all the locking needs to be changed, nothing was designed for scalability and all the data structures need to be changed, and there's no NUMA optimisations or page colouring and all the physical memory management needs to change; and you want to change all that at the same time to avoid updating everything multiple times.

No, you don't. You want to change these things one at a time, to avoid introducing bugs that cannot be tied to specific changes. You also want to use version control rigorously, as well as thorough testing of each step, to discover bugs when they occur.
Brendan wrote: You mark all the files (as above), then you try to start with the physical memory manager and realise you can't, because changing the interface would break the virtual memory manager and it'd require changes to locking and the scheduler. Then you decide you'll start with the virtual memory manager and realise it relies too much on the physical memory manager and scheduler. Then you decide you'll start with the scheduler and realise that too many different things depend on it. You go around and around in circles trying to find a place where you can actually start doing something, but you don't get much more than small/trivial changes done, because everything relies on everything else and the OS would be in a broken state where you can't test any of your changes. These are the "can't update <foo> because of <bar>" problems where it's a lot easier to start from scratch, where there are no dependencies between old and new code getting in the way.

In my experience, there is always a way to incrementally go between arbitrary states. Sometimes it is not obvious how to achieve it, and you might need to break the procedure down into multiple intermediate steps, but it is always possible to do. And if it is truly impossible, then I might also recommend the full rewrite, but then the design is so bad that a rewrite is necessary because of improper dependencies. At least, I've not encountered this in my OS, and I've rewritten the scheduler twice (at least).
Brendan wrote: Let's look at something that seems very simple. Assume you've got a global variable to keep track of the current number of free physical pages, and some code like "temp = totalFreePages; if(temp > 10) { alloc_a_page_for_disk_cache(); }". With a big kernel lock this is fine. Imagine if you redesign the physical memory manager for scalability, so that several CPUs can allocate/free pages at the same time without lock contention. Now (possibly without realising it) all the code that relied on "totalFreePages" is potentially broken - it might read the current number of free pages, then other CPUs might allocate 10 pages before it can do anything with the value it read. Basically, the "totalFreePages" variable becomes a hint rather than something accurate that can be relied on.

This should be a function, not a variable. If it is a variable, you rename the variable, insert a function, and recompile. Then you fix all the compilation errors by inserting calls to the function.
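A small sketch of that refactoring with invented names: renaming the variable turns every old use into a compile error, and each error site is then fixed to call the new function instead:

Code: Select all

/* Illustrative names only. */
static unsigned long totalFreePages_internal;   /* was: totalFreePages */

unsigned long getTotalFreePages(void)
{
    /* For now this just returns the old counter; later it can sum
     * per-stack counters without any caller having to change again. */
    return totalFreePages_internal;
}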
Re: Rewrite from Scratch
Hi,
Kevin wrote:
Brendan wrote: Assume that the existing design has a big kernel lock and all the locking needs to be changed, nothing was designed for scalability and all the data structures need to be changed, and there's no NUMA optimisations or page colouring and all the physical memory management needs to change; and you want to change all that at the same time to avoid updating everything multiple times.
You don't. You want to go in small incremental steps, so that when (not if) you introduce bugs in the new implementation you can easily track them down as soon as you notice them and have a reproducer. Of course, this only works for people using version control, but anyone who has a non-trivial project not using version control doesn't deserve otherwise. I have yet to see a problem in practice where you can't go in small incremental steps.

Whether it's possible to go in small incremental steps (like a baby afraid of the dark) or not is irrelevant. Would you want to rewrite 90% of the code once, or would you want to rewrite 25% of the code 10 times? Are you saying that, if/when a large amount of code needs many changes, you simply don't have the courage needed to avoid a lot of wasted effort?
Note: The paragraph above does not imply that I think Antti does or doesn't need to make a large amount of changes (I honestly don't know), and does not imply that I think Antti is too scared of making bugs to be a competent developer (I honestly doubt this). The paragraph above has nothing to do with Antti at all, and is simply a response to Kevin's comment. I am also not suggesting that Kevin wets the bed occasionally because he has nightmares about trivial compile time errors. That would be silly.
Kevin wrote:
Brendan wrote: With a big kernel lock this is fine. Imagine if you redesign the physical memory manager for scalability, so that several CPUs can allocate/free pages at the same time without lock contention. Now (possibly without realising it) all the code that relied on "totalFreePages" is potentially broken - it might read the current number of free pages, then other CPUs might allocate 10 pages before it can do anything with the value it read. Basically, the "totalFreePages" variable becomes a hint rather than something accurate that can be relied on.
Now you're implying that Antti is stupid enough to remove his BKL without having a proper replacement, which I find rather offensive. Of course, even if you know that you need some synchronisation, everyone makes mistakes occasionally - this is why you go in small incremental steps so that narrowing down problems becomes easy.

I'm sorry that you've misinterpreted my words in this way. Unfortunately, even after reading my words several times, I can't see where your failure originates.
My "totalFreePages" example only highlights some of the problems involved in making code that wasn't originally designed for scalability suitable for a "many CPU" scenario. It doesn't matter if the (fictitious/hypothetical) code was designed for single-CPU (and never had any lock/s of any kind). It doesn't matter if the (fictitious/hypothetical) code used a big kernel lock, or if the big kernel lock is removed or not. The same problems would exist regardless.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Rewrite from Scratch
Brendan wrote: Whether it's possible to go in small incremental steps (like a baby afraid of the dark) or not is irrelevant. Would you want to rewrite 90% of the code once, or would you want to rewrite 25% of the code 10 times? Are you saying that, if/when a large amount of code needs many changes, you simply don't have the courage needed to avoid a lot of wasted effort?

I don't think that is the issue. Most of the time in a major rewrite is not spent writing the code, but finding the bugs. So even if the full rewrite might be faster (coding-wise), finding each bug in the full rewrite scenario might take 10 times as long because you have no history of what has changed. But, then, if you never make mistakes, the full rewrite will always be the best option.
Re: Rewrite from Scratch
Correct. I think I'd go for touching 25% of the code ten times, precisely in order to avoid wasted effort. This may seem contradictory at first sight, but the ability to check any intermediate state between the original and the final rewritten version is so valuable for (amongst other things) debugging and keeping the whole thing stable that I would likely accept a factor of three (writing 10 × 25% = 250% vs. 90% of the code) for the implementation effort, because I know it will pay off later. And I'm not sure that I even agree with this factor, because I believe that each time you touch the 25% you make fewer changes in that part of the code than when you rewrite the 90%.
Also at least when you're working in a team another thing to consider is that a rewrite brings almost any other efforts to a halt while the rewrite hasn't been completed yet. It can be relevant for a one-man project as well: When you don't feel motivated to continue with the design change on one day, you could still work on another part of your OS - except if it is totally broken because you're somewhere in the middle of a rewrite.
Re: Rewrite from Scratch
I am clearly not ready for this "next step" yet. As a matter of fact, I am now wondering whether my current big-kernel-lock-based implementation is robust enough. All the things that are somehow related to the caches need to be studied in more depth. Is the following pseudocode broken:
Code: Select all
/* Critical section (enter and leave) */
/* This data page has 'PCD' and 'PWT' bits set */
.data
.balign 8, 0
lockvariable: .8byte 0
.text
.globl critical_section_enter
.globl critical_section_leave
.balign 16, 0
critical_section_enter:
lock btsq $0, lockvariable(%rip) /* Bit Test and Set (atomically) */
jc critical_section_enter /* Locked already? If yes, try again. */
retq
.balign 16, 0
critical_section_leave:
lock btrq $0, lockvariable(%rip) /* Reset (atomically) */
retq
Yes, I know it has a busy-wait loop. Like I said, this should be considered pseudocode. But the question is: do I have to adjust MTRRs (I have not touched them yet)? I am not quite sure whether those "PCD and PWT bits" are sufficient here. Is there anything else that I have forgotten to take into account?

When it comes to the actual topic, it seems that my "rewrite from scratch" means: just make the source code files look nicer.
Re: Rewrite from Scratch
There surely is a huge gap between your knowledge level and mine. However, I am climbing up every day. I started programming (in general) in early 2010 and I started my OS project ten months ago. I already have a simple basic system that is not based on any "tutorial operating systems" in any way. Here is a brief list of "keywords" (just to give some idea):
x86-64 PCs supported
- I/O APIC is required
- "PS/2 Keyboard" is required
Custom-made bootloaders
- Legacy BIOS (USB or CD)
- UEFI (USB or CD)
- External tools are not used for making the bootable images
Graphics
- 800x600 or higher (32-bit pixels)
- Linear Framebuffer
- VESA BIOS Extensions (VBE)
- UEFI Graphics Output Protocol
Console
- Fonts
- Printing
- Scrolling
Kernel
- Monolithic
- Higher Half
- 2 MB page size used (!)
- Ring 3 (for User Space)
Task handling
- Statically allocated memory for "task structures"
- This memory has to be physically contiguous
- CR3 always points to the beginning of current task structure
- CR3 value also identifies the task
- Like a vector
- Simple (!)
Scheduler
- Round Robin
- APIC timer
"File system"
- Image is loaded by the bootloader
- Lives fully in RAM
SMP "support"
- Start all the other CPUs (cores)
- Trampoline for entering Long Mode and jumping to the high kernel space
- Set the CPU to the known state
- Give some sign
- Halt loop
This whole system compiles easily with a "standard gcc + nasm." I used nasm for the legacy BIOS bootloader, but since then I have only used GNU as. The full build outputs two files: release\System.img and release\System.iso. I have made a couple of tools, written in C, that are used when building the system. They belong to the same code base and are automatically compiled and run during the build.
Re: Rewrite from Scratch
Hi,
rdos wrote:
Brendan wrote: Whether it's possible to go in small incremental steps (like a baby afraid of the dark) or not is irrelevant. Would you want to rewrite 90% of the code once, or would you want to rewrite 25% of the code 10 times? Are you saying that, if/when a large amount of code needs many changes, you simply don't have the courage needed to avoid a lot of wasted effort?
I don't think that is the issue. Most of the time in a major rewrite is not spent writing the code, but finding the bugs. So even if the full rewrite might be faster (coding-wise), finding each bug in the full rewrite scenario might take 10 times as long because you have no history of what has changed. But, then, if you never make mistakes, the full rewrite will always be the best option.

Wrong. The number of bugs is approximately proportional to the number of lines of code you write. If you write twice as much code because you're doing incremental fixes, then you end up writing and fixing twice as many bugs; even if half the code (and half the bugs you fixed) ends up being replaced.
Kevin wrote: When you don't feel motivated to continue with the design change on one day, you could still work on another part of your OS - except if it is totally broken because you're somewhere in the middle of a rewrite.

You're right - another advantage of "rewrite from scratch" is that it prevents you from getting distracted by "not relevant yet" parts of your OS and forces you to concentrate on fixing the parts that everything else depends on.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Rewrite from Scratch
Hi,
There is also a question of "fairness". Imagine 100 CPUs all fighting for this same lock - some CPUs may be very lucky and repeatedly acquire the lock, and some CPUs may be very unlucky and never acquire the lock. There are different ways of doing it that ensure "fairness" (which means that CPUs acquire the lock in the order they arrive, and no CPU is "unlucky" and waits for an excessive amount of time to acquire the lock).
Now; most of the above (everything except releasing the lock) only helps when there's lock contention, and can make to "no lock contention" case worse. This means that it can be a good idea to have different spinlock code for different cases - one version where lock contention is likely, and one version where lock contention is very unlikely.
Your first priority should be avoiding lock contention, so that lock contention is very unlikely. This may be difficult though, as often it depends on how many CPUs there are - e.g. when there's only 4 CPUs the chance of lock contention (for a specific lock) may be very unlikely, and when there's 200 CPUs the chance of lock contention (for the same lock) may be very likely.
On top of all of this; you'd want to think about your scheduler. If you acquire a spinlock and then the scheduler decides to do a task switch, then you can have many CPUs all spinning, waiting for you to release that lock, and it may take a long time for that to happen because you're running other tasks. To prevent that you want to be able to postpone task switches when your holding a spin lock (but you may not want to disable IRQs and mess up IRQ latency if you can avoid it).
In addition, if an IRQ handler acquires a lock when the code it has interrupted may have already acquired it, then you end up with deadlocks. To prevent that you might want a special lock that disables IRQs. Of course you might not want to disable *all* IRQs - for example, you might want to disable/postpone most IRQs but still be able to handle some types of IPIs (e.g. "TLB shootdown" IPIs) without waiting for the lock to be released and IRQs to be enabled.
This may mean that you end up with another 2 different types of spinlocks - one that postpones task switches and postpones (most? all?) IRQs which is used for anything that might be used by an IRQ handler; plus another type of spinlock that postpones task switches without disabling IRQs that is used for anything that is never needed by an IRQ handler.
This might mean 4 combinations - with/without IRQ postponing, and likely/unlikely lock contention.
For an example, imagine if you've got a single-CPU monolithic kernel that uses segmentation, doesn't use paging and has a round-robin scheduler; and you want to write a "many CPU" micro-kernel that uses paging, doesn't use segmentation and has a cooperative scheduler. It would be foolish to think small incremental steps is a good idea in this case - the design has changed too much.
Of course if the design isn't being changed very much then rewriting from scratch would be a bad idea. The question then becomes, how much are you planning to change the design, and are you planning to change the design enough to justify "rewrite from scratch"? This is why I suggested writing down what you want the new design to be and comparing your new design to your existing design.
For example, if your kernel currently has a big kernel lock and you want everything (all the data structures, all the locking, and all the code that uses data structures or locks) to be redesigned to achieve the best performance/scalability, then almost everything will change and starting from scratch is a good idea. However, if your kernel currently has a big kernel lock and you just want to remove that big kernel lock without redesigning much at all, then starting from scratch would be a bad idea.
Cheers,
Brendan
Antti wrote: Is the following pseudocode broken:

Code: Select all

/* Critical section (enter and leave) */
/* This data page has 'PCD' and 'PWT' bits set */
.data
.balign 8, 0
lockvariable: .8byte 0
.text
.globl critical_section_enter
.globl critical_section_leave
.balign 16, 0
critical_section_enter:
lock btsq $0, lockvariable(%rip) /* Bit Test and Set (atomically) */
jc critical_section_enter /* Locked already? If yes, try again. */
retq
.balign 16, 0
critical_section_leave:
lock btrq $0, lockvariable(%rip) /* Reset (atomically) */
retq

It's not broken (it will work as intended). However, it's not very efficient either. The "critical_section_enter" code should use a "test, then (atomically test and set)" approach where the original test doesn't need/use a lock prefix. It should probably also use the "pause" instruction (for hyper-threading, so that the tight loop doesn't hurt the performance of the other logical CPUs in the core as badly). For the "critical_section_leave" code you should be able to "mov $0, lockvariable(%rip)" without any lock prefix, as all 80x86 CPUs guarantee that aligned writes like this are atomic (and you should make sure the lock variable is aligned for performance reasons anyway).
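As a rough illustration of those suggestions (plain test before the atomic test-and-set, "pause" in the spin loop, plain store to release), here is a minimal C sketch using GCC's atomic builtins; it is not the code from the post, just one way it could look:

Code: Select all

static unsigned char lockvariable;   /* 0 = free, 1 = held */

void critical_section_enter(void)
{
    for (;;) {
        /* Spin on a plain read first: no locked bus operation, so waiting
         * CPUs don't keep stealing the cache line from each other. */
        while (__atomic_load_n(&lockvariable, __ATOMIC_RELAXED) != 0)
            __builtin_ia32_pause();          /* be kind to the sibling hyper-thread */

        /* Lock looks free - now do the atomic test-and-set. */
        if (!__atomic_test_and_set(&lockvariable, __ATOMIC_ACQUIRE))
            return;                          /* acquired */
    }
}

void critical_section_leave(void)
{
    /* An aligned plain store is atomic on x86, so no lock prefix is needed. */
    __atomic_store_n(&lockvariable, 0, __ATOMIC_RELEASE);
}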
There is also a question of "fairness". Imagine 100 CPUs all fighting for this same lock - some CPUs may be very lucky and repeatedly acquire the lock, and some CPUs may be very unlucky and never acquire the lock. There are different ways of doing it that ensure "fairness" (which means that CPUs acquire the lock in the order they arrive, and no CPU is "unlucky" and waits for an excessive amount of time to acquire the lock).
Now, most of the above (everything except releasing the lock) only helps when there's lock contention, and can make the "no lock contention" case worse. This means that it can be a good idea to have different spinlock code for different cases - one version where lock contention is likely, and one version where lock contention is very unlikely.
Your first priority should be avoiding lock contention, so that lock contention is very unlikely. This may be difficult though, as often it depends on how many CPUs there are - e.g. when there's only 4 CPUs the chance of lock contention (for a specific lock) may be very unlikely, and when there's 200 CPUs the chance of lock contention (for the same lock) may be very likely.
On top of all of this, you'd want to think about your scheduler. If you acquire a spinlock and then the scheduler decides to do a task switch, then you can have many CPUs all spinning, waiting for you to release that lock, and it may take a long time for that to happen because you're running other tasks. To prevent that you want to be able to postpone task switches while you're holding a spinlock (but you may not want to disable IRQs and mess up IRQ latency if you can avoid it).
In addition, if an IRQ handler acquires a lock when the code it has interrupted may have already acquired it, then you end up with deadlocks. To prevent that you might want a special lock that disables IRQs. Of course you might not want to disable *all* IRQs - for example, you might want to disable/postpone most IRQs but still be able to handle some types of IPIs (e.g. "TLB shootdown" IPIs) without waiting for the lock to be released and IRQs to be enabled.
This may mean that you end up with another 2 different types of spinlocks - one that postpones task switches and postpones (most? all?) IRQs which is used for anything that might be used by an IRQ handler; plus another type of spinlock that postpones task switches without disabling IRQs that is used for anything that is never needed by an IRQ handler.
This might mean 4 combinations - with/without IRQ postponing, and likely/unlikely lock contention.
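As one possible sketch of the "postpone task switches while holding a spinlock" idea (names invented, and assuming a per-CPU counter that the scheduler checks before switching):

Code: Select all

/* Hypothetical names; one counter per CPU, touched only by that CPU. */
extern unsigned int this_cpu_preempt_count;
extern int  reschedule_was_requested(void);
extern void do_task_switch(void);

static inline void preempt_disable(void) { this_cpu_preempt_count++; }

static inline void preempt_enable(void)
{
    if (--this_cpu_preempt_count == 0 && reschedule_was_requested())
        do_task_switch();                 /* run the switch that was postponed */
}

void lock_acquire_postpone_switches(unsigned char *lock)
{
    preempt_disable();                    /* no task switch while we hold the lock */
    while (__atomic_test_and_set(lock, __ATOMIC_ACQUIRE))
        __builtin_ia32_pause();
}

void lock_release_postpone_switches(unsigned char *lock)
{
    __atomic_clear(lock, __ATOMIC_RELEASE);
    preempt_enable();                     /* a postponed task switch may happen now */
}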
Antti wrote: But the question is: do I have to adjust MTRRs (I have not touched them yet)? I am not quite sure whether those "PCD and PWT bits" are sufficient here. Is there anything else that I have forgotten to take into account?

You shouldn't need to adjust MTRRs or set the PCD or PWT bits. It will work correctly regardless of whether the lock variable is in "write-back" memory or "uncached" memory or anything else (and you'd want the lock variable in "write-back" memory for performance reasons).
No. "Rewrite from scratch" means "most of my old code is unsuitable for my new design and therefore it's a hindrance that will slow progress, and to avoid "tip-toe through the prickle patch" I'm going to cut my losses and save myself a lot of extra work and hassle by discarding all of it (and only using parts of the old code as a reference if/when suitable)".Antti wrote:When it comes to the actual topic, it seems that my "rewrite from scratch" means: just make source code files look nicer.
For an example, imagine if you've got a single-CPU monolithic kernel that uses segmentation, doesn't use paging and has a round-robin scheduler; and you want to write a "many CPU" micro-kernel that uses paging, doesn't use segmentation and has a cooperative scheduler. It would be foolish to think small incremental steps is a good idea in this case - the design has changed too much.
Of course if the design isn't being changed very much then rewriting from scratch would be a bad idea. The question then becomes, how much are you planning to change the design, and are you planning to change the design enough to justify "rewrite from scratch"? This is why I suggested writing down what you want the new design to be and comparing your new design to your existing design.
For example, if your kernel currently has a big kernel lock and you want everything (all the data structures, all the locking, and all the code that uses data structures or locks) to be redesigned to achieve the best performance/scalability, then almost everything will change and starting from scratch is a good idea. However, if your kernel currently has a big kernel lock and you just want to remove that big kernel lock without redesigning much at all, then starting from scratch would be a bad idea.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Rewrite from Scratch
Brendan wrote:
rdos wrote: I don't think that is the issue. Most of the time in a major rewrite is not spent writing the code, but finding the bugs. So even if the full rewrite might be faster (coding-wise), finding each bug in the full rewrite scenario might take 10 times as long because you have no history of what has changed. But, then, if you never make mistakes, the full rewrite will always be the best option.
Wrong. The number of bugs is approximately proportional to the number of lines of code you write. If you write twice as much code because you're doing incremental fixes, then you end up writing and fixing twice as many bugs; even if half the code (and half the bugs you fixed) ends up being replaced.

Possible, but when the bugs in the rewrite scenario take 10 times longer to localize, you still end up with a factor of 5 longer with the rewrite.