Hi,
Ok, what I said was harsh and wasn't very constructive. I still think it was justified - when working with a project for a very long time it's probably natural to become accustomed to the project's eccentricities and start having trouble noticing how bad the overall design has become.
There are things about rdos that are admirable (managing to get it used in commercial products, the perseverance it would've taken, etc). These things have nothing to do with the OS's design though.
In an attempt to be more constructive, I've created a partial list of design flaws.
16-bit Kernel Code
For a 32-bit or 64-bit CPU, instructions that write 16-bit registers have false dependencies on the register's previous value. For example, "mov ax,1234" depends on the previous value of EAX/RAX because the upper bits must be preserved. This means that the CPU can stall waiting for the previous value of EAX/RAX, which limits the CPU's ability to do "out of order" processing and harms performance. Fixing this problem in 16-bit code requires operand size override prefixes. Consider this example:
Code:
mov ax,[foo] ;Depends on previous value of EAX/RAX
shl ax,1 ;Depends on previous instruction
mov [bar],ax ;Depends on previous instruction
mov ax,1234 ;Depends on "shl ax,1"
sub ax,bx ;Depends on previous instruction
mov [somewhere],ax ;Depends on previous instruction
This example code could be improved by doing:
Code:
movzx eax,word [foo] ;Depends on nothing
shl eax,1 ;Depends on previous instruction
mov [bar],ax ;Depends on previous instruction
mov eax,1234 ;Depends on nothing
sub ax,bx ;Depends on previous instruction
mov [somewhere],ax ;Depends on previous instruction
For a modern CPU (with register renaming), this allows the CPU to execute instructions in parallel. It may effectively become:
Code:
movzx eax_v1,word [foo], mov eax_v2,1234 ;2 instructions in parallel
shl eax_v1,1, sub ax_v2,bx ;2 instructions in parallel
mov [bar],ax_v1
mov [somewhere],ax_v2 ;Writes occur in program order
For 64-bit CPUs, AMD avoided this problem by making sure that writing the low half of a 64-bit register causes the high half of the register to be zeroed. For example, "mov eax,[foo]" or "mov eax,1234" does not depend on the previous value of RAX because the CPU zeros the highest 32 bits of RAX (the CPU effectively does "movzx rax,dword [foo]" and "movzx rax,dword 1234" automatically).
Next; addressing in 16-bit code is more limited/awkward. For example, you can't do "mov dx,[ax*2+bx]" or "mov ax,[sp+4]" (but you can do "mov edx,[eax*2+ebx]" or "mov eax,[esp+4]"). These restrictions mean that you end up with less efficient code to do the same thing. This problem can be worked around a bit by using address size override prefixes.
The decoders in modern CPUs are tuned for simple instructions. Instructions with prefixes (not just size override prefixes) are less simple and more likely to reduce decoder efficiency. Exact behaviour depends on the specific CPU. Examples include "instructions with multiple prefixes can only be decoded by the first decoder" (Pentium M), and "For pre-decode, prefixes that change the default length of an instruction have a 3 cycle (Sandy Bridge) or 6 cycle penalty (Nehalem)".
For code size (and instruction fetch); in general, for small pieces of code, 32-bit code may be slightly smaller or slightly larger than the equivalent 16-bit code, and for large pieces of code it averages out to "irrelevant". Given that the code size is irrelevant, 16-bit code just means you get worse performance without size overrides and worse performance with size overrides.
Segmentation
For modern CPUs; segmentation is a disaster. Segment register loads are very expensive due to the need to do some checks (e.g. is it beyond the GDT limit), then fetch the descriptor (including any TLB misses, etc), then do more protection checks. Using different segment registers for different pieces of data also means that you end up using lots of segment override prefixes, which increase code size and causes inefficiency in a modern CPU's "tuned for simple instructions" decoder.
For memory management; segmentation is a disaster. When segmentation and paging are both used together (which is necessary to avoid serious physical address space fragmentation problems) you end up with 2 memory managers - one to manage segments and one to manage pages; where one of them is redundant.
For programmers and tools (compilers, etc); segmentation is a disaster. It's much easier to work with one contiguous space than it is to juggle many small individual areas.
The "advantage" of segmentation is an illusion. Because it fails to catch all "unintended" accesses (and introduces the new possibility of "right offset, wrong segment" bugs), it does nothing more than create a false sense of security.
Physical Memory Management
Bitmaps are slow, make scalability problems harder to avoid, and make it harder to support things like page colouring and NUMA optimisations. Double-free detection only matters if your virtual memory management is buggy, and therefore shouldn't matter at all. As far as I can tell, the main reason you like bitmaps is that segmentation sucks and makes better approaches hard.
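For comparison, here's a rough C sketch of one of those better approaches - a free page stack, where each free page stores the index of the next free page, so allocation and freeing are O(1) instead of an O(n) bitmap scan (the page contents are simulated with an array here; names and sizes are made up):

```c
#include <stdint.h>

#define NUM_PAGES 16

/* In a real kernel the link would live in the free page itself;
 * this array just simulates that for the sketch. */
static uint32_t next_free[NUM_PAGES];
static uint32_t free_head = UINT32_MAX;   /* UINT32_MAX = empty stack */

/* O(1): push the page onto the free stack. */
static void pmm_free(uint32_t page)
{
    next_free[page] = free_head;
    free_head = page;
}

/* O(1): pop a page off the free stack (no searching). */
static uint32_t pmm_alloc(void)
{
    if (free_head == UINT32_MAX)
        return UINT32_MAX;                /* out of physical memory */
    uint32_t page = free_head;
    free_head = next_free[page];
    return page;
}
```

With one stack per NUMA domain (or per page colour) the same O(1) operations extend naturally to the optimisations mentioned above; a bitmap forces a scan either way.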
Virtual Memory Management
This was a huge failure to abstract. It's probably improved a lot since you started attempting to add support for PAE, but it's very likely still bad. The "many segments" way of accessing paging structures is awful. When you start attempting to support 64-bit it's going to fail miserably (doing half of virtual memory management in protected mode and half in long mode, and frequently switching between the modes and destroying all TLBs, etc, highlights a phenomenal abundance of "stupid").
Kernel API
Not supporting sane error handling is a severe mistake. It's likely to cause a lot of unnecessary overhead and unsolvable user-space race conditions (e.g. file IO that fails despite pre-checks because some other process did something at the wrong time, where the cause of the error has vanished by the time you look for it); such that writing robust applications becomes a nightmare.
Also; syscalls should never have used pointers of any kind.
Boot Code
The old boot code (which you seem to refuse to update in a "better than superficial" way) has outlived its usefulness. The "memory map" it provided was always bad. The entire interface between the boot code and the kernel probably needs to be redesigned to remove flaws and make it extensible (and maybe also to prepare for UEFI).
Summary
This is just a beginning. It's like a large crate of 1010 oranges - if you try 10 oranges and find that 7 are bad, you can assume that about 700 of the 1000 oranges you haven't tried are also bad. If I had all the details of RDOS and enough time to go through all of the source code, I expect I could write many pages of problems.
Of course the right time to start a rewrite would've been before SMP was added. You could've looked around to see whether other causes of "scope creep" were coming, and realised that you'd want to add ACPI and power management, PAE and 64-bit, and maybe UEFI (and maybe ARM?) in the near future; and then assessed the original design to see how many old design flaws could also be fixed. You could've come to the conclusion that a new "RDOS 2" (with limited or no backward compatibility) was justified and that very little of the "RDOS 1" code would be useful for "RDOS 2".
You think "incremental change" works, but (even though you've rewritten most of the kernel at least once already) most of the old problems are still there. Most of them will never be fixed.
Cheers,
Brendan