@Axilmar: Some notes:
- I'm curious why special purpose registers and addresses are only 32-bit.
- there's no need for a "greater than" flag if you've got "less than" and "equals" flags (if something is not less than and not equal, then it's greater).
- I'm not sure how you'd implement something like "copy on write" with your memory management (no "read only" flag in the page itself)
- to me, it looks like the segmentation is going to be an extreme performance problem - for every memory access the CPU will need to search through the Segment Table and then search through the Access Table. Even if this is cached in the CPU it'll still be slow (especially if the number of entries in the table/s is more than the number of entries in the special cache).
- I don't know why you're using 2 separate tables (Access Table and Segment Table) instead of having one table with all the information.
- you don't specify what happens if different segments overlap (e.g. one segment that says an area is "read only" and another segment that says the same area is "write only").
- Atomic Execution is going to have severe contention/scalability problems and fairness problems in many-CPU systems. For example, imagine one CPU continually doing atomic loads, which prevents any other CPU from making progress.
- the interrupt handling won't work in some cases. For example, if doing "call foo" causes a page fault (not-present page at SR) then the CPU will go into a "continuous page fault loop" (or triple fault?). In this case, if any unprivileged code can change SR (or continually push return addresses onto the return stack) then any unprivileged code would be able to cause the OS to lock up completely (or crash).
- I couldn't find any instructions for zero-extended loads (e.g. "movzx eax,byte [foo]"), even though you've got lots of instructions for sign-extended loads (e.g. LDVB, LDB, LDXB, LDRB, etc). I'm not sure if this is deliberate or not (a sign-extended load could be followed by an additional AND instruction to make it work like a zero-extended load). See the note.
- you're missing MOD instructions for signed and unsigned integers, and it's more efficient for DIV and DIVI to return both the quotient and remainder (in some cases you can do the division itself once rather than doing DIV and MOD, especially if you're trying to divide large integers - e.g. dividing a 256-bit integer by a 64-bit integer). See the note.
- there are no "ADC" or "SBB" instructions, which makes operating on 128-bit or larger integers slow (e.g. you can't do "ADD R0, R2; ADC R1, R3"). Doing it manually (e.g. with "JMPC C") is messy and causes branch misprediction problems. See the note.
- there's no way for software to identify which version of the CPU it's running on, and no way for software to determine if features you add in future are present.
- there are no details on which conditions cause which exceptions (e.g. debug exception, security exception, etc) and I'm not sure which instructions are privileged instructions (e.g. can unprivileged code use the "MOVSR" instruction?)
- there are no details for the Interrupt Table (entry format, etc).
- there are no cache management instructions (e.g. CLFLUSH), no mention of a TLB for the paging data structures, and no details on whether the CPU keeps the TLB (if any) and the special cache (used for the Access Table and Segment Table) consistent with RAM. I'm guessing the CPU shouldn't maintain consistency for the TLB and special cache, to reduce the number of (frequently unnecessary) checks that the CPU needs to do (and improve performance), but this would require instructions for manually invalidating these caches when the paging structures, the Segment Table and/or the Access Table are changed.
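
To make the zero-extension, DIV/MOD and ADC points more concrete, here's a rough C sketch of what software ends up doing when those instructions are missing. It's only an illustration: the function names are made up, the 32-bit limb size is my assumption, and it's plain C rather than your instruction set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero-extended byte load built from a sign-extended load plus an AND.
 * Reading through int8_t stands in for a sign-extending load. */
uint32_t load_byte_zero_extended(const int8_t *p)
{
    int32_t sign_extended = *p;               /* e.g. 0x80 becomes 0xFFFFFF80 */
    return (uint32_t)sign_extended & 0xFFu;   /* the extra AND instruction */
}

/* a += b for n-limb integers (least significant limb first), without an
 * add-with-carry instruction. The carry has to be recomputed with compares
 * on every limb. Returns the carry out of the top limb. */
uint32_t add_limbs(uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t sum = a[i] + b[i];
        uint32_t carry1 = sum < a[i];         /* carry out of a[i] + b[i] */
        uint32_t total = sum + carry;
        uint32_t carry2 = total < sum;        /* carry out of adding the old carry */
        a[i] = total;
        carry = carry1 | carry2;              /* at most one of them can be set */
    }
    return carry;
}

/* Divide an n-limb integer (least significant limb first) by a single limb,
 * in place, and return the remainder. Every step needs both the quotient
 * and the remainder of the same division. */
uint32_t div_limbs(uint32_t *a, size_t n, uint32_t d)
{
    assert(d != 0);
    uint64_t rem = 0;
    for (size_t i = n; i-- > 0; ) {
        uint64_t cur = (rem << 32) | a[i];    /* previous remainder is the high half */
        a[i] = (uint32_t)(cur / d);           /* fits in 32 bits because rem < d */
        rem = cur % d;
    }
    return (uint32_t)rem;
}
```

With ADC, the body of the loop in add_limbs would be a single instruction per limb. With a DIV that returns both results, each step of div_limbs would be one division instead of a division followed by a separate MOD (or a multiply and subtract).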
Cheers,
Brendan