What features would you like in a CPU?
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
What features would you like in a CPU?
As someone who is experimenting with designing his own CPU (To be implemented in Verilog HDL), I'd like to ask you all a question:
As developers - both OS and application - what features would you most like to see in a new CPU? What unusual features from present CPUs do you find especially valuable?
Now, first things first, this design will still be mostly conventional. I'm aiming at a RISC processor, with some more complex opcodes thrown in where they would be valuable (e.g. a strcpy which does 4-byte copies but automatically performs masked memory accesses to handle misaligned starts and the terminating NUL).
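To make the "handle the terminating NUL" part concrete, here is a rough C sketch of the per-word logic such an opcode might need: spotting a zero byte in a 32-bit word and deriving the byte enables for the final, partial store. Little-endian byte order and the helper names (has_zero_byte, nul_byte_enables) are assumptions for illustration only, not part of the actual design.

```c
#include <stdint.h>

/* Nonzero if any of the four bytes in w is zero (well-known bit trick). */
static uint32_t has_zero_byte(uint32_t w)
{
    return (w - 0x01010101u) & ~w & 0x80808080u;
}

/* Byte-enable mask (bit i set = write byte i) covering every byte up to
 * and including the first NUL; 0xF if the word contains no NUL.
 * Assumes little-endian: byte 0 is the least significant byte. */
static unsigned nul_byte_enables(uint32_t w)
{
    unsigned mask = 0;
    for (unsigned i = 0; i < 4; i++) {
        mask |= 1u << i;
        if (((w >> (8 * i)) & 0xFFu) == 0)
            break;  /* stop after enabling the NUL byte itself */
    }
    return mask;
}
```

For a word holding "AB\0?" the mask comes out as 0x7, so only the three meaningful bytes are written and whatever follows the string in memory is left alone.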
Another important question is this: how would you like interrupts implemented? I'm especially asking this from a NUMA multiprocessor scenario, even though my design is likely to be initially single chip, single core. I particularly ask this from a position where the system's processors may be asymmetric; for example, there may be a 16-bit processor closely coupled to the network controller. The main CPU(s) obviously need to be able to interrupt this, and it obviously needs to be able to interrupt them, but it would just be very silly to send this microcontroller graphics card interrupts.
Finally, how many registers do you think are appropriate? Note that a processor need not save all registers on a context switch - if they are saved to a special region of memory rather than the stack, an extra (hidden) bit can be added to each register to tell the processor it is not modified and need not be saved.
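The hidden-bit idea can be modelled in a few lines of C. This is only a sketch under assumptions: 32 registers, a per-task save area, and the names RegFile, reg_write and context_save are all made up for illustration.

```c
#include <stdint.h>

#define NUM_REGS 32

typedef struct {
    uint32_t regs[NUM_REGS];
    uint32_t dirty;  /* hidden bit per register: 1 = modified since last save */
} RegFile;

/* Every architectural register write also sets the hidden dirty bit. */
static void reg_write(RegFile *rf, unsigned r, uint32_t value)
{
    rf->regs[r] = value;
    rf->dirty |= 1u << r;
}

/* Context save: store only the modified registers to the task's save area.
 * Returns how many registers were actually written out. */
static unsigned context_save(RegFile *rf, uint32_t save_area[NUM_REGS])
{
    unsigned stored = 0;
    for (unsigned r = 0; r < NUM_REGS; r++) {
        if (rf->dirty & (1u << r)) {
            save_area[r] = rf->regs[r];
            stored++;
        }
    }
    rf->dirty = 0;  /* save area and registers now agree */
    return stored;
}
```

This only works because each register has a fixed slot in a dedicated save area; on a stack, the unsaved slots would hold stale data from some unrelated frame.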
(Just a note: I won't be implementing a 32-bit processor for a while yet; I'm currently designing that aforementioned 16-bit processor. It has 16:16 linear segment:offset addressing; thus it can access the full 32-bit address space. Its main purpose is, as I mentioned, high performance peripheral offloading. The processor would have two buses - one which accesses main system memory and another which accesses its private memory - which would include RAM for code and the I/O registers of any peripherals it was controlling. If the processor is sleeping, then the buses would just pass through.)
- Love4Boobies
- Member
- Posts: 2111
- Joined: Fri Mar 07, 2008 5:36 pm
- Location: Bucharest, Romania
Re: What features would you like in a CPU?
Precise interrupts? More to come.
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: What features would you like in a CPU?
You have to throw deterministic interrupts out the window when you reach the point at which cache becomes a necessity, which is in pretty much any system using DRAM, since DRAM has quite high latencies, making it hate random accesses and love burst accesses.
The other problem with deterministic interrupts is that you need multiple register files. With big ones, or with systems which can nest interrupts, you begin having problems here.
Re: What features would you like in a CPU?
When I build my RISC CPU, here are a few opcode things that I intend to implement:
Addressing modes that include subtraction -- ie. [esi - 8*ebx]
A version of a MOV instruction that sets some eflags -- esp ZF.
An entire set of arithmetic operations that do not modify any eflags.
A version of INC and DEC instructions that add/sub 2, 4, and 8.
A version of an LEA instruction that sets all eflags based on the result.
Register indirection -- ie. accessing register numbers as an "array" from another register.
An opcode that will burst transfer a large number of registers to/from memory, and set a flag on completion.
A CALL instruction that stores a magic number on the stack, and a RET that autoverifies it, to prevent stack corruption.
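The last item, the checked CALL/RET pair, can be sketched as a small software model. Everything here is an assumption for illustration: the marker value, the upward-growing stack, and the names Cpu, cpu_call and cpu_ret. Bounds checks are omitted to keep it short.

```c
#include <stdbool.h>
#include <stdint.h>

#define STACK_WORDS 64
#define CALL_MAGIC  0xC0DEFACEu  /* hypothetical marker value */

typedef struct {
    uint32_t stack[STACK_WORDS];
    unsigned sp;  /* index of the next free slot; stack grows upward here */
    uint32_t pc;
} Cpu;

/* CALL: push the return address, then the magic marker, then jump. */
static void cpu_call(Cpu *c, uint32_t target)
{
    c->stack[c->sp++] = c->pc + 1;   /* return address */
    c->stack[c->sp++] = CALL_MAGIC;  /* marker costs one extra memory write */
    c->pc = target;
}

/* RET: pop and verify the marker before trusting the return address.
 * Returns false (a fault, in hardware) if the marker was overwritten. */
static bool cpu_ret(Cpu *c)
{
    if (c->stack[--c->sp] != CALL_MAGIC)
        return false;
    c->pc = c->stack[--c->sp];
    return true;
}
```

A fixed marker only catches accidental overwrites, since anything that knows the value can write it back.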
Re: What features would you like in a CPU?
bewing wrote: When I build my RISC CPU, here are a few opcode things that I intend to implement:
Addressing modes that include subtraction -- ie. [esi - 8*ebx]
A version of a MOV instruction that sets some eflags -- esp ZF.
An entire set of arithmetic operations that do not modify any eflags.
A version of INC and DEC instructions that add/sub 2, 4, and 8.
A version of an LEA instruction that sets all eflags based on the result.
Register indirection -- ie. accessing register numbers as an "array" from another register.
An opcode that will burst transfer a large number of registers to/from memory, and set a flag on completion.
A CALL instruction that stores a magic number on the stack, and a RET that autoverifies it, to prevent stack corruption.

That sounds like a CISC CPU.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Re: What features would you like in a CPU?
What I think should be added as a feature is that the software can send a kill signal to the CPU and have the system stay online in a suspended state, then have the option of restarting the CPU. Also AMD's Cool'n'Quiet feature: it will only go fast when there is a bigger load, so it stays cool. Anyway, why not make a matching motherboard and have it run a homemade OS?
Cpu master owns your cpu.
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: What features would you like in a CPU?
bewing wrote: When I build my RISC CPU, here are a few opcode things that I intend to implement: Addressing modes that include subtraction -- ie. [esi - 8*ebx]
I'm halfway there - all relative offsets are signed, though my most complex addressing mode is MOV r0, [r1 + r2 * 8 + 4]. Also note that it's pure load/store, excepting the complex instructions like STRCPY, MEMCPY, etc. (and that the * in the above is really a bitshift).
Quote: A version of a MOV instruction that sets some eflags -- esp ZF.
Quite possible with MOV reg, reg; not so much with memory moves, since that kind of thing is a good way to gobble opcodes very quickly.
Quote: An entire set of arithmetic operations that do not modify any eflags.
Why? Very rarely do I see a case where you want to do some form of comparison and keep the results around for a long time.
Quote: A version of INC and DEC instructions that add/sub 2, 4, and 8.
I've already designed that in as a form of the ADD.literal instruction.
Quote: A version of an LEA instruction that sets all eflags based on the result.
Another way to gobble instructions. I'm considering an LEA, but I'm not devoting 75% of my opcode space to addressing when it could be used for more useful instructions!
Quote: Register indirection -- ie. accessing register numbers as an "array" from another register.
Seriously - why? I've never come across a need to use an array of dynamically indexed registers.
Quote: An opcode that will burst transfer a large number of registers to/from memory, and set a flag on completion.
Loading from memory is easy enough, but storing to it consumes large quantities of silicon. I have a PFETCH instruction which tells the processor to fetch an address that I don't need right now - with the caveat that said fetch can blow up in the consuming instruction. That is, it's perfectly valid to get a page fault when doing ADD r0, r1 if r1 was set for a PFETCH and your address triggered one. Debugger developers will love this one, since I imagine it will add lots of heinous backtracking! Then again, I'm gonna love implementing the logic to insert MOV rX, [rX] instructions into the pipeline to handle the unloaded case.
Quote: A CALL instruction that stores a magic number on the stack, and a RET that autoverifies it, to prevent stack corruption.
Interesting idea - though one which would probably have to be a double cycle instruction in order to handle the two memory accesses.
cpumaster wrote: What I think should be added as a feature is that the software can send a kill signal to the CPU and have the system stay online in a suspended state, then have the option of restarting the CPU. Also AMD's Cool'n'Quiet feature: it will only go fast when there is a bigger load, so it stays cool. Anyway, why not make a matching motherboard and have it run a homemade OS?
I believe your "kill signal" is called the HLT instruction, which halts the CPU until an interrupt arrives. As for speed throttling, I'm unlikely to support it, though I will support completely stopping the clock to the execution unit and all its periphery when a halt instruction is executed.
As for a motherboard - I'd probably design some form of motherboard with the CPU in a FPGA on it but I'd want to implement it on a development board first - getting one thing working at a time is better for my sanity!
Re: What features would you like in a CPU?
Korona wrote: That sounds like a CISC CPU.
On mine, I'm planning on having 2 million or more registers, and "reducing" the instruction set in other ways than eliminating basic integer opcodes. I don't like cramming my concepts into predefined boxes -- you can call it anything you want. And my suggestions are always meant to be filtered through what is "doable".
Owen wrote: Why? Very rarely do I see a case where you want to do some form of comparison and keep the results around for a long time
A "long time" is rarely necessary, but I do run into cases where I need to do some quick calculation (usually just before a return) without messing up the state of my return flags.
This is especially true for fixing up the stack pointer (removing automatic storage) before a return, if you do not have an LEA instruction. You need to do an ADD ESP, 0x?? -- and that's going to screw up all of your return EFLAGS.
Quote: Seriously - why? I've never come across a need to use an array of dynamically indexed registers
One of the biggest deficiencies in modern CPUs is that you cannot store a simple array in registers. Do you realize how many clock cycles you could save if you could store an 8 entry int array in registers, rather than having it be in cached (hopefully) main memory?
(In my design, the intent is that I am going to be eliminating cache completely -- so I need to be able to do ALL forms of addressing with registers only -- and there will be NO main memory addressing modes, except the one "load register block from main mem" and "store register block to main mem".)
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: What features would you like in a CPU?
bewing wrote: On mine, I'm planning on having 2 million or more registers
Not to be picky - but at one clock cycle per instruction, you need at least 3 ports on your register file for a traditional two or three operand instruction set, and the size of a register tends to go up with the square of the number of ports. This is why processors tend to have smaller numbers of registers.
The problem with operating without cache is that a processor accesses memory - in random places - far faster than the memory's clock rate. And when you throw in the randomness, you spend more memory cycles on latencies than on actual access cycles.
Re: What features would you like in a CPU?
Quote: I'm planning on having 2 million or more registers, and "reducing" the instruction set in other ways than eliminating basic integer opcodes. I don't like cramming my concepts into predefined boxes -- you can call it anything you want.
Besides being problematic with opcodes, as Owen mentioned already, a processor with 2M+ registers would be quite expensive to make. With 2M+ registers you'd sort of be adding a register cache, and even though cache sizes have grown significantly of late, cache continues to be expensive.
Quote: A version of INC and DEC instructions that add/sub 2, 4, and 8
That would be heavenly.
Quote: What I think should be added as a feature is that the software can send a kill signal to the CPU and have the system stay online in a suspended state, then have the option of restarting the CPU. Also AMD's Cool'n'Quiet feature: it will only go fast when there is a bigger load, so it stays cool. Anyway, why not make a matching motherboard and have it run a homemade OS?
Intel SpeedStep? On-demand underclocking and, in the Core i7, also overclocking?
Modular Interface Kernel With a lot of bugs
- Troy Martin
- Member
- Posts: 1686
- Joined: Fri Apr 18, 2008 4:40 pm
- Location: Langley, Vancouver, BC, Canada
- Contact:
Re: What features would you like in a CPU?
Free access to an 8-16 KB cache.
And maybe reverse DIV and MUL (Say, DIVR 3 divides DX by 3) for the sake of not using XCHG.
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: What features would you like in a CPU?
Troy Martin wrote: Free access to an 8-16 KB cache.
Waste of cache. Either the OS ends up in the cache, in which case you have to decide which pieces of the OS to store there, or each application can load a bit of itself into the cache and you have to switch it out every task switch. In any case, the processor knows better.
Quote: And maybe reverse DIV and MUL (Say, DIVR 3 divides DX by 3) for the sake of not using XCHG.
Does DIV rX, 3 not do that? Perhaps you mean DIV 3, rX, in which case I see little point as you'll end up needing big literals for it to be worthwhile. In any case, with 32 GPRs, the register pressure should be sufficiently low to cope with a limited number of instructions supporting literals.
Re: What features would you like in a CPU?
OS: DCAS ; Applicationy-stuff: Multiply-accumulate, tagged arithmetic.
If the CPU is going to play microcontroller, one (or more) user-accessible shift registers, with selectable taps/feedback! Clock dividers, PRNGs, and convolutional encoders suddenly don't take lots of software bit shuffling.
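For a sense of the "software bit shuffling" such a register would replace, here is a minimal C model of a 16-bit shift register with a selectable tap mask. It assumes XOR feedback into the top bit (a Fibonacci-style LFSR), and the names ShiftReg and shiftreg_step are made up for illustration.

```c
#include <stdint.h>

typedef struct {
    uint16_t state;  /* shift register contents */
    uint16_t taps;   /* selectable feedback taps, one bit per stage */
} ShiftReg;

/* Shift right by one. The new top bit is the XOR (parity) of the tapped
 * stages. Returns the bit that was shifted out. */
static unsigned shiftreg_step(ShiftReg *sr)
{
    unsigned out = sr->state & 1u;
    uint16_t tapped = sr->state & sr->taps;
    unsigned fb = 0;
    while (tapped) {  /* parity of the tapped bits */
        fb ^= tapped & 1u;
        tapped >>= 1;
    }
    sr->state = (uint16_t)((sr->state >> 1) | (fb << 15));
    return out;
}
```

With taps = 0x002D (stages 0, 2, 3 and 5) and any nonzero seed this behaves as a maximal-length PRNG that visits all 65535 nonzero states. Hardware would do the whole step in a single clock.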
--vs
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: What features would you like in a CPU?
One thought I had was a fixed point math mode. An example of this would be
XQUOT 10 (Set flags to indicate 10-bit quotient)
XMUL rA, rB, rC (rC = rA * rB >> Flags.Quotient; 64-bit intermediates)
XDIV rA, rB, rC (rC = (rA << Flags.Quotient) / rB)
(Why the X? F is reserved for floating point, and it's fiXed point)
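In C terms the three instructions would behave roughly as below. This is a sketch that reads "10-bit quotient" as 10 fractional bits, which is an assumption; the global quotient_bits stands in for Flags.Quotient, and the function names simply mirror the mnemonics.

```c
#include <stdint.h>

/* Stands in for the Flags.Quotient field that XQUOT would set. */
static unsigned quotient_bits = 0;

static void xquot(unsigned bits) { quotient_bits = bits; }

/* XMUL: rC = (rA * rB) >> Q, using a 64-bit intermediate product.
 * Right-shifting a negative value is implementation-defined in C;
 * mainstream compilers shift arithmetically. */
static int32_t xmul(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> quotient_bits);
}

/* XDIV: rC = (rA << Q) / rB. The shift is written as a 64-bit multiply
 * so that negative operands stay well-defined. */
static int32_t xdiv(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * ((int64_t)1 << quotient_bits)) / b);
}
```

With xquot(10), the value 1.5 is stored as 1536 and 2.0 as 2048, so xmul(1536, 2048) gives 3072 (3.0) and xdiv(3072, 2048) gives back 1536.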
Re: What features would you like in a CPU?
The single instruction I would like that's not on an x86, as far as I know, is an instruction to fetch a noncached memory value, going around the value in cache. I'm pretty sure this would be handy.
Suppose you have two cores and a global memory variable which is cached. They have both accessed it recently, so each core has it in its local cache. Core #0 changes it. Core #0 can write-back invalidate its cache, writing everything out of cache. (Just pushing one value from cache to memory would be good, but WBINVD does the job.) Now, you have core #1... how do you fetch the updated value sitting in memory when you have a different value in your local cache? There's no way to do it that I know of. If you WBINVD, it might write it out and clobber the value in memory. If you invalidate the cache without writing back, you just lost changes to other things. Maybe WBINVD does not write unmodified values sitting in cache, in which case there is a way, since it might be acceptable if it works only in one direction, #0->#1 or #1->#0. If WBINVD writes everything to memory, it'll clobber the value you are trying to fetch.
Maybe there's a reason they don't have it -- lots of things are easier said than done.
You can of course use uncached memory pages, but, for example, I have task records and I don't want the whole record uncached -- often I just want one value to be fetched or stored. It's inconvenient to have two separate records with half in a cached area and half in an uncached one.