Future of CPUs

Discussions on more advanced topics such as monolithic vs micro-kernels, transactional memory models, and paging vs segmentation should go here. Use this forum to expand and improve the wiki!
geppyfx
Member
Posts: 87
Joined: Tue Apr 28, 2009 4:58 pm

Re: Future of CPUs

Post by geppyfx »

Future of processors by 2018 according to DARPA http://news.cnet.com/8301-13924_3-20013088-64.html (it's not the same news item in which DARPA asked Nvidia to build a supercomputer out of CUDA GPUs)
skyking
Member
Posts: 174
Joined: Sun Jan 06, 2008 8:41 am

Re: Future of CPUs

Post by skyking »

arkain wrote:
darkestkhan wrote:You couldn't build a 1024 bit adder with appreciably better performance. Never mind multipliers or dividers - heck, division is already slow. At the scaling rate for division that current x86s manage (and most RISC machines are worse!), that division is gonna take you 1032 cycles. Multiplication will probably be 100.
I wouldn't say that. As a concept for a computer architecture paper, I designed a completely new adder with 25% fewer circuits and 30% more speed than the standard carry-lookahead adder (neither circuit containing optimizations). This advantage starts at 16-bit and grows as the adder is widened. I'm fairly certain that even this design is not the best that anyone can come up with. So it's not a good idea to assume that what we currently understand to be the best will remain so in the future.
I guess it's more a question of what's really required that is interesting. If you think about it, which applications really require 1024-bit arithmetic? And I don't count people that use 1024 bits just because they don't understand maths. 2^1024 is a really huge number, and 1024-bit floats mean really high precision.

That you're getting 30% more speed means nothing; I'd guess that you could use that design for 64-bit adders as well, so your 1024-bit adder has to compare with the improved 64-bit adder. One could also question whether a divider is worth it at all, since it uses valuable silicon space that will almost never be used anyway (a lot of CPUs actually do not include div instructions at all).
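
For reference, the carry-lookahead baseline being compared against here computes per-bit generate (a AND b) and propagate (a XOR b) terms and then derives every carry directly from them, instead of letting the carry ripple bit by bit. A minimal 4-bit sketch of that recurrence in C - purely illustrative, not either poster's actual design:

Code: Select all

#include <stdint.h>
#include <stdio.h>

/* Purely illustrative 4-bit carry-lookahead addition, modelled in software.
   In hardware each carry below is flattened into a two-level AND/OR
   expression of g, p and cin, which is what makes lookahead faster than a
   ripple-carry chain. */
static uint8_t cla4(uint8_t a, uint8_t b, int cin, int *cout)
{
    uint8_t g = a & b;   /* generate:  bit i produces a carry by itself  */
    uint8_t p = a ^ b;   /* propagate: bit i passes an incoming carry on */
    int c[5];
    c[0] = cin;
    for (int i = 0; i < 4; i++)
        c[i + 1] = ((g >> i) & 1) | (((p >> i) & 1) & c[i]);
    *cout = c[4];

    uint8_t sum = 0;
    for (int i = 0; i < 4; i++)
        sum |= (uint8_t)((((p >> i) & 1) ^ c[i]) << i);
    return sum;
}

int main(void)
{
    int cout;
    printf("0x%X carry %d\n", cla4(0x9, 0x7, 0, &cout), cout); /* 0x0 carry 1 */
    return 0;
}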
Candy
Member
Posts: 3882
Joined: Tue Oct 17, 2006 11:33 pm
Location: Eindhoven

Re: Future of CPUs

Post by Candy »

These P4s were the most power-hungry implementations in Intel's history; I have one lying in the closet, and you can really warm your room in winter with it.
Same here - I had an i805 that was powerful enough to heat the cooler to 75°C - and to get into thermal throttling - before Windows had finished booting. I replaced it with a more powerful dual-core C2D, and that can't even get a cheaper cooler up to 60°C.
ds
Posts: 4
Joined: Sat Aug 21, 2010 11:50 pm

Re: Future of CPUs

Post by ds »

Hmm, having read through the majority of posts here, I have but one thing to say:

"Cellular Automata" ...

Anything with thousands of cores is starting to resemble one, no?
arkain
Posts: 7
Joined: Fri Aug 13, 2010 1:50 pm

Re: Future of CPUs

Post by arkain »

skyking wrote:
arkain wrote:
darkestkhan wrote:You couldn't build a 1024 bit adder with appreciably better performance. Never mind multipliers or dividers - heck, division is already slow. At the scaling rate for division that current x86s manage (and most RISC machines are worse!), that division is gonna take you 1032 cycles. Multiplication will probably be 100.
I wouldn't say that. As a concept for a computer architecture paper, I designed a completely new adder with 25% fewer circuits and 30% more speed than the standard carry-lookahead adder (neither circuit containing optimizations). This advantage starts at 16-bit and grows as the adder is widened. I'm fairly certain that even this design is not the best that anyone can come up with. So it's not a good idea to assume that what we currently understand to be the best will remain so in the future.
I guess it's more a question of what's really required that is interesting. If you think about it, which applications really require 1024-bit arithmetic? And I don't count people that use 1024 bits just because they don't understand maths. 2^1024 is a really huge number, and 1024-bit floats mean really high precision.

That you're getting 30% more speed means nothing; I'd guess that you could use that design for 64-bit adders as well, so your 1024-bit adder has to compare with the improved 64-bit adder. One could also question whether a divider is worth it at all, since it uses valuable silicon space that will almost never be used anyway (a lot of CPUs actually do not include div instructions at all).
I can conceive of a use for such insane precision: try a spatial navigation system. Consider that you'd have to be able to track the precise position of a myriad of stellar objects, especially while traveling at near light speed. But I will agree that for most practical purposes, at least for the conceivable future, anything beyond 256-bit precision isn't even scientifically useful.

As for why one would implement division in hardware, although it's a relatively little-used function for most common applications, anyone doing heavy math (accounting, scientific, banking, data mining, statistics, etc.) certainly appreciates the speed increase provided by performing division in silicon rather than software. One of the points of making transistors smaller is that there is more room on the die for useful functions. It wouldn't hurt if someone came up with better algorithms for multiplication and division, either.
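
To make the silicon-versus-software division point concrete: the simplest software fallback is the schoolbook shift-and-subtract loop, which needs one iteration per result bit, so the cost grows directly with operand width. A rough C sketch (illustrative only; real library routines and hardware dividers use considerably smarter algorithms):

Code: Select all

#include <stdint.h>

/* Schoolbook (restoring) unsigned division: one trial subtraction per
   result bit, so a 64-bit divide takes 64 iterations and a hypothetical
   1024-bit one would take 1024. Assumes d != 0. */
static uint64_t div_schoolbook(uint64_t n, uint64_t d, uint64_t *rem)
{
    uint64_t q = 0, r = 0;
    for (int i = 63; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1);  /* bring down the next dividend bit */
        if (r >= d) {                   /* does the divisor fit?            */
            r -= d;
            q |= 1ULL << i;
        }
    }
    *rem = r;
    return q;
}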
Candy
Member
Posts: 3882
Joined: Tue Oct 17, 2006 11:33 pm
Location: Eindhoven

Re: Future of CPUs

Post by Candy »

arkain wrote:I can conceive of a use for such insane precision: try a spatial navigation system. Consider that you'd have to be able to track the precise position of a myriad of stellar objects, especially while traveling at near light speed. But I will agree that for most practical purposes, at least for the conceivable future, anything beyond 256-bit precision isn't even scientifically useful.

As for why one would implement division in hardware, although it's a relatively little-used function for most common applications, anyone doing heavy math (accounting, scientific, banking, data mining, statistics, etc.) certainly appreciates the speed increase provided by performing division in silicon rather than software. One of the points of making transistors smaller is that there is more room on the die for useful functions. It wouldn't hurt if someone came up with better algorithms for multiplication and division, either.
I know pi to 45 decimal places - that's exactly the number of decimals you need to calculate the circumference of the universe from the width of the universe, to a precision of one electron width. That's about 150 bits of precision. 256 bits is already stretching it - not to mention that 1024 is far more excessive still.

You can only hope to use it for pure mathematics - cryptography, for instance. But if you go in that direction, you're better off with variable-length integers (or floats) and variable-length opcodes, or software emulation (GMP, for instance).
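
For the record, the back-of-the-envelope version of that precision argument, using rough figures that are assumed here rather than taken from the thread (an observable-universe diameter of about 8.8e26 m and a classical electron radius of about 2.8e-15 m):

Code: Select all

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Rough figures, assumed for illustration only. */
    double universe_m = 8.8e26;   /* diameter of the observable universe */
    double electron_m = 2.8e-15;  /* classical electron radius           */
    double ratio = universe_m / electron_m;          /* ~3e41 */

    printf("decimal digits needed: %.0f\n", ceil(log10(ratio)));  /* ~42  */
    printf("bits needed:           %.0f\n", ceil(log2(ratio)));   /* ~138 */
    return 0;
}

Which lands in the same ballpark as the 45-decimal / roughly-150-bit figure above.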
arkain
Posts: 7
Joined: Fri Aug 13, 2010 1:50 pm

Re: Future of CPUs

Post by arkain »

Candy wrote:
arkain wrote:I can conceive of a use for such insane precision: try a spatial navigation system. Consider that you'd have to be able to track the precise position of a myriad of stellar objects, especially while traveling at near light speed. But I will agree that for most practical purposes, at least for the conceivable future, anything beyond 256-bit precision isn't even scientifically useful.

As for why one would implement division in hardware, although it's a relatively little-used function for most common applications, anyone doing heavy math (accounting, scientific, banking, data mining, statistics, etc.) certainly appreciates the speed increase provided by performing division in silicon rather than software. One of the points of making transistors smaller is that there is more room on the die for useful functions. It wouldn't hurt if someone came up with better algorithms for multiplication and division, either.
I know pi to 45 decimal places - that's exactly the number of decimals you need to calculate the circumference of the universe from the width of the universe, to a precision of one electron width. That's about 150 bits of precision. 256 bits is already stretching it - not to mention that 1024 is far more excessive still.

You can only hope to use it for pure mathematics - cryptography, for instance. But if you go in that direction, you're better off with variable-length integers (or floats) and variable-length opcodes, or software emulation (GMP, for instance).
I knew you'd argue that 256 was a bit much. Now, given that your argument mentions the need for 150-bit precision and that the trend in CPU precision follows powers of 2, to do that 150-bit calculation you'd still need a 256-bit CPU if you wanted to do it in a minimum of instructions.

In the end, I still agree with you. Higher bit precision isn't going to get us very much, if anything. What we need is better hardware encodings for our instructions so that all instructions can be completed in fewer clock cycles. Frankly, I'd much rather have a 400 MHz CPU where all instructions take 4 clocks than a 4 GHz CPU where every instruction takes 40 clocks. The latter CPU is going to be more prone to high power consumption and to destroying itself with heat.
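
For what it's worth, the two hypothetical chips in that trade-off come out identical on raw instruction throughput; the difference being argued for is purely in power and heat:

Code: Select all

#include <stdio.h>

int main(void)
{
    /* Hypothetical figures from the post above. */
    double slow_wide = 400e6 / 4;  /* 400 MHz, 4 cycles per instruction */
    double fast_deep = 4e9 / 40;   /* 4 GHz, 40 cycles per instruction  */
    printf("%.0f vs %.0f instructions/second\n", slow_wide, fast_deep);
    return 0;                      /* both come out at 100000000 */
}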
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Future of CPUs

Post by Owen »

...You are aware that most processors complete the vast majority of their instructions in one clock - and modern ones (i.e. x86 from circa '95 onwards, ARM Cortex-A9, etc.) complete multiple at a time? Hell, x86 often does instructions in zero clocks.

There are instructions which are slow, but they are often rightfully slow (for example, division); and I don't get how you assume instruction encoding has any bearing on performance.
TylerH
Member
Posts: 285
Joined: Tue Apr 13, 2010 8:00 pm

Re: Future of CPUs

Post by TylerH »

Candy wrote:
arkain wrote:I can conceive of a use for such insane precision: try a spatial navigation system. Consider that you'd have to be able to track the precise position of a myriad of stellar objects, especially while traveling at near light speed. But I will agree that for most practical purposes, at least for the conceivable future, anything beyond 256-bit precision isn't even scientifically useful.

As for why one would implement division in hardware, although it's a relatively little-used function for most common applications, anyone doing heavy math (accounting, scientific, banking, data mining, statistics, etc.) certainly appreciates the speed increase provided by performing division in silicon rather than software. One of the points of making transistors smaller is that there is more room on the die for useful functions. It wouldn't hurt if someone came up with better algorithms for multiplication and division, either.
I know pi to 45 decimal places - that's exactly the number of decimals you need to calculate the circumference of the universe from the width of the universe, to a precision of one electron width. That's about 150 bits of precision. 256 bits is already stretching it - not to mention that 1024 is far more excessive still.

You can only hope to use it for pure mathematics - cryptography, for instance. But if you go in that direction, you're better off with variable-length integers (or floats) and variable-length opcodes, or software emulation (GMP, for instance).
Intel's new 256-bit extension: http://software.intel.com/en-us/avx/ :P
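
Worth noting that AVX's 256 bits are SIMD lanes rather than one wide scalar - a register holds four 64-bit doubles (or eight 32-bit floats), not a single 256-bit number. A minimal sketch using the published intrinsics (compile with AVX enabled, e.g. -mavx):

Code: Select all

#include <immintrin.h>
#include <stdio.h>

int main(void)
{
    /* One 256-bit register = four independent 64-bit double lanes. */
    __m256d a = _mm256_set_pd(4.0, 3.0, 2.0, 1.0);     /* lanes 1,2,3,4     */
    __m256d b = _mm256_set_pd(40.0, 30.0, 20.0, 10.0); /* lanes 10,20,30,40 */
    __m256d c = _mm256_add_pd(a, b);                   /* lane-wise add     */

    double out[4];
    _mm256_storeu_pd(out, c);
    printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]); /* 11 22 33 44 */
    return 0;
}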
Candy
Member
Posts: 3882
Joined: Tue Oct 17, 2006 11:33 pm
Location: Eindhoven

Re: Future of CPUs

Post by Candy »

Owen wrote:... I don't get how you assume instruction encoding has any bearing on performance.
Why did Intel create IA-64? Why is it a VLIW architecture with 3 instructions per encoded "instruction word"? Because it's not a bottleneck?
TylerAnon wrote:Intel's new 256 bit extension: http://software.intel.com/en-us/avx/ :P
I was arguing against 1024-bit stuff, reasoning that for all but cryptography and pure mathematics, 256 bits is just about always enough for a precise answer. Of course, you also know that Intel added AES instructions, specifically for cryptography. I'm guessing we'll see 512-bit SHA3 instructions too, for the same reason.
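
Those AES instructions are exposed as compiler intrinsics, with one round of the cipher mapping onto a single instruction operating on a 128-bit register. A minimal sketch (the key schedule is omitted; compile with AES-NI enabled, e.g. -maes):

Code: Select all

#include <wmmintrin.h>  /* AES-NI intrinsics */

/* One AES encryption round: ShiftRows, SubBytes, MixColumns, then XOR with
   the round key - all done by a single aesenc instruction. A full AES-128
   encryption chains ten rounds using keys from the key schedule. */
static __m128i aes_round(__m128i block, __m128i round_key)
{
    return _mm_aesenc_si128(block, round_key);
}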
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Future of CPUs

Post by Brendan »

Hi,
Candy wrote:
Owen wrote:... I don't get how you assume instruction encoding has any bearing on performance.
Why did Intel create IA-64? Why is it a VLIW architecture with 3 instructions per encoded "instruction word"? Because it's not a bottleneck?
As far as I know, the idea was to shift instruction scheduling out of the CPU and into the compiler, to reduce the complexity of the silicon and maybe get better performance (didn't happen), better power consumption (also didn't happen), or reduced development costs (still didn't happen). In general the idea failed. When Itanium was introduced there wasn't a compiler capable of doing the instruction scheduling well (and I'm not even sure such a compiler exists now), so performance was worse than the theoretical maximum. The other (long-term) problem is that optimum instruction scheduling is very CPU-dependent - code tuned for one Itanium CPU (with one set of instruction latencies, etc.) isn't tuned for another Itanium CPU (with a different set of instruction latencies, etc.), and therefore even if a perfect compiler capable of generating optimum code existed, the resulting code would run well on one CPU and poorly on the others. If the CPU does the instruction scheduling (e.g. out-of-order CPUs), there's much less need to tune code for a specific CPU.
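
The CPU-dependence of static scheduling is easy to see with a toy model: take a fixed, "compiler-chosen" instruction order and run it on an in-order machine under two different latency tables. A hypothetical C sketch - the instruction mix, latencies and interlock model are all made up for illustration:

Code: Select all

#include <stdio.h>

/* Toy in-order, single-issue machine with interlocks. The instruction
   order is fixed "at compile time"; only the latency table changes. */
enum { NONE = -1, ALU = 0, LOAD = 1 };
typedef struct { int op; int dep; } Insn;   /* dep = index of producer */

static int run(const Insn *code, int n, const int latency[2])
{
    int ready[32];    /* cycle at which each instruction's result is ready */
    int cycle = 0;
    for (int i = 0; i < n; i++) {
        if (code[i].dep != NONE && ready[code[i].dep] > cycle)
            cycle = ready[code[i].dep];      /* stall until operand ready  */
        ready[i] = cycle + latency[code[i].op];
        cycle++;                             /* issue one instruction/cycle */
    }
    return cycle;
}

int main(void)
{
    /* Schedule tuned for CPU A: two independent ALU ops were placed between
       the load and its consumer, exactly covering A's 3-cycle load latency. */
    const Insn code[] = {
        { LOAD, NONE },   /* 0 */
        { ALU,  NONE },   /* 1: filler chosen by the "compiler" */
        { ALU,  NONE },   /* 2: filler chosen by the "compiler" */
        { ALU,  0    },   /* 3: consumes the load               */
        { ALU,  3    },   /* 4: dependent follow-up             */
    };
    const int cpu_a[2] = { 1, 3 };   /* ALU 1 cycle, load 3 cycles */
    const int cpu_b[2] = { 1, 6 };   /* same ISA, slower loads     */
    printf("CPU A: %d cycles\n", run(code, 5, cpu_a));   /* 5 cycles */
    printf("CPU B: %d cycles\n", run(code, 5, cpu_b));   /* 8 cycles */
    return 0;
}

The same fixed schedule is stall-free on the machine it was tuned for and loses three of its eight cycles to stalls on the other - the tuning problem an out-of-order core sidesteps.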

The other way to "fix" the problem is to use something like hyper-threading to hide the problems caused by poor instruction scheduling. Unfortunately most Itaniums didn't support hyper-threading - Tukwila (released in 2010) is the first Itanium to support it.

The main reason Itanium was relatively successful in specific markets (e.g. for "enterprise" class hardware) was that a lot of scalability and fault tolerance features were built into the chipsets. Basically, Itanium got where it is now despite VLIW (not because of VLIW), although there are a lot of other factors that affected both its success (in some markets) and its lack of success (in all other markets).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
skyking
Member
Posts: 174
Joined: Sun Jan 06, 2008 8:41 am

Re: Future of CPUs

Post by skyking »

arkain wrote: I can conceive of a use for such insane precision: try a spatial navigation system. Consider that you'd have to be able to track the precise position of a myriad of stellar objects, especially while traveling at near light speed. But I will agree that for most practical purposes, at least for the conceivable future, anything beyond 256-bit precision isn't even scientifically useful.
I'd guess that's what my (or anybody else's) old grandma would want to use a computer for...
As for why one would implement division in hardware, although it's a relatively little-used function for most common applications, anyone doing heavy math (accounting, scientific, banking, data mining, statistics, etc) certainly appreciates the speed increase provided by performing division in silicon rather than software.
The question here is whether the performance improvement from putting this in dedicated hardware would be much more than the improvement you'd gain by using the available resources to enhance more often used operations (maybe you don't even get increased performance at all...).
One of the points of making transistors smaller is that there is more room on the die for useful functions.
Yes, and to make them faster and reduce relative power consumption, but the question is what the best use of the smaller transistor size is. I'm not convinced that you'd get the best general-purpose performance by using the available resources for special-purpose circuitry.
Thomas
Member
Posts: 281
Joined: Thu Jun 04, 2009 11:12 pm

Re: Future of CPUs

Post by Thomas »

Hi,
Why is it a VLIW architecture with 3 instructions per encoded "instruction word"? Because it's not a bottleneck?
A bundle, to be more precise (a bundle is 128 bits in length). Brendan pretty much covered everything.
When Itanium was introduced there wasn't a compiler capable of doing the instruction scheduling well (and I'm not even sure such a compiler exists now),
The HP C/C++ compilers do a decent job :wink:.

--Thomas
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Future of CPUs

Post by Owen »

Thomas wrote:
When Itanium was introduced there wasn't a compiler capable of doing the instruction scheduling well (and I'm not even sure such a compiler exists now),
The HP C/C++ compilers do a decent job :wink:.

--Thomas
But not compared to an x86 branch predictor. As it turns out, the branch predictor knows far more about the execution paths of code than compilers do.
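
A concrete example of what the predictor sees and the compiler can't: the taken/not-taken pattern of the branch below is literally the input data, so no static schedule or compile-time hint can anticipate it, while a dynamic predictor tracks its recent history at run time. (Function and names are illustrative only.)

Code: Select all

#include <stddef.h>

/* Sums the elements above a threshold. Whether the branch is taken on any
   given iteration depends only on v[i], which the compiler cannot know;
   a dynamic branch predictor learns the actual pattern as the loop runs
   (the classic trick of sorting v first is what makes the pattern easy
   to learn). */
long sum_above(const int *v, size_t n, int threshold)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        if (v[i] > threshold)
            sum += v[i];
    return sum;
}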
skyking
Member
Posts: 174
Joined: Sun Jan 06, 2008 8:41 am

Re: Future of CPUs

Post by skyking »

berkus wrote:
skyking wrote:I'm not convinced that you'd get the best general-purpose performance by using the available resources for special-purpose circuitry.
RISC architectures starting from, perhaps, Commodore 64 and its MOS 6502 chip and up to TI OMAP and nVidia tegra2 boards would disagree.
Who would disagree with what? AFAIK the ARM architecture does not include integer division in the instruction set...