Future of CPUs
Re: Future of CPUs
Future of processors by 2018, according to DARPA: http://news.cnet.com/8301-13924_3-20013088-64.html (it's not the same news item where DARPA asks Nvidia to build a supercomputer out of CUDA GPUs)
Re: Future of CPUs
I guess it's more a question of what's really required that is interesting. If you think about it, which applications really require 1024-bit arithmetic? And I don't count people who use 1024 bits just because they don't understand the maths. 2^1024 is a really huge number, and 1024-bit floats means really high precision.

arkain wrote:
I wouldn't say that. As a concept for a computer architecture paper, I designed a completely new adder with 25% fewer circuits and 30% more speed than the standard carry-lookahead adder (neither circuit containing optimizations). This advantage starts at 16-bit and grows as the adder is widened. I'm fairly certain that even this design is not the best that anyone can come up with. So it's not a good idea to assume that what we currently understand to be the best will remain so in the future.

darkestkhan wrote:
You couldn't build a 1024-bit adder with appreciably better performance. Never mind multipliers or dividers - heck, division is already slow. At the scaling rate for division that current x86s manage (and most RISC machines are worse!), that division is gonna take you 1032 cycles. Multiplication will probably be 100.

That you're getting 30% more speed means nothing; I'd guess that you could use that design for 64-bit adders as well, so your 1024-bit adder has to compare with the improved 64-bit adder. One could also question whether a divider is worth it at all, since it uses valuable silicon space that will almost never be used anyway (a lot of CPUs actually do not include div instructions at all).
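For readers who haven't seen the design being compared against, here is a minimal bit-level sketch of a 4-bit carry-lookahead slice in C (my own toy code, not arkain's design): all four carries come straight out of the generate/propagate terms instead of rippling bit by bit, and widening the adder widens these equations, which is where the scaling argument comes from.

Code:
#include <stdio.h>

/* 4-bit carry-lookahead slice. cin must be 0 or 1. */
static unsigned cla4(unsigned a, unsigned b, unsigned cin, unsigned *cout)
{
    a &= 0xF;
    b &= 0xF;

    unsigned g = a & b;          /* generate:  this bit produces a carry     */
    unsigned p = a ^ b;          /* propagate: this bit passes a carry along */
    unsigned g0 = (g >> 0) & 1, g1 = (g >> 1) & 1, g2 = (g >> 2) & 1, g3 = (g >> 3) & 1;
    unsigned p0 = (p >> 0) & 1, p1 = (p >> 1) & 1, p2 = (p >> 2) & 1, p3 = (p >> 3) & 1;

    /* all carries computed directly from g/p, no ripple */
    unsigned c1 = g0 | (p0 & cin);
    unsigned c2 = g1 | (p1 & g0) | (p1 & p0 & cin);
    unsigned c3 = g2 | (p2 & g1) | (p2 & p1 & g0) | (p2 & p1 & p0 & cin);
    unsigned c4 = g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0)
                     | (p3 & p2 & p1 & p0 & cin);

    *cout = c4;
    return (p ^ ((c3 << 3) | (c2 << 2) | (c1 << 1) | cin)) & 0xF;  /* sum bits */
}

int main(void)
{
    unsigned cout;
    unsigned s = cla4(0xB, 0x6, 0, &cout);   /* 11 + 6 = 17 -> sum 1, carry out 1 */
    printf("sum=%u carry=%u\n", s, cout);
    return 0;
}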
Re: Future of CPUs
Same here - I had an i805, which was powerful enough to heat the cooler to 75C - and to get into thermal throttling - before Windows had even booted. I replaced it with a more powerful, dual-core C2D, and it can't even get a cheaper cooler up to 60C.

These P4s were the most power-hungry implementations in Intel's history. I have one lying in the closet, and you can really warm your room in winter with it.
Re: Future of CPUs
Hmm, having read through the majority of posts here, I have but one thing to say:
"Cellular Automata" ...
Anything with 1000's of cores is starting to resemble one, no?
"Cellular Automata" ...
Anything with 1000's of cores is starting to resemble one, no?
Re: Future of CPUs
I can conceive of a use for such insane precision: try a spatial navigation system. Consider that you'd have to be able to track the precise position of a myriad of stellar objects, especially while traveling at near light speed. But I will agree that for most practical purposes, at least for the conceivable future, anything beyond 256-bit precision isn't even scientifically useful.

skyking wrote:
I guess it's more of what's really required that is interesting. If you think of it, which applications really require 1024-bit arithmetic? And I don't count people that use 1024 bits just because they don't understand maths. 2^1024 is a really huge number, and 1024-bit floats means really high precision.

arkain wrote:
I wouldn't say that. As a concept for a computer architecture paper, I designed a completely new adder with 25% fewer circuits and 30% more speed than the standard carry-lookahead adder (neither circuit containing optimizations). This advantage starts at 16-bit and grows as the adder is widened. I'm fairly certain that even this design is not the best that anyone can come up with. So it's not a good idea to assume that what we currently understand to be the best will remain so in the future.

darkestkhan wrote:
You couldn't build a 1024-bit adder with appreciably better performance. Never mind multipliers or dividers - heck, division is already slow. At the scaling rate for division that current x86s manage (and most RISC machines are worse!), that division is gonna take you 1032 cycles. Multiplication will probably be 100.

That you're getting 30% more speed means nothing, I'd guess that you could use that design for 64-bit adders as well so your 1024-bit adder has to compare with the improved 64-bit adder. One could also question whether a divider is worth it at all since it uses valuable silicon space that will almost never be used anyway (a lot of CPUs actually do not include div instructions at all).
As for why one would implement division in hardware, although it's a relatively little-used function for most common applications, anyone doing heavy math (accounting, scientific, banking, data mining, statistics, etc) certainly appreciates the speed increase provided by performing division in silicon rather than software. One of the points to making transistors smaller is so that there is more room on the die for useful functions. It wouldn't hurt if someone came up with better algorithms for multiplication and division either.
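To make the hardware-vs-software point concrete, here is a minimal sketch (my own toy code, not from the thread) of the classic shift-and-subtract, i.e. restoring, division loop that software has to fall back on without a divide unit - roughly one iteration per result bit, so a 64-bit software divide costs on the order of a few hundred instructions.

Code:
#include <stdint.h>

/* Shift-and-subtract (restoring) division: one iteration per quotient bit.
   Assumes d != 0; the remainder is returned through *rem if it is non-NULL. */
uint64_t soft_divu64(uint64_t n, uint64_t d, uint64_t *rem)
{
    uint64_t q = 0, r = 0;

    for (int i = 63; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1);   /* bring down the next dividend bit */
        if (r >= d) {                    /* does the trial subtraction fit?  */
            r -= d;
            q |= (uint64_t)1 << i;
        }
    }
    if (rem)
        *rem = r;
    return q;
}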
Re: Future of CPUs
I know the decimals of pi up to the 45th decimal - that's exactly the amount you need to calculate the circumference of the universe, given the width of the universe, to a precision of electron-thicknesses. That's about 150 bits of precision (rough check below). 256 bits is already stretching it - not to mention that 1024 is way more excessive.

arkain wrote:
I can conceive of a use for such insane precision: try a spatial navigation system. Consider that you'd have to be able to track the precise position of a myriad of stellar objects, especially while traveling at near light speed. But I will agree that for most practical purposes, at least for the conceivable future, anything beyond 256-bit precision isn't even scientifically useful.
As for why one would implement division in hardware, although it's a relatively little-used function for most common applications, anyone doing heavy math (accounting, scientific, banking, data mining, statistics, etc) certainly appreciates the speed increase provided by performing division in silicon rather than software. One of the points to making transistors smaller is so that there is more room on the die for useful functions. It wouldn't hurt if someone came up with better algorithms for multiplication and division either.
You can only hope to use it for pure mathematics - cryptography for instance. But if you go that direction, you're better off with variable length integers (or floats) and variable-length opcodes, or software emulation (GMP for instance).
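A rough back-of-envelope check of that 150-bit figure (my own sketch, not from the thread; the diameter and electron radius are assumed round numbers). Compile with -lm.

Code:
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* assumed round numbers: observable universe ~8.8e26 m across,
       classical electron radius ~2.8e-15 m */
    const double pi       = 3.141592653589793;
    const double diameter = 8.8e26;
    const double electron = 2.8e-15;

    /* circumference of the universe measured in electron radii */
    double steps = pi * diameter / electron;

    printf("steps ~ 1e%.0f, bits needed ~ %.0f\n",
           log10(steps), log2(steps));   /* roughly 1e42 steps, ~140 bits */
    return 0;
}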
Re: Future of CPUs
I knew you'd argue that 256 was a bit much. Now, given that your argument mentions the need for 150-bit precision, and that the trend in CPU precision follows powers of 2, to do that 150-bit calculation you'd still need a 256-bit CPU if you wanted to do it in a minimum of instructions.

Candy wrote:
I know the decimals of pi up to the 45th decimal - that's exactly the amount you need to calculate the circumference of the universe, given the width of the universe, to a precision of electron-thicknesses. That's about 150 bits of precision. 256 bits is already stretching it - not to mention that 1024 is way more excessive.

arkain wrote:
I can conceive of a use for such insane precision: try a spatial navigation system. Consider that you'd have to be able to track the precise position of a myriad of stellar objects, especially while traveling at near light speed. But I will agree that for most practical purposes, at least for the conceivable future, anything beyond 256-bit precision isn't even scientifically useful.
As for why one would implement division in hardware, although it's a relatively little-used function for most common applications, anyone doing heavy math (accounting, scientific, banking, data mining, statistics, etc) certainly appreciates the speed increase provided by performing division in silicon rather than software. One of the points to making transistors smaller is so that there is more room on the die for useful functions. It wouldn't hurt if someone came up with better algorithms for multiplication and division either.
You can only hope to use it for pure mathematics - cryptography for instance. But if you go that direction, you're better off with variable length integers (or floats) and variable-length opcodes, or software emulation (GMP for instance).
In the end, I still agree with you. Higher bit precision isn't going to get us very much, if anything. What we need is better hardware encodings for our instructions so that all instructions can be completed in fewer clock cycles. Frankly, I'd much rather have a 400 MHz CPU where all instructions take 4 clocks than a 4 GHz CPU where every instruction takes 40 clocks. The latter CPU is going to be more prone to high power consumption and to destroying itself with heat.
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
Re: Future of CPUs
...You are aware that most processors complete the vast majority of their instructions in one clock - and modern ones (e.g. x86 from around '95 onwards, ARM Cortex-A9, etc.) complete multiple at a time? Hell, x86 often does instructions in zero clocks.
There are instructions which are slow, but they are often rightfully slow (for example, division); and I don't get how you assume instruction encoding has any bearing on performance.
Re: Future of CPUs
Intel's new 256-bit extension: http://software.intel.com/en-us/avx/

Candy wrote:
I know the decimals of pi up to the 45th decimal - that's exactly the amount you need to calculate the circumference of the universe, given the width of the universe, to a precision of electron-thicknesses. That's about 150 bits of precision. 256 bits is already stretching it - not to mention that 1024 is way more excessive.

arkain wrote:
I can conceive of a use for such insane precision: try a spatial navigation system. Consider that you'd have to be able to track the precise position of a myriad of stellar objects, especially while traveling at near light speed. But I will agree that for most practical purposes, at least for the conceivable future, anything beyond 256-bit precision isn't even scientifically useful.
As for why one would implement division in hardware, although it's a relatively little-used function for most common applications, anyone doing heavy math (accounting, scientific, banking, data mining, statistics, etc) certainly appreciates the speed increase provided by performing division in silicon rather than software. One of the points to making transistors smaller is so that there is more room on the die for useful functions. It wouldn't hurt if someone came up with better algorithms for multiplication and division either.
You can only hope to use it for pure mathematics - cryptography for instance. But if you go that direction, you're better off with variable length integers (or floats) and variable-length opcodes, or software emulation (GMP for instance).
Re: Future of CPUs
Why did Intel create IA-64? Why is it a VLIW architecture with 3 instructions per encoded "instruction word"? Because instruction encoding is not a bottleneck?

Owen wrote:
... I don't get how you assume instruction encoding has any bearing on performance.

I was arguing against the 1024-bit stuff, reasoning that for all but cryptography and pure mathematics, 256 bits is just about always enough for a precise answer. Of course, you also know that Intel added AES instructions specifically for cryptography. I'm guessing we'll see 512-bit SHA3 instructions too, for the same reason.

TylerAnon wrote:
Intel's new 256-bit extension: http://software.intel.com/en-us/avx/
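Since cryptography keeps coming up as the one real consumer of 1024-bit integers, here is a minimal sketch of the software-emulation route mentioned earlier (GMP), doing a 1024-bit modular exponentiation. The values are toy placeholders of my own, not real key material; link with -lgmp.

Code:
#include <gmp.h>
#include <stdio.h>

int main(void)
{
    mpz_t base, exp, mod, result;
    mpz_inits(base, exp, mod, result, NULL);

    /* toy values; real crypto code would use proper key material */
    mpz_set_ui(base, 7);
    mpz_ui_pow_ui(exp, 2, 16);        /* exp = 2^16 */
    mpz_ui_pow_ui(mod, 2, 1024);      /* mod = 2^1024 ...                   */
    mpz_add_ui(mod, mod, 643);        /* ... plus an offset to make it odd  */

    mpz_powm(result, base, exp, mod); /* (base^exp) mod mod, all in software */
    gmp_printf("%Zx\n", result);

    mpz_clears(base, exp, mod, result, NULL);
    return 0;
}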
Re: Future of CPUs
Hi,
As far as I know, the idea was to shift instruction scheduling out of the CPU and into the compiler, to reduce the complexity of the silicon, and maybe get better performance (didn't happen), better power consumption (also didn't happen), or reduced development costs (still didn't happen). In general the idea fails. When Itanium was introduced there wasn't a compiler capable of doing the instruction scheduling well (and I'm not even sure if such a compiler exists now), so performance was worse than the theoretical maximum. The other (long-term) problem is that optimum instruction scheduling is very CPU-dependent - code tuned for one Itanium CPU (with one set of instruction latencies, etc.) isn't tuned for another Itanium CPU (with a different set of instruction latencies, etc.), and therefore even if a perfect compiler capable of generating optimum code existed, the resulting code would run well on one CPU and run poorly on other CPUs. If the CPU does the instruction scheduling (e.g. out-of-order CPUs), then there's much less need to tune code for a specific CPU.

Candy wrote:
Why did Intel create IA-64? Why is it a VLIW architecture with 3 instructions per encoded "instruction word"? Because instruction encoding is not a bottleneck?

Owen wrote:
... I don't get how you assume instruction encoding has any bearing on performance.
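A toy illustration of that static-scheduling point (plain C of my own, not IA-64 code; the unroll factor of four is an arbitrary assumption): the first loop is one long dependency chain, the second exposes four independent chains a compiler can schedule side by side - but how many chains are "right" depends on the latencies of the exact CPU the code was tuned for.

Code:
/* one long dependency chain: each add must wait for the previous one */
double sum_serial(const double *x, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* four independent accumulators the scheduler can interleave */
double sum_unrolled(const double *x, int n)
{
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    int i = 0;
    for (; i + 3 < n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    for (; i < n; i++)      /* leftover elements */
        s0 += x[i];
    return (s0 + s1) + (s2 + s3);
}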
The other way to "fix" the problem is to use something like hyper-threading to hide the problems caused by poor instruction scheduling. Unfortunately most Itaniums didn't support hyper-threading - Tukwila (released in 2010) is the first Itanium to support it.
The main reason Itanium was relatively successful in specific markets (e.g. for "enterprise" class hardware) was that a lot of scalability and fault-tolerance features were built into the chipsets. Basically, Itanium got where it is now despite VLIW (not because of VLIW), although there are a lot of other factors that affected both its success (in some markets) and its lack of success (in all other markets).
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Future of CPUs
I'd guess that's what my (or anybody else's) old grandma would want to use a computer for...

arkain wrote:
I can conceive of a use for such insane precision: try a spatial navigation system. Consider that you'd have to be able to track the precise position of a myriad of stellar objects, especially while traveling at near light speed. But I will agree that for most practical purposes, at least for the conceivable future, anything beyond 256-bit precision isn't even scientifically useful.
The question here is whether the performance improvement from putting this in dedicated hardware would be much bigger than the improvement you'd gain by using the available resources to enhance more frequently used operations (maybe you wouldn't get increased performance at all...).

arkain wrote:
As for why one would implement division in hardware, although it's a relatively little-used function for most common applications, anyone doing heavy math (accounting, scientific, banking, data mining, statistics, etc) certainly appreciates the speed increase provided by performing division in silicon rather than software.
Yes, and make them faster and reduce relative power consumption - but the question is what the best use of the smaller transistor size is. I'm not convinced that you'd get the best general-purpose performance by using the available resources for special-purpose circuitry.

arkain wrote:
One of the points to making transistors smaller is so that there is more room on the die for useful functions.
Re: Future of CPUs
Hi,
A bundle, to be more precise (a bundle is 128 bits long). Brendan pretty much covered everything.

Candy wrote:
Why is it a VLIW architecture with 3 instructions per encoded "instruction word"? Because instruction encoding is not a bottleneck?
HP's C/C++ compilers do a decent job.

Brendan wrote:
When Itanium was introduced there wasn't a compiler capable of doing the instruction scheduling well (and I'm not even sure if such a compiler exists now),
--Thomas
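For reference, a quick compile-time sanity check of the bundle arithmetic mentioned above (the 5-bit template and three 41-bit slots are the standard IA-64 numbers, quoted here from memory - treat them as assumptions to verify against the manual). Needs a C11 compiler.

Code:
#include <assert.h>

/* An IA-64 bundle: a 5-bit template field (which execution-unit types the
   slots need, and where the stops are) plus three 41-bit instruction slots. */
enum { TEMPLATE_BITS = 5, SLOT_BITS = 41, SLOTS_PER_BUNDLE = 3 };

static_assert(TEMPLATE_BITS + SLOTS_PER_BUNDLE * SLOT_BITS == 128,
              "a bundle is 128 bits");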
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
Re: Future of CPUs
But not compared to an x86 branch predictor. As it turns out, the branch predictor knows far more about the execution paths of the code than compilers do.

Thomas wrote:
HP's C/C++ compilers do a decent job.

Brendan wrote:
When Itanium was introduced there wasn't a compiler capable of doing the instruction scheduling well (and I'm not even sure if such a compiler exists now),
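A small sketch of that asymmetry (plain C with a GCC-style hint; the function and data layout are made up for illustration): the best a compiler can do is bake in a one-time static guess, while the hardware predictor adapts to whatever the branch actually does at run time.

Code:
#include <stddef.h>

int count_negatives(const int *v, size_t n)
{
    int hits = 0;
    for (size_t i = 0; i < n; i++) {
        /* static hint: assume negatives are rare -- fixed at compile time,
           right or wrong, for every data set this code ever sees */
        if (__builtin_expect(v[i] < 0, 0))
            hits++;
    }
    return hits;
}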
Re: Future of CPUs
Who would disagree with what? AFAIK the ARM architecture does not include integer division in the instruction set...

berkus wrote:
RISC architectures starting from, perhaps, the Commodore 64 and its MOS 6502 chip and up to TI OMAP and nVidia Tegra 2 boards would disagree.

skyking wrote:
I'm not convinced that you'd get the best general purpose performance by using the available resources for special purpose circuitry.
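For what it's worth, a tiny sketch of what that means in practice (hypothetical function of mine; the exact helper name depends on the toolchain and ABI): on ARM cores without a hardware divider, the compiler can't emit a divide instruction here, so it typically expands the division into a runtime library call such as __aeabi_idiv.

Code:
/* On an ARM core without a hardware divide instruction, this single '/'
   typically becomes a call into a software divide helper (e.g. __aeabi_idiv
   on EABI toolchains), costing tens of cycles instead of one instruction. */
int average(int sum, int count)
{
    return sum / count;   /* lowered to a library call, not a div instruction */
}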