Future of CPUs

Discussions on more advanced topics such as monolithic vs micro-kernels, transactional memory models, and paging vs segmentation should go here. Use this forum to expand and improve the wiki!
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Future of CPUs

Post by Owen »

Death of video cards? Huh?

As for the "specialized CPUs" comment, I have only seen this happen on the highest of high-end NICs and sound cards. These days, the GPU itself is an array of very highly specialized processors.

And more GPUs are shipping now than ever before (They're going into supercomputers - for example, for physics, nothing beats an nVIDIA GPU and CUDA)
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Future of CPUs

Post by Love4Boobies »

Owen wrote:Death of video cards? Huh?
Erm... CPU/GPU packages? As such?
Owen wrote:As for the "specialized CPUs" comment, I have only seen this happen on the highest of high-end NICs and sound cards. These days, the GPU itself is an array of very highly specialized processors.
Who cares where you've seen it? That is the direction in which we are heading - another reason is that SMP isn't very scalable (quite obviously). Few know this, but even the i7 is ccNUMA.
Owen wrote:And more GPUs are shipping now than ever before (They're going into supercomputers - for example, for physics, nothing beats an nVIDIA GPU and CUDA)
I don't have any benchmarks, but I'm not sure about that...
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Future of CPUs

Post by Owen »

Larrabee as a discrete product has been canned. If Larrabee is seen at all, it will be as a replacement for the GMA series.
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Future of CPUs

Post by Love4Boobies »

It was. It's an important step for Intel, and they can't afford to screw it up because it's the thing everyone's talking about. Arrandale (some of the Core i3, Core i5 and Core i7 CPUs) already has an integrated GPU. It's only a question of when. :)
Neolander
Member
Posts: 228
Joined: Tue Mar 23, 2010 3:01 pm
Location: Uppsala, Sweden

Re: Future of CPUs

Post by Neolander »

I don't see this happening soon, and I hope it won't happen except at the lowest end of the GPU market (GMA, you won't be missed).

1/ A graphics card is not just a chipset. That's what laptop manufacturers try to make people think, and everybody can see the horrible gaming performance of laptops. Buses and fast dedicated video memory are an important issue too, and you can't make all of this fit on a single chip, nor can you use the regular bus for video memory (it's already running out of bandwidth these days; that would be performance suicide).
2/ Heat is a major issue, too. Modern higher-end graphics cards generate so much heat that the cooler alone takes up the space of a PCI-e card. Shrinking them to the size of a regular CPU won't help much; it will only make heat harder to dissipate, due to the reduced contact surface with the cooling device.
3/ Graphics cards evolve faster than regular CPUs. If I buy a computer today, I know that I probably won't need to, nor be able to, change its CPU. On the other hand, PCI-e evolves slowly enough that I can envision buying a new graphics card within the machine's lifetime. That's why it's better for the GPU and CPU to be separate parts. Otherwise, people wanting a new GPU will have to buy a new CPU along with it (and a new motherboard, since changing the CPU means changing that nowadays), while people wanting to build a powerful PC will have to buy two GPUs: one integrated into the CPU, and one with serious performance as a separate part.

In my opinion, PC hardware manufacturers are taking consumer screwing-over lessons from Apple with this, and Intel/AMD are trying to squeeze out Nvidia, which has little to no experience in the CPU market. I don't like it.
Last edited by Neolander on Sun Apr 11, 2010 1:22 pm, edited 1 time in total.
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Future of CPUs

Post by Love4Boobies »

Don't argue with the technical aspects when it has already been done. Clearly, everything has been worked out.

As for modularity, yes, it's less modular. You get a lower total price for better performance. How often do you change your CPU and how often do you change your video card today?
Neolander
Member
Posts: 228
Joined: Tue Mar 23, 2010 3:01 pm
Location: Uppsala, Sweden

Re: Future of CPUs

Post by Neolander »

Has it already been done with a serious graphics card? Something that provides at least the power and capabilities of a five-year-old graphics card like the GeForce 7800 GT?
Last edited by Neolander on Sun Apr 11, 2010 1:16 pm, edited 1 time in total.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Future of CPUs

Post by Owen »

Current-generation CPU, maximum memory bus width: 192-bit (Core i7)
Current-generation GPU, maximum memory bus width: 384-bit (nVIDIA GTX480)

The GPU's memory bus is clocked higher, because it doesn't need to cross sockets to reach RAM.

Both of these buses are absolutely full, and the GPU's has over twice the capacity of the CPU's.

Please reconcile with a high performance GPU implemented in the same silicon as the CPU ;)
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Future of CPUs

Post by Love4Boobies »

You are right, it simply cannot be done. That's why Larrabee had a 1024-bit ring bus. :wink:
Owen wrote:Please reconcile with a high performance GPU implemented in the same silicon as the CPU ;)
Okay, but only because your posts are as enlightening as always. :wink:
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Future of CPUs

Post by Owen »

Love4Boobies wrote:You are right, it simply cannot be done. That's why Larrabee had a 1024-bit ring bus. :wink:
Owen wrote:Please reconcile with a high performance GPU implemented in the same silicon as the CPU ;)
Okay, but only because your posts are as enlightening as always. :wink:
I was under the impression you were suggesting the inclusion of GPUs on the same chip as the main processor ;)

Larrabee is another kettle of fish entirely. A highly disappointing kettle of fish.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Future of CPUs

Post by Brendan »

Hi,
Owen wrote:Current generation CPU, maximum memory bus size: 192-bit (Core i7)
Current generation GPU, maximum memory bus size: 384-bit (nVIDIA GTX480)
Most people care about bandwidth, not bus width.

The fastest RAM bandwidth for an AMD/ATI GPU listed on Wikipedia is the Radeon HD 5870 at 153.6 GiB/s. The fastest for an NVidia GPU listed on Wikipedia is the GeForce GTX 480 at 177.4 GiB/s.

For Intel's Nehalem (not Nehalem-EX), RAM bandwidth is currently about 32 GiB/s. That's about the same as an NVidia GeForce GT 330 or an ATI Radeon HD 4670.

But you wouldn't want to use all the RAM bandwidth for video and leave none for the CPU/s. If you use half for the GPU and half for the CPU, then you're left with about 16 GiB/s, which is about the same as an NVidia GeForce GT 220 or an ATI Radeon 9700.
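As a sanity check, the bandwidth figures above fall straight out of bus width times transfer rate. A quick sketch (the transfer rates - 3696 MT/s effective for the GTX 480's GDDR5 and triple-channel DDR3-1333 for Nehalem - are my assumptions, not stated in this thread):

```python
def peak_bandwidth_gb(bus_bits, transfers_per_sec):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfer rate."""
    return bus_bits / 8 * transfers_per_sec / 1e9

# GeForce GTX 480: 384-bit GDDR5 bus at ~3696 MT/s effective
print(round(peak_bandwidth_gb(384, 3696e6), 1))        # ~177.4
# Core i7 (Nehalem): three 64-bit channels of DDR3-1333
print(round(peak_bandwidth_gb(3 * 64, 1333.33e6), 1))  # ~32.0
```

Note how Owen's 384-bit vs 192-bit bus widths, combined with the GPU's higher memory clock, reproduce both numbers.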

However, how good is "good enough"? For servers, nobody cares. For office work, nobody cares. For mobile devices, nobody cares. For "moderate" PC gamers like me it's hard to say - I checked my games machine (which is plenty for every game I've tried) and realised it's using system RAM anyway (a quad-core Phenom II with onboard ATI HD3200).

The "hard core gamers" who actually would have cared all shifted to game consoles like the Xbox years ago. This is mostly because "latest release" PC games suck (there are continual compatibility problems).

Of course I should point out that the fastest bandwidth isn't RAM at all - it's Nehalem's L1/L2/L3 caches. Recently I've been wondering what sort of graphics performance you'd be able to get from highly optimised code on a 4-core/8-thread Nehalem (instead of using a GPU), using different algorithms than GPUs use (to get the most out of cache bandwidth, etc.). If you compare GFLOPS and RAM bandwidth (and ignore CPU cache), Nehalem works out roughly the same as an NVidia GeForce 9600. Next year we'll have AVX ("256-bit SSE"), which should double the CPU's GFLOPS per core.
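Brendan's GFLOPS comparison can be sanity-checked the same way: peak single-precision throughput is cores x SIMD lanes x FLOPs per lane per cycle x clock. A rough sketch (the 2.93 GHz clock and the one-multiply-plus-one-add-per-cycle issue rate are my assumptions):

```python
def peak_sp_gflops(cores, simd_floats, clock_ghz, flops_per_lane_per_cycle=2):
    """Peak single-precision GFLOPS, assuming each SIMD lane can retire
    one multiply and one add per cycle (hence the default of 2)."""
    return cores * simd_floats * flops_per_lane_per_cycle * clock_ghz

# Quad-core Nehalem at ~2.93 GHz with 128-bit SSE (4 floats wide)
sse = peak_sp_gflops(4, 4, 2.93)   # ~93.8 GFLOPS
# 256-bit AVX (8 floats wide) doubles the per-core peak
avx = peak_sp_gflops(4, 8, 2.93)   # ~187.5 GFLOPS
```

This is a theoretical peak, of course; sustained throughput depends on keeping the SIMD units fed from cache, which is exactly Brendan's point about algorithm choice.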


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Future of CPUs

Post by Owen »

Don't forget that modern GPUs have caches (and TLBs!) too ;) They also have built-in real-time texture decompression hardware. (The textures remain compressed in the cache for space efficiency - for DXT1, a 4x4 RGB + mask pixel block is compressed into 64 bits; for DXT3, DXT5 and 3DC (the latter being targeted at normal maps), a 4x4 RGBA block is compressed into 128 bits.)

I haven't seen x86 sprout any DXT1DEC/DXT3DEC/DXT5DEC/3DCDEC instructions yet ;-)

DXT and 3DC are among those things which are incredibly cheap and fast in hardware but relatively expensive in software.

Believe me, GPUs have the same bandwidth issues as CPUs, only at larger scales. Their caches are probably not much larger than the i7's, or even smaller, but they're very targeted. The fragment-scheduling hardware is very good at maximizing cache locality.
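The DXT1 block layout Owen describes can be decoded in a few lines of software, which also illustrates why it's cheap in hardware but comparatively expensive per texel fetch on a CPU. A sketch (function names are mine; the layout follows the S3TC/DXT1 format: two RGB565 base colours plus sixteen 2-bit palette indices per 4x4 block):

```python
import struct

def rgb565(c):
    """Expand a 5:6:5-packed colour to 8-bit-per-channel RGB."""
    return ((c >> 11 & 31) * 255 // 31,
            (c >> 5 & 63) * 255 // 63,
            (c & 31) * 255 // 31)

def decode_dxt1_block(block):
    """Decode one 64-bit DXT1 block into a 4x4 grid of RGB tuples."""
    c0, c1, bits = struct.unpack("<HHI", block)  # 2 base colours + 32 index bits
    a, b = rgb565(c0), rgb565(c1)
    if c0 > c1:   # opaque mode: two interpolated intermediate colours
        pal = [a, b,
               tuple((2 * x + y) // 3 for x, y in zip(a, b)),
               tuple((x + 2 * y) // 3 for x, y in zip(a, b))]
    else:         # 1-bit-alpha mode: one midpoint, index 3 = transparent black
        pal = [a, b, tuple((x + y) // 2 for x, y in zip(a, b)), (0, 0, 0)]
    # 2-bit palette index per texel, row-major, least significant bits first
    return [[pal[bits >> 2 * (4 * r + x) & 3] for x in range(4)]
            for r in range(4)]
```

Sixteen texels in 8 bytes is the compression Owen cites; running the mode test, interpolation, and index extraction above for every fetch is precisely the work the GPU's dedicated decompression hardware makes free.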
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Future of CPUs

Post by Love4Boobies »

Owen wrote:Don't forget that modern GPUs have caches (And TLBs!) too ;) They also have built in real time texture decompression hardware (The textures remain compressed in the cache for space efficiency - For DXT1, a 4x4 RGB + Mask pixel block is compressed into 64-bits. For DXT3, 5 and 3DC (The latter being targeted at normal maps) a 4x4 RGBA block is compressed into 128 bits).
Maybe someday they will add caches and TLBs to CPUs as well, eh?
Owen wrote:I haven't seen x86 sprout any DXT1DEC/DXT3DEC/DXT5DEC/3DCDEC instructions yet
Of course you haven't. They have only had CPU cores. I'm not sure what part of this conversation you don't understand. :wink:
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Future of CPUs

Post by Owen »

Love4Boobies wrote:Of course you haven't. They have only had CPU cores. I'm not sure what part of this conversation you don't understand.
I was commenting on Brendan's post. In particular, the point about algorithms to use Nehalem's cache.
Benk
Member
Posts: 62
Joined: Thu Apr 30, 2009 6:08 am

Re: Future of CPUs

Post by Benk »

64-bit memory is enough for addressing data. Modern CPUs already do 128-bit (and, for some, 256-bit) operations in MMX/SSE instructions; e.g., the 128-bit XMM registers can be used for the types of tasks that need them.

Predictions (in other words, guesses):

A modern quad-socket board can run 48 concurrent threads (4 sockets * 6 cores * 2 with HT). In the near future core counts will go past 100, and in 10 years we could be looking at 1000-core desktops.

TLB and cache pollution are a critical problem.

Removing concurrency from the IPC design will become more important and will allow average programs to use high core counts. This suggests async IPC, a new runtime library, and perhaps no user threads at all.

OS X, Linux and NT will not handle this well, nor can they easily adapt (Intel can), which will make it interesting. Witness the cost of the minor multi-core improvements in Windows 7. A new OS designed for these things could gain a 10-fold advantage for average apps. Some languages like OCaml will blow past C on those systems, at least for a similar amount of effort, and much of the custom high-speed work built on user threading will need to be redone.

MMUs and branch prediction will start to disappear after 10 years, replaced by ~35% higher core counts and more advanced software memory management (which uses virtual addresses and patches code when needed).

Mobile CPUs and smaller will remain 32-bit.