suthers wrote:
I was wondering: does a 3.6GHz single-core processor really execute 3600000000000 instructions a second?
You need to read up on metric prefixes.
Hz = 1-1000 per second
kHz = 1000-1000000 per second
MHz = 1000000-1000000000 per second
GHz = 1000000000-1000000000000 per second
Your 3.6GHz processor executes 3600000000 cycles per second.
Depending on its architecture and its internal layout, it averages about 0.8-1.2 instructions per cycle. Its actual performance is then between roughly 2880000000 and 4320000000 instructions per second. Yes, that's damn fast.
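To make the arithmetic concrete, here is a small Python sketch. The function names are made up for illustration, and the 0.8-1.2 IPC range is just the rough average quoted above, not a measured figure.

```python
# Metric prefixes as powers of ten.
PREFIX = {"": 1, "k": 10**3, "M": 10**6, "G": 10**9}

def cycles_per_second(value, prefix):
    """Convert e.g. (3.6, "G") into a plain cycles-per-second count."""
    return round(value * PREFIX[prefix])

def instructions_per_second(clock_hz, ipc):
    """Average throughput = clock rate x average instructions per cycle."""
    return round(clock_hz * ipc)

clock = cycles_per_second(3.6, "G")
low = instructions_per_second(clock, 0.8)
high = instructions_per_second(clock, 1.2)
print(clock, low, high)
```

So the clock rate only gives you cycles; you need the IPC to turn that into instructions.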
Special architectures can outperform that significantly. Take for instance Itanium, which can execute up to 2 bundles per cycle, with 3 instructions per bundle. That gives it a peak of 6 instructions per cycle (IPC). It's only available at lower clock speeds, though, and the compiler has to work out how to parallelize the instructions.
On the other hand, some architectures are meant to be low-power and reuse circuitry for different operations or different parts of a cycle. The PIC series, for example, is hardwired to 0.25 IPC, since every instruction (with one exception) takes 4 cycles, without being scalar.
Note that individual instructions can still take longer than one cycle, but the average time per instruction can be decreased by making the processor work on 2 things at the same time. That's what I mean by making a design "scalar" (the usual term is pipelining): it starts the next instruction before this one is done. A highly scalar design is, for example, the Pentium 4, with (iirc) 24 steps in its pipeline.
Newer processors can also be superscalar, which means they have multiple pipelines that can each accept one or more instructions per cycle. This is how you can get an IPC of more than one. Of course, the processor needs to check, verify and be entirely certain that an instruction can be executed, but those things are very doable (given a few million transistors to figure it out).
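A rough way to see the effect of overlapping instructions and of multiple pipelines is to count cycles in an idealized model. This sketch assumes no stalls and no dependencies between instructions, which real code never gives you; the function names and the `width` parameter are only for illustration.

```python
import math

def cycles_unpipelined(n_instr, stages):
    """Each instruction runs all its steps before the next one starts."""
    return n_instr * stages

def cycles_pipelined(n_instr, stages, width=1):
    """Ideal overlap: fill the pipeline once, then finish `width`
    instructions per cycle (width > 1 models a superscalar design)."""
    if n_instr == 0:
        return 0
    return stages + math.ceil(n_instr / width) - 1

n = 1000
for label, cycles in [
    ("no overlap, 4 steps", cycles_unpipelined(n, 4)),
    ("overlapped, 4 steps", cycles_pipelined(n, 4)),
    ("overlapped, 2 wide", cycles_pipelined(n, 4, width=2)),
]:
    print(label, cycles, "cycles, IPC", round(n / cycles, 3))
```

In this model the non-overlapped 4-step design lands at 0.25 IPC, the overlapped one gets close to 1, and only the 2-wide one goes above 1.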
Also, in brute output, how fast a single-core processor would you need (approx.) to outperform a 2GHz dual-core proc?
Depends a whole lot on the processors themselves, their design, what you measure to say one "outperforms" the other, and whether the task is highly parallel, serial, dependent, limited by other means, or anything else.
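For a feel of how much the "highly parallel or serial" part matters, here is a sketch using Amdahl's law. It assumes both processors have the same design and IPC and that performance scales linearly with clock speed, none of which is guaranteed in practice. `parallel_fraction` is the share of the work that can be split across cores, and the function names are hypothetical.

```python
def multicore_speedup(parallel_fraction, cores=2):
    """Amdahl's law: speedup over one core of the same design and clock."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

def equivalent_single_core_ghz(dual_core_ghz, parallel_fraction):
    """Clock a single core would need to match the dual core,
    assuming performance scales linearly with clock speed."""
    return dual_core_ghz * multicore_speedup(parallel_fraction)

for p in (0.0, 0.5, 0.9, 1.0):
    print(p, round(equivalent_single_core_ghz(2.0, p), 2), "GHz")
```

Under those assumptions the answer ranges from 2GHz (nothing can run in parallel) to 4GHz (everything can), which is why there is no single number.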
And how does Hyper-Threading work on a single core? I heard that it works by letting the ALU do two things at once: for example, if one instruction was an add and the next was a div, it would use the two parts of the ALU simultaneously.
Read the above bit on superscalarity. Intel found out that they had a pretty low IPC because of pipeline stalls, so that most of the time most units were doing nothing. Now, if you add a separate stream of instructions that's completely independent of the first (a second thread, for instance), you could use them too and get higher performance. Being as optimistic as they always are, they estimated it could double your performance. It's not that good, but it'll still improve your performance if not running the exact same program on both (and even then it could).
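A toy model of that idea: each thread is a list of steps, `True` for "has an instruction ready" and `False` for "stalled this cycle". The core here has a single issue slot, which is a big simplification of a real P4, and the scheduling rule (alternate which thread gets first pick) is an assumption made up for this sketch.

```python
def run_single(thread):
    """One thread alone: every step costs a cycle, stalls waste theirs."""
    issued = sum(1 for ready in thread if ready)
    return issued, len(thread)

def run_two_threads(thread_a, thread_b):
    """Two threads share one issue slot. A stalled thread just waits out
    its stall; a ready thread issues if the slot is still free."""
    threads = [list(thread_a), list(thread_b)]
    pos = [0, 0]
    cycles = issued = 0
    prefer = 0  # which thread gets first pick this cycle
    while pos[0] < len(threads[0]) or pos[1] < len(threads[1]):
        cycles += 1
        slot_used = False
        for t in (prefer, 1 - prefer):
            if pos[t] >= len(threads[t]):
                continue  # this thread is finished
            if not threads[t][pos[t]]:
                pos[t] += 1  # stall cycle elapses
            elif not slot_used:
                pos[t] += 1  # issue one instruction
                issued += 1
                slot_used = True
            # otherwise: ready, but the slot is taken, so retry next cycle
        prefer = 1 - prefer
    return issued, cycles

work = [True, False, False, True]  # 2 instructions, 2 stall cycles
print(run_single(work))            # one thread alone
print(run_two_threads(work, work)) # two such threads sharing the core
```

Run back to back, the two threads would need 8 cycles for 4 instructions; sharing the core, this model finishes them in 5. Better, but not double.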
On the P4 HT stuff: the P4 ALU is double-clocked, meaning it can do two instructions each cycle. It also has two ALUs, so it can do 4 ALU-type instructions per cycle in any case.
does this mean that HT is better than dual core?
It's worse. HT is duplicating just the thread state. Dual core is duplicating the entire processor, except for the biggest cache, and putting them in the same packaging. Some early dual core processors were just two processors shoved into one package with a tiny bit of extra logic to make them appear as dual core rather than two processors.
Finally, how does Hyper transport work, and is there a similar thing in AMD procs?
How HT works: see above.
AMD processors don't do HT, they do pure dual core. They duplicate all the logic, share the caches and share the connectivity to other processors. With a bit of smart logic, knowing that each AMD64 CPU core contains a memory bus interface and three hypertransport connections, you can figure out how to make it work. Connect a hypertransport from one to the other, connect 3 of the leftover hypertransports to the package, make a memory controller that can serve two CPUs independently and leave the last hypertransport line dangling. Then, convince management to create AM2 marketing and finally use the dangling hypertransport connection.
Note: somebody mixed up hypertransport and hyperthreading. When talking about Intel, it's hyperthreading; when talking about AMD or VIA, it's hypertransport.
- Bad multicore support by the operating system may slow the system down
Ignorant Intel engineers cause bad operating system performance. When you use two cores on one cache, don't make it two-way.