OSDev.org

Posted: **Sat Sep 16, 2006 11:54 am**

I was wondering: does a 3.6GHz single-core processor really execute 3600000000000 instructions a second?
Also, in raw output, how fast a single-core processor would you need (approx.) to outperform a 2GHz dual-core processor?
And how does Hyper-Threading work on a single core? I heard that it works by letting the ALU, for example, if one instruction was an add and the next was a div, use the two parts of the ALU simultaneously.
Does this mean that HT is better than dual core?
Finally, how does HyperTransport work, and is it a similar thing in AMD processors?
Thanks in advance,
Jules

Posted: **Sat Sep 16, 2006 12:06 pm**

I was wondering: does a 3.6GHz single-core processor really execute 3600000000000 instructions a second?

No. The clock rate counts cycles per second, not instructions, and each instruction generally takes longer than a single cycle to execute (anywhere between 1 and a hundred or so, depending on the particular instruction and processor).

Also, in raw output, how fast a single-core processor would you need (approx.) to outperform a 2GHz dual-core processor?

That's incredibly dependent on what you're actually running on the processor.

And how does Hyper-Threading work on a single core? I heard that it works by letting the ALU, for example, if one instruction was an add and the next was a div, use the two parts of the ALU simultaneously.

AFAIK, in Hyper-Threading some parts of the CPU are duplicated, and the rest of the effect is pretty much what you mentioned: parts not being used for instructions on one "core" are used for instructions on the other "core".

Does this mean that HT is better than dual core?

HT only pretends to have two cores; dual core actually does have two, so it should perform better.

Posted: **Sat Sep 16, 2006 4:14 pm**

This is off topic... Is the AMD Athlon 64 X2 Dual-Core, model number 5000+, the best desktop processor AMD has?

Posted: **Sat Sep 16, 2006 4:41 pm**

Heh, no idea, but if it is then it won't be within a month...

Posted: **Sun Sep 17, 2006 5:56 am**

And what about HyperTransport? And what's the instruction which takes the most cycles on a Pentium D?
Thanks in advance,
Jules

Posted: **Sun Sep 17, 2006 6:46 am**

Finally, how does HyperTransport work

Why don't you just google for it? There's also a decent Wikipedia article about the topic.

What's the instruction which takes the most cycles on a Pentium D?

You can't say that in general, as the performance of any instruction depends on a lot of criteria. Calculating the absolute speed of a processor with something like CpuClockRate/AverageInstructionLength doesn't work and won't even give you a rough idea of its actual performance. My suggestion is that you just have a look at a couple of real-world benchmarks to get an idea of how fast a certain processor really is in everyday use.

Also, in raw output, how fast a single-core processor would you need (approx.) to outperform a 2GHz dual-core processor?

- A single-threaded application can only use one of the cores. If a dual-core processor runs nothing but such an application, the maximum performance won't be any better than on a regular unicore processor with the same clock rate.
- Several single-threaded applications can together use both processor cores. The combined performance for all applications is thus roughly the same as on a unicore system that is twice as fast.
- Multi-threaded applications can also use both cores at the same time. Depending on the level of parallelism, however, they might not be able to keep both cores busy all the time.
- Multicore systems have a small overhead compared to unicore systems, as more synchronisation is needed.
- Bad multicore support by the operating system may slow the system down.
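
To put rough numbers on the points above, here is a minimal Python sketch based on Amdahl's law. The thread does not mention it, but it is the usual way to model this question. It assumes a fraction `p` of the work can be spread evenly over both cores, that the single core is of the same design, and that the synchronisation and OS overhead from the last two points is zero. The function names are made up for illustration.

```python
def amdahl_speedup(parallel_fraction, cores=2):
    """Ideal speedup over one core when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def equivalent_single_core_ghz(multi_core_ghz, parallel_fraction, cores=2):
    """Clock rate one core of the same design would need to keep up."""
    return multi_core_ghz * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # Sweep from "one single-threaded program" to "perfectly parallel".
    for p in (0.0, 0.5, 0.9, 1.0):
        ghz = equivalent_single_core_ghz(2.0, p)
        print(f"parallel fraction {p:.1f}: about {ghz:.2f} GHz single core")
```

With `p = 0` (one single-threaded program) a 2GHz single core keeps up; with `p = 1` it would take 4GHz. Real workloads land somewhere in between, and the overhead ignored here pushes the figure down a little.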

regards,
gaf

Posted: **Sun Sep 17, 2006 9:29 am**

suthers wrote: I was wondering: does a 3.6GHz single-core processor really execute 3600000000000 instructions a second?

You need to read up on metric prefixes.

Hz = 1-1000 per second
kHz = 1000-1000000 per second
MHz = 1000000-1000000000 per second
GHz = 1000000000-1000000000000 per second

Your 3.6GHz processor executes 3600000000 cycles per second.
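
As a quick check of the prefix arithmetic, here is a small Python sketch that turns a string like `3.6GHz` into cycles per second. The function name and the accepted input format are assumptions made for this example, not anything standard.

```python
import re
from decimal import Decimal

# Multipliers for the prefixes listed above. Upper-case "K" is accepted because
# it is often written that way, although the correct symbol is lower-case "k".
MULTIPLIERS = {"": 1, "K": 10**3, "M": 10**6, "G": 10**9}


def cycles_per_second(text):
    """Convert a string such as '3.6GHz' into an integer number of cycles per second."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([kKMG]?)Hz\s*", text)
    if match is None:
        raise ValueError(f"not a frequency: {text!r}")
    value, prefix = match.groups()
    # Decimal keeps 3.6 exact, so there is no floating-point noise in the result.
    return int(Decimal(value) * MULTIPLIERS[prefix.upper()])


if __name__ == "__main__":
    print(cycles_per_second("3.6GHz"))
```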

Depending on its architecture and its internal layout, it does about 0.8-1.2 instructions per cycle on average. Its actual performance is then between 2880000000 and 4320000000 instructions per second. Yes, that's damn fast.
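
The estimate is just clock rate times average IPC. A minimal Python sketch of that multiplication, using the 3.6GHz clock and the 0.8-1.2 IPC range from this post (the function name is made up for illustration), comes out at roughly 2.9 to 4.3 billion instructions per second:

```python
def instructions_per_second(clock_hz, ipc):
    """Average throughput: cycles per second times instructions per cycle."""
    return round(clock_hz * ipc)


CLOCK_HZ = 3_600_000_000  # 3.6GHz

low = instructions_per_second(CLOCK_HZ, 0.8)
high = instructions_per_second(CLOCK_HZ, 1.2)
print(f"{low:,} to {high:,} instructions per second")
```

The same function covers the other designs mentioned in this post: plug in an IPC of 0.25 for a PIC, or a peak of 6 for Itanium, along with their own clock rates.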

Special architectures can outperform that significantly. Take Itanium, for instance, which can execute up to 2 bundles per cycle, with 3 instructions per bundle. That gives it a peak of 6 instructions per cycle (IPC). It's only available at lower clock speeds, though, and the compiler needs to work out how to parallelize the instructions.

On the other hand, some architectures are meant to be low-power and reuse circuitry for different operations or different parts of a cycle. The PIC series, for example, is fixed at 0.25 IPC, since every instruction (with one exception) takes 4 cycles, without any deep pipelining.

Note that individual instructions can still take longer than that, but the time per instruction can be decreased by making the processor do 2 things at the same time. That's called pipelining a design: it starts the next instruction before the current one is done. The Pentium 4 is an example of a deeply pipelined processor, with 20 stages in its pipeline on the early models and 31 on Prescott.
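
To see why overlapping instructions helps, here is an idealized cycle count in Python. It assumes every stage takes one cycle and that nothing ever stalls, which real processors do not manage, so treat it as an upper bound on the benefit. The function names are made up for illustration.

```python
def unpipelined_cycles(instructions, stages):
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions, stages):
    """A new instruction enters every cycle once the pipeline has filled."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


if __name__ == "__main__":
    n, k = 1000, 20
    print(unpipelined_cycles(n, k), "cycles without overlap")
    print(pipelined_cycles(n, k), "cycles with a full pipeline")
```

Each instruction still takes `stages` cycles from start to finish, but in the ideal case one instruction completes per cycle, so IPC approaches 1 rather than 1/`stages`.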

Newer processors can also be superscalar, which means that they have multiple pipelines that can each accept one or more instructions per cycle. This is how you can get an IPC of more than one. Of course, the processor needs to check, verify and be entirely certain that an instruction can be executed, but those things are very doable (given a few million transistors to figure it out).

Also, in raw output, how fast a single-core processor would you need (approx.) to outperform a 2GHz dual-core processor?

Depends a whole lot on the processor itself, the design, what you're using to say it "outperforms" the other, and whether the task is highly parallel, serial, dependent, limited by other means, or anything else.

And how does Hyper-Threading work on a single core? I heard that it works by letting the ALU, for example, if one instruction was an add and the next was a div, use the two parts of the ALU simultaneously.

Read the above bit on superscalarity. Intel found out that they had a pretty low IPC because of pipeline stalls, so most of the time most units were doing nothing. Now, if you add a separate stream of instructions that's completely independent of the first (a second thread, for instance), you can use those units too and get higher performance. Being as optimistic as they always are, they estimated it could double your performance. It's not that good, but it'll still improve your performance if you're not running the exact same program on both (and even then it could).
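
The stall-filling idea can be shown with a toy model in Python. This is a deliberately crude sketch and not how the P4 actually schedules work: it assumes a single shared execution unit, and each thread is a string where `X` is a cycle that needs the unit and `S` is a stall cycle (waiting on memory, say) during which the unit is free. The encoding and the function name are invented for this example.

```python
def smt_cycles(threads):
    """Cycles needed to finish all threads when they share one execution unit."""
    pos = [0] * len(threads)
    cycles = 0
    next_pick = 0  # round-robin starting point so no thread is starved
    while any(p < len(t) for p, t in zip(pos, threads)):
        cycles += 1
        # Snapshot what each unfinished thread needs this cycle.
        wants = [t[p] if p < len(t) else None for p, t in zip(pos, threads)]
        # Stalled threads simply wait; their stall cycle elapses regardless.
        for i, want in enumerate(wants):
            if want == "S":
                pos[i] += 1
        # At most one ready thread gets the execution unit this cycle.
        for k in range(len(threads)):
            i = (next_pick + k) % len(threads)
            if wants[i] == "X":
                pos[i] += 1
                next_pick = (i + 1) % len(threads)
                break
    return cycles


if __name__ == "__main__":
    stally = "XSSXSS"
    one_after_the_other = smt_cycles([stally]) + smt_cycles([stally])
    print("back to back:", one_after_the_other, "cycles")
    print("sharing the unit:", smt_cycles([stally, stally]), "cycles")
```

In this model two stall-heavy threads finish in 7 cycles when they share the unit, against 12 when run one after the other, while two threads with no stalls at all (`"XXXX"`) gain nothing. That matches the point above: the benefit comes from units that would otherwise sit idle.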

On the P4 HT stuff: the P4 ALU is double-clocked, meaning it can do two instructions each cycle. It also has two ALUs, so it can do 4 ALU-type instructions per cycle in any case.

Does this mean that HT is better than dual core?

It's worse. HT is duplicating just the thread state. Dual core is duplicating the entire processor, except for the biggest cache, and putting them in the same packaging. Some early dual core processors were just two processors shoved into one package with a tiny bit of extra logic to make them appear as dual core rather than two processors.

Finally, how does HyperTransport work, and is it a similar thing in AMD processors?

How HT works: see above.

AMD processors don't do HT; they do pure dual core. They duplicate all the logic, share the caches and share the connectivity to other processors. With a bit of smart logic, knowing that each AMD64 CPU core contains a memory bus interface and three HyperTransport connections, you can figure out how to make it work. Connect a HyperTransport link from one core to the other, connect 3 of the leftover HyperTransport links to the package, make a memory controller that can serve two CPUs independently, and leave the last HyperTransport link dangling. Then, convince management to create AM2 marketing and finally use the dangling HyperTransport connection.

Note: somebody mixed up HyperTransport and Hyper-Threading. When talking about Intel, it's Hyper-Threading; when talking about AMD or Via, it's HyperTransport.

- Bad multicore support by the operating system may slow the system down

Ignorant Intel engineers cause bad operating system performance. When you use two cores on one cache, don't make it two-way.

Posted: **Sun Sep 17, 2006 10:18 am**

A question from me now, lol. What is the actual difference with AM2? Other than the different socket it appears to be identical to the 939 X2s.

Posted: **Sun Sep 17, 2006 11:23 am**

Kemp wrote: A question from me now, lol. What is the actual difference with AM2? Other than the different socket it appears to be identical to the 939 X2s.

As far as I know, HT3 and the fourth HyperTransport link. It might also have more power pins to supply more power-hungry CPUs, and I think it was done (or will be done) in a similar way to Intel's LGA775, as opposed to the pins approach.

Posted: **Sun Sep 17, 2006 11:25 am**

one costs more ;D

http://www.anandtech.com/cpuchipsets/sh ... spx?i=2688

Posted: **Sun Sep 17, 2006 2:35 pm**

I also read somewhere that, at the moment, the various CPUs available for AM2 are just the same processors with the modifications needed to make them work in the socket. This may be incorrect, and it is certainly a couple of months old, so I can't vouch for its accuracy.

Posted: **Mon Sep 18, 2006 9:33 am**

How much extra efficiency does Core 2 Duo give compared to dual core?

Posted: **Mon Sep 18, 2006 9:49 am**

Core 2 Duo is dual core...

Posted: **Mon Sep 18, 2006 10:08 am**

Back to the car analogy for simplicity.

There are Ferraris (say, Intel Core 2 stuff), Porsches (AMD Hammer stuff), Formula 1 cars (P4s: they rev really high but don't go as fast as some road cars), trucks (Itanium: low rev, high throughput), road trains (Sun Niagara: 32 things at once) and small cars (Via C3). Each of these has its good points and its bad points. If I were to go to the shop, I wouldn't take a Ferrari, Porsche or Formula 1 car, since you can't properly park them. If I were to go to a race circuit, depending on which side of it I'd be on, I'd either bring my favorite Ferrari, Porsche or Formula 1 car, or I'd bring a van of people along. When I'm looking for a way to transport my whole household to a new place, I'm going for a truck.

There is no better or worse. For any given application or work load (in a double sense) one or the other can be better or worse. You can't say that a Ferrari is 300% better or worse than a family car, so you can't say whether a Core 2 is going to be better than a random dual-core processor.

Posted: **Mon Sep 18, 2006 11:28 am**

Kemp wrote: What is the actual difference with AM2? Other than the different socket it appears to be identical to the 939 X2s.

I think another point with AM2 is the support of DDR2 memory.

processor information
