Is there a real instruction counter on modern x86 CPUs?

Post by Artlav »

After some headbanging and a bit of late googling, I discovered that on modern x86 CPUs the RDTSC instruction no longer counts actual core clock cycles, and is instead normalized to a fixed reference frequency.

This means that as the CPU clock gets changed by nice things like Turbo Boost, the counter keeps ticking at the same rate.
It no longer measures how many cycles the CPU ran between two invocations, but is now just a clock.

And a clock is not much use when you try to benchmark a bit of code on an x86 tablet whose frequency can vary between 1.1 GHz and 2.4 GHz with no way of nailing it down.

So the question is, is there some new instruction, MSR, or something that holds an actual instruction count?

Re: Is there a real instruction counter on modern x86 CPUs?

Post by Octocontrabass »

It sounds like you want RDPMC, with a performance counter configured to count retired instructions.

I haven't used performance counters before, so I can't do much besides directing you to the Intel manual, volume 3B, chapters 18 and 19.

Re: Is there a real instruction counter on modern x86 CPUs?

Post by Artlav »

Yep, that worked. Thanks.

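So, the sequence is:
- Write 0xB2 to MSR 0x38D (IA32_FIXED_CTR_CTRL) to enable the instruction counter.
- Set ECX to 0x40000000 (selects which counter to read: bit 30 picks the fixed-function counters, and index 0 is the retired-instruction counter).
- Execute RDPMC.
- The result is in EDX:EAX.
To disable the counter, write 0 to the same MSR.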
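
For anyone who wants the read side of the sequence above in C, here is a rough sketch using GCC-style inline assembly. The helper names (`rdpmc_fixed_selector`, `combine_edx_eax`, `rdpmc_read`) are made up for illustration, and the MSR write itself is not shown because it needs WRMSR in ring 0. Treat it as a starting point under those assumptions, not a tested driver.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the sequence above. */
#define MSR_IA32_FIXED_CTR_CTRL 0x38Du      /* fixed-function counter control MSR */
#define RDPMC_FIXED_FLAG        0x40000000u /* ECX bit 30: read a fixed-function counter */

/* Hypothetical helper: ECX selector for fixed-function counter `index`.
 * Index 0 yields 0x40000000, the value used above. */
static inline uint32_t rdpmc_fixed_selector(uint32_t index)
{
    assert(index < RDPMC_FIXED_FLAG); /* index must not collide with the flag bit */
    return RDPMC_FIXED_FLAG | index;
}

/* Hypothetical helper: join the EDX:EAX halves returned by RDPMC. */
static inline uint64_t combine_edx_eax(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Executes RDPMC with the given selector in ECX.
 * This faults unless the caller runs in ring 0 or CR4.PCE is set. */
static inline uint64_t rdpmc_read(uint32_t selector)
{
    uint32_t eax, edx;
    __asm__ volatile("rdpmc" : "=a"(eax), "=d"(edx) : "c"(selector));
    return combine_edx_eax(edx, eax);
}
#endif
```

Reading `rdpmc_read(rdpmc_fixed_selector(0))` before and after the code under test and subtracting the two values should give the instructions retired in between, assuming the counter was enabled first.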

The instruction will raise a GPF in userspace by default; you need to set the PCE bit in CR4 (bit 8) to allow it to be used at lower privilege levels.
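
A possible way to flip that bit from kernel code, again as a sketch with GCC-style inline assembly on x86-64. All function names here are hypothetical, and the `assert.h` self-check only makes sense in a hosted build (drop it in a freestanding kernel). Note that CR4 is per-core, so this would have to run on every CPU that is meant to allow userspace RDPMC.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_PCE (1ull << 8) /* CR4 bit 8: allow RDPMC at lower privilege levels */

/* Hypothetical pure helpers, so the bit manipulation can be checked
 * without touching the real CR4. */
static inline uint64_t cr4_with_pce(uint64_t cr4)    { return cr4 | CR4_PCE; }
static inline uint64_t cr4_without_pce(uint64_t cr4) { return cr4 & ~CR4_PCE; }

/* Hypothetical self-check of the helpers; safe to call from any privilege level. */
static inline void cr4_pce_self_check(void)
{
    assert(cr4_with_pce(0) == CR4_PCE);
    assert(cr4_without_pce(CR4_PCE) == 0);
}

#if defined(__GNUC__) && defined(__x86_64__)
/* Ring 0 only: read CR4, set PCE, write it back on the current core. */
static inline void enable_user_rdpmc(void)
{
    uint64_t cr4;
    __asm__ volatile("mov %%cr4, %0" : "=r"(cr4));
    __asm__ volatile("mov %0, %%cr4" : : "r"(cr4_with_pce(cr4)) : "memory");
}
#endif
```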