
Re:Getting CPU speed

Posted: Thu Jan 26, 2006 4:08 am
by Solar
srg wrote: One thing I didn't understand from the BogoMips FAQ was the algorithm to actually calculate it. It points to the Linux Kernel code but I've always found the linux kernel code to be about as clear as mud.
The FAQ mentioned a standalone version at ftp://sunsite.unc.edu/pub/Linux/system/status/bogo-1.2.tar.gz. After pruning it for the bare minimum, this is what's left over:

Code:

#include <stdio.h>
#include <time.h>

static void delay(int loops)
{
  long i;
  for (i = loops; i >= 0 ; i--)
    ;
}

int main(void)
{
  unsigned long loops_per_sec = 1;
  unsigned long ticks;
  
  printf("Calibrating delay loop.. ");
  fflush(stdout);
  
  while ((loops_per_sec <<= 1)) {
    ticks = clock();
    delay(loops_per_sec);
    ticks = clock() - ticks;
    if (ticks >= CLOCKS_PER_SEC) {
      loops_per_sec = (loops_per_sec / ticks) * CLOCKS_PER_SEC;
      printf("ok - %lu.%02lu BogoMips\n",
         loops_per_sec/500000,
        (loops_per_sec/5000) % 100
        );
      return 0;
    }
  }
  printf("failed\n");
  return -1;
}
Easy enough, I think. ;-)

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 4:13 am
by Pype.Clicker
NAME
clock - Determine processor time

SYNOPSIS
#include <time.h>

clock_t clock(void);

DESCRIPTION
The clock() function returns an approximation of processor time used by
the program.
CONFORMING TO
ANSI C. POSIX requires that CLOCKS_PER_SEC equals 1000000 independent
of the actual resolution.
For those who wonder (I hadn't heard of that function in the standard library so far :P).
Running it in usermode gives 258-260 BogoMips on my system (1 GHz), but I guess that's just a natural artefact ...
The way "loops_per_second" is doubled at every attempt seems odd to me. Meditate on this, i will ...

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 4:14 am
by Candy
Solar wrote: Easy enough, I think. ;-)
When you include it, compile this file without optimization. You wouldn't believe how well an idle loop is optimized :)

(yep, gcc once optimized away my "memory zero loop" since all it did was write zeroes to memory that was "zero" in the first place -> but not in osdev...)

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 4:38 am
by Pype.Clicker
Pype.Clicker wrote: The way "loops_per_second" is doubled at every attempt seems odd to me. Meditate on this, i will ...
Okay. Actually, the system only needs to detect a number of loops that takes >1 sec, and then it scales the number of loops according to the elapsed time. E.g. if it took 1.25 seconds to run N loops, the actual value of "loops_per_sec" will be scaled to N/1.25 ...

Yet, in an OS environment, what implementation of [tt]clock()[/tt] shall we use? RDTSC gives no clue about seconds ... use the PIT instead (polling mode, not interrupt-based, I mean)?

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 4:40 am
by Solar
Pype.Clicker wrote: The way "loops_per_second" is doubled at every attempt seems odd to me. Meditate on this, i will ...
Why odd? You start with two loops and check if they take longer than one second. Then you double (4, 8, 16, 32, and so on) until you have the first value of [tt]loops_per_second[/tt] in the series that takes longer than a second. That means the final attempt takes between one and two seconds, and all the shorter attempts before it add up to roughly as much again.

Divide that value (number of loops) by the number of clock() ticks it took, and multiply by CLOCKS_PER_SEC to get the number of loops you'd have to do to waste exactly one second. You just calculated your BogoMips.

That way, a slow machine won't spend ages in the calibration loop. ;-)

Of course, in kernel space you'd have to worry about where to get clock() ticks and the CLOCKS_PER_SEC value. That's where you pick up the Intel manual and read about the CPU's time stamp counter MSR and how to read it. ;)

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 5:10 am
by Pype.Clicker
Solar wrote: That way, a slow machine won't spend ages in the calibration loop. ;-)
True. I ran the test with stops at >=CLOCKS_PER_SEC, 2*CPS and 4*CPS, and the value it computed for "loops_per_sec" was exactly the same: 132,000,000 loops on my 1 GHz ... suggesting that a higher threshold wouldn't have improved the precision either ...
Of course, in kernel space you'd have to worry about where to get clock() ticks and the CLOCKS_PER_SEC value. That's where you pick up the Intel manual and read about the CPU#s MSR time stamp counter and how to read it. ;)
I feared you'd say that ... yet (section 15.7 of the Intel manuals, for those who wonder):
Following reset, the (time stamp) counter is incremented every processor clock cycle, even when the processor is halted by the HLT instruction or the external STPCLK# pin. However, the assertion of the external DPSLP# pin may cause the time-stamp counter to stop and Intel SpeedStep(R) technology transitions may cause the frequency at which the time-stamp counter increments to change in accordance with the processor's internal clock frequency.
So there's no relation between that and seconds ... except through the CPU speed. A similar problem comes with the local APIC, which can provide high-resolution timers using the front side bus frequency (but unless you know that frequency, you cannot tell how many ticks the local APIC will need to wait for e.g. 50 µs ...).

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 5:40 am
by Solar
But you can reset the time stamp counter, set up a timer interrupt, and read the time stamp counter to see how often that one "ticks" per second.

Of course, at that point you have the "CPU speed", but as the information of how many MHz you have isn't useful for the OS (see above) and you want to know how many delay loops per second the CPU does, you'd still have to calculate your BogoMips. ;-)

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 3:34 pm
by srg
errrr....

So would this code require a CPU with a time stamp counter, or is that just the clock() function?

What about on a 386?

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 4:21 pm
by Solar
The clock() function needs some non-locking way to measure passing time. Dunno what's available or not on the 386.

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 4:34 pm
by srg
Solar wrote: The clock() function needs some non-locking way to measure passing time. Dunno what's available or not on the 386.
nothing on chip

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 5:31 pm
by nick8325
srg wrote:
Solar wrote: The clock() function needs some non-locking way to measure passing time. Dunno what's available or not on the 386.
nothing on chip
You can tell how many cycles each instruction will take to execute on 386, though - google for 386intel and look at the instruction reference. Perhaps you could run some long instruction lots of times and see how long it takes. That just leaves the 486...

Re:Getting CPU speed

Posted: Thu Jan 26, 2006 7:05 pm
by Brendan
Hi,
nick8325 wrote:You can tell how many cycles each instruction will take to execute on 386, though - google for 386intel and look at the instruction reference. Perhaps you could run some long instruction lots of times and see how long it takes. That just leaves the 486...
If you select the instruction well, the same timing code can be used for 386 and 486. I'd pick a small instruction like "shl eax,1", which costs 3 cycles on both 80386 and 80486, and then do a lot of them (being careful not to use too many L1 cache lines), with some tricks for the loop itself. For example:

Code:

get_old_CPU_speed:
     push ecx

     mov eax,IRQhandlerA
     call set_timer_IRQ_handler        ;Change IRQ handler

     jmp $                             ;Do nothing until first timer IRQ occurs

endTiming:
     mov eax,ecx
     pop ecx
     ret



     align 32                          ;Make sure it starts on its own cache line

startTiming:
     xor ecx,ecx                       ;Clear counter and cause initial cache miss
.wait:
     times 66 shl eax,1                ;3 cycles (total of 198 cycles)
     inc ecx                           ;2 cycles on 386, 1 cycle on 486
     jmp .wait                         ;8 cycles on 386, 3 cycles on 486



IRQhandlerA:
     mov eax,IRQhandlerB
     call set_timer_IRQ_handler        ;Change IRQ handler

     mov al,0x20
     out 0x20,al
     add esp,12                        ;Clean stack (remove CS, EIP and EFLAGS)
     sti
     jmp startTiming



IRQhandlerB:
     mov eax,original_IRQhandler
     call set_timer_IRQ_handler        ;Restore original IRQ handler

     mov al,0x20
     out 0x20,al
     add esp,12                        ;Clean stack (remove CS, EIP and EFLAGS)
     sti
     jmp endTiming
For an 80386 this adds up to 207 cycles per iteration, while for 80486 it's 201 cycles.

I'd set the PIT for slightly longer than 25 ms. For example, a PIT count of 30000 would give you 25.14 ms between IRQs. Then I'd calculate the CPU's frequency:

[tt] CPUfreq = count * 207 * (1/0.025)[/tt]  (where [tt]count[/tt] is the iteration count from ECX and 207 the per-iteration cycle cost)

This will always be slightly too high because of differences in the number of cycles for 80386 and 80486, and the slightly longer amount of time between IRQs. Because it's always slightly too high, it's perfect for rounding down to the nearest 1/3 MHz. After this it should be 100 % accurate.

Of course you'd need to disable all IRQs except for the timer itself to get a reliable measurement, but this shouldn't be a problem - I assume you'd only want to measure it once during boot (don't have to worry about things like SpeedStep or LongRun for these CPUs).


Cheers,

Brendan

Re:Getting CPU speed

Posted: Fri Jan 27, 2006 3:01 am
by Kemp
Watch out for those instruction timing references, though. I've heard of certain instructions (can't remember which off-hand, it's been a while) taking anywhere up to 5 cycles more or less than stated (either because of misprints or because certain conditions are or aren't met).

Re:Getting CPU speed

Posted: Fri Jan 27, 2006 3:32 am
by Solar
I just dug deep into the glibc web-CVS to find out how they determine their CLOCKS_PER_SEC, but as usual with that nightmare of a library there are no answers, only cross-references. ??? 8)

Re:Getting CPU speed

Posted: Fri Jan 27, 2006 8:02 am
by distantvoices
I have tried durand's code just now and it works perfectly. :-)

It returns the correct speed for one of my CPUs (I have an SMP machine here, but my kernel isn't SMP yet).

Just going to implement a lance network driver now.

stay safe