Strange performance problem

Post by GFL »

Hi all,

  I am doing some performance measurements for my hobby OS, and I am getting some weird results on my test machine. I take the measurements with the RDTSC instruction, which loads the timestamp counter value into edx:eax (low 32 bits in eax).
For example, a stretch of code that takes 17 clocks under bochs (nothing special there, some movs and jmps, no CPL change or cr3 reload) takes 128 clocks (!!!???) on my test machine (an Athlon XP). A back-and-forth task switch that measures 260 clocks under bochs gives me more than 2000 on my test machine; even with the TLB flush penalty this seems like a lot to me... I checked that caching is enabled and all pages are cacheable, so this is not the issue: if I disable cacheable pages, the same task-switch example rises to more than 65000 clocks... Any ideas of what I am missing here?
RE:Strange performance problem

Post by CodeSlasher »

To get an accurate timestamp value using RDTSC, you need to run a serializing instruction first, i.e. an instruction that forces the CPU to finish whatever each pipeline is still executing before the counter is read.
Check out
http://www.cs.uit.no/inf2200/2003h/stuf ... /RDTSC.htm
and
http://www.math.uwaterloo.ca/~jamuir/rdtscpm1.pdf
and
http://www.google.com/search?hl=en&ie=U ... gle+Search
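
As a rough illustration of the idea (a sketch, not code taken from the links above), CPUID is one commonly used serializing instruction, so it can be placed right before RDTSC. The helper names `rdtsc_serialized`, `measure_overhead` and `time_call` are made up for this example, and the code assumes GCC or Clang inline assembly on x86:

```c
#include <assert.h>
#include <stdint.h>

/* Read the TSC right after a serializing CPUID, so that earlier
   instructions have completed before the counter is sampled.
   (Hypothetical helper name; GCC/Clang inline asm, x86 only.) */
static inline uint64_t rdtsc_serialized(void)
{
    uint32_t eax = 0, ebx, ecx = 0, edx;
    uint32_t lo, hi;

    /* CPUID leaf 0; the results are ignored, only the serializing
       side effect matters here. */
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
    /* RDTSC returns the low half in eax and the high half in edx. */
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi) : : "memory");

    (void)ebx;
    (void)edx;
    return ((uint64_t)hi << 32) | lo;
}

/* Smallest delta between two back-to-back reads: an estimate of the
   cost of the measurement itself (CPUID + RDTSC). */
static uint64_t measure_overhead(int runs)
{
    uint64_t best = UINT64_MAX;

    assert(runs > 0);
    for (int i = 0; i < runs; i++) {
        uint64_t t0 = rdtsc_serialized();
        uint64_t t1 = rdtsc_serialized();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}

/* Time one call of fn and subtract the measurement overhead. */
static uint64_t time_call(void (*fn)(void), uint64_t overhead)
{
    uint64_t t0 = rdtsc_serialized();
    fn();
    uint64_t t1 = rdtsc_serialized();
    uint64_t delta = t1 - t0;

    return delta > overhead ? delta - overhead : 0;
}
```

Calling `measure_overhead(1000)` once and passing the result to `time_call` gives a figure with the CPUID/RDTSC cost subtracted. CPUID is itself slow and its cost varies between CPUs and emulators, so for very short sequences the overhead can be larger than the code being measured; repeating the measurement and keeping the minimum tends to give more stable numbers.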