Hi all,
I am taking some performance measurements for my hobby OS, and I am getting some weird results on my test machine. I take the measurements with the RDTSC instruction, which loads the timestamp counter value into EDX:EAX (low 32 bits in EAX).
For example, a small piece of code (nothing special: some movs and jmps, no CPL change and no CR3 reload) that takes 17 clocks under Bochs takes 128 clocks (!!!???) on my test machine (an Athlon XP). A back-and-forth task switch that measures 260 clocks under Bochs takes more than 2000 on the test machine; even allowing for the TLB flush penalty, this seems like a lot to me. I checked that caching is enabled and that all pages are cacheable, so that is not the issue: if I make the pages non-cacheable, the same task-switch example rises to more than 65000 clocks. Any ideas what I am missing here?
Strange performance problem
RE: Strange performance problem
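To get an accurate timestamp value using RDTSC, you need to use an instruction that forces the CPU to finish everything that is in flight in its pipelines before the counter is read, i.e. a serializing instruction such as CPUID. Otherwise RDTSC can execute out of order relative to the code you are trying to time.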
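A rough sketch of what I mean, assuming GCC-style inline assembly on x86 (the names rdtsc_serialized and measure_overhead are made up for this post, not from any library). CPUID is executed right before RDTSC so that earlier instructions have retired when the counter is read, and the smallest back-to-back delta is taken as the cost of the measurement itself, which you would subtract from your results:

```c
#include <stdint.h>

/* Hypothetical helper: serialize with CPUID, then read the TSC.
 * CPUID overwrites EAX, EBX, ECX and EDX; RDTSC then returns the
 * counter in EDX:EAX, so EBX and ECX are listed as clobbers. */
static inline uint64_t rdtsc_serialized(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__(
        "xorl %%eax, %%eax\n\t"   /* CPUID leaf 0 */
        "cpuid\n\t"               /* serializing instruction */
        "rdtsc\n\t"               /* EDX:EAX = timestamp counter */
        : "=a"(lo), "=d"(hi)
        :
        : "ebx", "ecx", "memory");
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: estimate the cost of the measurement itself
 * as the smallest delta between two back-to-back reads. Repeating
 * the read also warms up the instruction cache. */
static uint64_t measure_overhead(void)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < 16; i++) {
        uint64_t t0 = rdtsc_serialized();
        uint64_t t1 = rdtsc_serialized();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

To time something, read rdtsc_serialized() before and after the code under test and subtract measure_overhead() from the difference. CPUID itself costs a fair number of cycles and its cost can vary, so treat very small results as approximate and repeat the measurement a few times.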
Check out
http://www.cs.uit.no/inf2200/2003h/stuf ... /RDTSC.htm
and
http://www.math.uwaterloo.ca/~jamuir/rdtscpm1.pdf
and
http://www.google.com/search?hl=en&ie=U ... gle+Search