Weird spikes in ISR jitter measurements using TSC

Posted: Tue Feb 09, 2010 4:33 am
by limp
Hi all,

I have a very annoying problem when taking jitter measurements using the TSC on an Atom 330 (dual-core).

I have implemented a project whose only job is to generate an interrupt every 1 ms using the LAPIC timer, and I am using the TSC to calculate the periodic jitter of this interrupt. The "funny" thing is that I am getting a spike of about 4.5 ms roughly every 507 ticks (i.e. 0.507 s). I tried various things to find the cause but nothing worked; the spikes were still there. Some of the things I tried are listed below:

- Generate the interrupt using the PIT timer instead of LAPIC timer.
- Generate the interrupt using the HPET timer and also take measurements using HPET.

I also changed the interrupt (tick) period from 1 ms to 2 ms and then 3 ms to see the results.
I saw that with 2 ms I was getting the spikes every 255 ticks, and with 3 ms every 171 ticks.
So the spikes occur approximately every 0.5 seconds, no matter how often the interrupt fires.
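
The measurement described above can be sketched roughly as follows. This is only an illustration, assuming the ISR stores one TSC reading per tick into an array; the names `jitter_stats` and `analyse_jitter`, and the idea of a fixed spike threshold, are hypothetical and not taken from the original project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Summary of one measurement run (hypothetical structure). */
struct jitter_stats {
    uint64_t max_jitter;     /* largest |interval - expected|, in TSC cycles */
    size_t   spikes;         /* number of intervals whose jitter exceeds the threshold */
    size_t   last_spike_gap; /* ticks between the two most recent spikes, 0 if fewer than two */
};

/*
 * stamps[i]  : TSC value recorded by the ISR at tick i
 * expected   : nominal tick period in TSC cycles
 * threshold  : jitter (in cycles) above which an interval counts as a spike
 */
static struct jitter_stats analyse_jitter(const uint64_t *stamps, size_t n,
                                          uint64_t expected, uint64_t threshold)
{
    struct jitter_stats s = {0, 0, 0};
    size_t last_spike = 0;

    assert(stamps != NULL || n == 0);

    for (size_t i = 1; i < n; i++) {
        uint64_t interval = stamps[i] - stamps[i - 1];
        /* Periodic jitter: deviation of this interval from the nominal period. */
        uint64_t jitter = interval > expected ? interval - expected
                                              : expected - interval;

        if (jitter > s.max_jitter)
            s.max_jitter = jitter;

        if (jitter > threshold) {
            if (s.spikes > 0)
                s.last_spike_gap = i - last_spike;
            last_spike = i;
            s.spikes++;
        }
    }
    return s;
}
```

Multiplying `last_spike_gap` by the tick period gives the spike spacing in time: 507 ticks at 1 ms is 0.507 s, 255 ticks at 2 ms is 0.510 s, and 171 ticks at 3 ms is 0.513 s, which is why the spikes look tied to wall-clock time rather than to the tick rate.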

Any help would be much appreciated.

Kind Regards,

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Tue Feb 09, 2010 12:49 pm
by Selenic
A periodic, timer-independent delay makes it seem likely that something is stealing processor time off you, causing delayed/missing interrupts, yet is not being counted as part of the TSC (so if it's supposed to be monotonic, then my guess here is wrong and it's even more confusing).

If it *is* stolen processor-time, then it's probably SMM* (because it's responsible for stuff like thermal throttling, so it always runs, and it has to steal some processor time to do its work) but 4ms seems like a long time for the tasks that it does.

* System Management Mode
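
One way to test the "stolen time" theory is to spin with interrupts disabled, reading a counter back to back: any gap far larger than the cost of one loop iteration means the CPU was taken away by something the OS cannot see. The sketch below assumes the counter keeps running while the time is stolen. All names (`counter_fn`, `poll_for_gaps`, `replay_read`) are hypothetical; on the real target `read_counter` would wrap RDTSC, while the replay source here only exists so the logic can be tried off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Abstract counter source; on the target this would read the TSC. */
typedef uint64_t (*counter_fn)(void *ctx);

struct gap_report {
    uint64_t largest_gap;     /* biggest difference between two consecutive reads */
    size_t   gaps_over_limit; /* how many consecutive reads were further apart than limit */
};

/* Read the counter iterations+1 times in a tight loop and record the gaps. */
static struct gap_report poll_for_gaps(counter_fn read_counter, void *ctx,
                                       size_t iterations, uint64_t limit)
{
    struct gap_report r = {0, 0};

    assert(read_counter != NULL);

    uint64_t prev = read_counter(ctx);
    for (size_t i = 0; i < iterations; i++) {
        uint64_t now = read_counter(ctx);
        uint64_t gap = now - prev;

        if (gap > r.largest_gap)
            r.largest_gap = gap;
        if (gap > limit)
            r.gaps_over_limit++;
        prev = now;
    }
    return r;
}

/* Off-target stand-in for the TSC: replays a recorded list of values. */
struct replay {
    const uint64_t *values;
    size_t count;
    size_t pos;
};

static uint64_t replay_read(void *ctx)
{
    struct replay *rp = ctx;

    assert(rp != NULL && rp->count > 0);
    if (rp->pos < rp->count)
        return rp->values[rp->pos++];
    return rp->values[rp->count - 1]; /* out of data: repeat the last value */
}
```

If the large gaps show up at the same roughly half-second spacing as the jitter spikes, that points at something outside the OS rather than at the timer or the ISR.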

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Tue Feb 09, 2010 3:58 pm
by limp
Selenic wrote:A periodic, timer-independent delay makes it seem likely that something is stealing processor time off you, causing delayed/missing interrupts, yet is not being counted as part of the TSC (so if it's supposed to be monotonic, then my guess here is wrong and it's even more confusing).

If it *is* stolen processor-time, then it's probably SMM* (because it's responsible for stuff like thermal throttling, so it always runs, and it has to steal some processor time to do its work) but 4ms seems like a long time for the tasks that it does.

* System Management Mode
Thanks a lot for the reply! I'll try to disable it and see if this affects the measurements. The thing is that I disabled all monitoring in the BIOS, so I am not really sure that SMM is causing the spikes (it should be disabled), but either way I appreciate your response. The other thing is that I was also getting spikes close to 8 ms, which makes me wonder whether SMM can delay the system that much. Any other suggestions are welcome.

Regards,

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Tue Feb 09, 2010 5:33 pm
by DednDave
i don't know what OS you are using
but, under Windows, the HAL periodically synchronizes the TSCs for the different cores (supposedly)
one thing that may help is to bind the thread to a single core
it might help to know your OS ?

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Tue Feb 09, 2010 5:55 pm
by bontanu
limp wrote: ...
Thanks a lot for the reply! I'll try to disable it and see if this affects the measurements. The thing is that I disabled all monitoring in the BIOS, so I am not really sure that SMM is causing the spikes (it should be disabled), but either way I appreciate your response. The other thing is that I was also getting spikes close to 8 ms, which makes me wonder whether SMM can delay the system that much. Any other suggestions are welcome.

Regards,
You cannot disable SMM (not in a safe way, anyway). Disabling monitoring and emulation in the BIOS will not disable SMM.

Basically, SMM owns the machine no matter what OS is running on top of it. If you disable SMM then you risk damaging or disabling your hardware. As said above, SMM deals with hardware-critical tasks like thermal throttling, device emulation, etc.

It can steal a lot more cycles than what you see there, because the transition from whatever mode your own OS is running in to SMM is costly (it is a kind of primitive VT-x transition), and the operations performed inside SMM are sometimes delayed by slow hardware devices and buses that have to be accessed and synchronized before SMM can return to your OS (USB and PS/2, LAPIC, I2C bus for fan control, etc.).

In order to disable SMM you must have exact and intimate knowledge of all of your hardware devices and firmware.

Overall: disabling SMM is a very risky and complicated operation that is theoretically possible but practically unsound and one that gives you very little in return if any (but it might be **interesting**).

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Tue Feb 09, 2010 10:03 pm
by Brendan
Hi,
bontanu wrote:Overall: disabling SMM is a very risky and complicated operation that is theoretically possible but practically unsound and one that gives you very little in return if any (but it might be **interesting**).
Yes, but there are things you can do to minimise the amount of work done by SMM.

At the top of the list would be supporting ACPI (or at least pretending to support ACPI) so that power management is done by your code (and ACPI's AML) and not done by SMM code.

Not using "PS/2 keyboard/mouse emulation" (either disabling it in the BIOS, using PS/2 keyboard/mouse instead of USB devices, or disabling it in the USB controller) is another thing to consider. In a similar way, on some systems HPET is used to emulate PIT and (part of) the RTC, so supporting HPET (and disabling any legacy timer emulation) could help. Apart from that, there's not too much left for SMM to do (especially considering there aren't any Atom systems that support ECC).


Cheers,

Brendan

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Wed Feb 10, 2010 11:52 am
by Selenic
bontanu wrote:Overall: disabling SMM ... might be **interesting**
Is that the educational or explosive kind of 'interesting'? :P

All I can do here is echo (and sum up) what's already been said: don't even try to disable SMM (I, at least, don't want to be entrusting my processor's 'life' to an in-development OS!); the best you can do is to try to reduce the amount of work it needs to do, by doing it yourself. The transitions to and from SMM might still take a fair amount of time if you don't offload everything, though, due to the aforementioned slow hardware.
How about checking the jitter on the second (non-boot) core as well? That one might not be doing as much SMM work, so it might be better. Not sure about this; after all, with SMM it's largely hardware-specific.
Brendan wrote:At the top of the list would be supporting ACPI (or at least pretending to support ACPI) so that power management is done by your code (and ACPI's AML) and not done by SMM code.
I'd guess that, even if you take over power management via ACPI, SMM would still steal some time off you to make sure that you don't kill the hardware, but it'd probably reduce the portion of time stolen (the throttling might be the part that takes most of the time).

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Wed Feb 10, 2010 3:06 pm
by limp
Thank you all for your replies!!

I also built a Linux kernel configured for 1 ms ticks, with the LAPIC timer used to generate the system tick.
I then used the exact same code that I am using in my "generate interrupt" project to measure the periodic jitter under Linux (on the same target, of course) and guess what... there were no spikes at all!

Is there a possibility that Linux is disabling something that I am not? If they're not disabling SMM either, then what else could it be?

If someone has some spare time and a dual-core target, would it be easy to implement a very simple project that generates a 1 ms interrupt and takes some simple periodic jitter measurements, to see whether this is global or only happens to me? I have already tried it on two Intel Atom 330 systems and got the same spikes.

I have invested a lot of time in this and it's really essential to me, but unfortunately it isn't going anywhere... I really appreciate the input you guys are giving!

Kind regards,

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Wed Feb 10, 2010 3:38 pm
by Combuster
limp wrote:Is there a possibility that Linux is disabling something that I am not? If they're not disabling SMM either, then what else could it be?
Brendan wrote:(...) At the top of the list would be supporting ACPI (or at least pretending to support ACPI) so that power management is done by your code (and ACPI's AML) and not done by SMM code.
(hint hint)

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Sun Feb 14, 2010 3:14 pm
by limp
As it turned out, SMM was responsible for the spikes that I was getting. I just want to thank you guys for all your useful comments, which helped me fix that.

Regards

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Tue Feb 16, 2010 8:06 am
by JamesM
This is interesting, because I got exactly the same result from a similar investigation of my own. I hadn't thought of SMM :(

What I found (pretty graphs! 8) ) is here - apologies for the format - it's a tar.gz with a .dvi and several .eps files.

I haven't got pdflatex to successfully convert and embed PS graphics yet! (Not that I've given it much thought; this report was written over 3 weeks ago.)

The interesting parts are the first and second graphs.

EDIT: As background, I was investigating the efficiency of different types of context switching. The system timer frequency was varied and the effect on different algorithms' ability to do work efficiently was observed. As the system timer frequency increases (past about 10 kHz), 'ladders', for want of a better word, appear in the graph, seemingly indicating that certain frequencies cause very poor performance. No idea yet what's causing it!

By the way, before people ask about error / heating up of the chip etc, each point seen on each graph is actually a cluster of 10 (or 5, in the last graph) repeats. The error is so tiny that this cannot be seen at all.

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Tue Feb 16, 2010 8:30 am
by mutex
hi,

It would be interesting to know what you did about SMM to improve / remove the problem.

Did you manage to disable SMM or what?

Some mentioned that disabling SMM might damage the computer.

I was under the impression that SMM, when it comes to power, only did support tasks for power management and some power-efficiency stuff for laptops etc. I heard that Intel (not AMD) had thermal protection built in so that the CPU could not be damaged by overheating. For the chipset etc. I don't know. Maybe it's here that people are right that it might BURN the computer?

-
Thomas

Re: Weird spikes in ISR jitter measurements using TSC

Posted: Tue Feb 16, 2010 8:59 am
by Owen
Intel processors have a built-in feedback loop between the chip's thermal sensor and phase-locked loop (which generates the clock) to throttle back the operating frequency as temperature increases. AMDs have a hard cutout after a certain temperature. Both of these are, for practical reasons, implemented in hardware (I imagine they don't want to be paying for warranty replacements over defective SMM code ;-))