[SOLVED] Disabling interrupt, timer ticks troubles?

alix
Posts: 12
Joined: Tue Nov 13, 2012 3:56 pm

[SOLVED] Disabling interrupt, timer ticks troubles?

Post by alix »

Hello,

I have a simple question: if a kernel keeps track of time ticks using IRQ0 interrupts, shouldn't disabling interrupts and re-enabling them again mess up the kernel timer?

Shouldn't the solution be to have a mask that disables all interrupts except IRQ0? Is there a problem with keeping IRQ0 enabled in critical sections (where we usually disable interrupts)?
Last edited by alix on Sun Dec 02, 2012 9:02 am, edited 1 time in total.
Kazinsal
Member
Posts: 559
Joined: Wed Jul 13, 2011 7:38 pm
Libera.chat IRC: Kazinsal
Location: Vancouver

Re: Disabling interrupt, timer ticks troubles?

Post by Kazinsal »

Depends on what IRQ0's handler calls. Is the handler safe to execute in the critical section? Are the functions the handler calls safe to execute in the critical section? The functions those functions call? What about the statements in the handler -- are they going to mess with a piece of code in the critical section that could be expecting something different if the interrupt hadn't occurred?
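For instance, here's a minimal hypothetical sketch (names like run_queue and schedule_tick are illustrative, not from any real kernel) of the kind of thing that goes wrong when the handler isn't safe to run in the critical section:

Code: Select all

struct task {
    struct task *next;
};

/* Ready list shared by the scheduler and the timer IRQ handler. */
static struct task *run_queue;

/* Called from the IRQ0 handler: pop the head of the ready list. */
void schedule_tick(void) {
    if (run_queue)
        run_queue = run_queue->next;
}

/* Critical section: push a task. Between the two statements the list is
 * momentarily inconsistent. If IRQ0 fires there and schedule_tick() pops
 * the head, the second statement overwrites run_queue and the pop is
 * silently lost. */
void enqueue(struct task *t) {
    t->next = run_queue;
    run_queue = t;
}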
bluemoon
Member
Posts: 1761
Joined: Wed Dec 01, 2010 3:41 am
Location: Hong Kong

Re: Disabling interrupt, timer ticks troubles?

Post by bluemoon »

Using the PIT alone to keep track of time is inaccurate over long periods.

Normally, you should calibrate / guess / define a period T after which the PIT interrupt counter is considered inaccurate, and fall back to the system clock device (e.g. the CMOS RTC).
Then your get_time() may look like this:

Code: Select all

#include <time.h>

/* Provided elsewhere in the kernel: reads wall-clock time from the CMOS RTC. */
extern time_t CMOS_get_time(void);
/* Provided elsewhere: converts a tick count into time_t units. */
extern time_t convert_to_time(unsigned int ticks);

/* Re-sync threshold in ticks -- this may be 1 second, 1 minute or anything. */
#define kThreshold 1000

static time_t last_query_time;
/* Starts above the threshold so the first get_time() forces a CMOS read. */
static volatile unsigned int PIT_counter = 0x7FFFFFFF;

time_t get_time(void) {
  if ( PIT_counter >= kThreshold ) {
    last_query_time = CMOS_get_time();
    PIT_counter = 0;
  }
  return last_query_time + convert_to_time(PIT_counter);
}

void PIT_handler(void) {
  PIT_counter++;
}

A network time service may also be used to adjust the CMOS clock.

Bottom line, if you re-sync the time periodically you won't be affected much by disabling interrupts.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Disabling interrupt, timer ticks troubles?

Post by Brendan »

Hi,
alix wrote: I have a simple question: if a kernel keeps track of time ticks using IRQ0 interrupts, shouldn't disabling interrupts and re-enabling them again mess up the kernel timer?
There are two cases here. If the kernel disables IRQs for too long (more than the amount of time between timer ticks) then timer IRQs can be lost. If an IRQ is pending, then any second, third, fourth, etc. IRQs get ignored. This would cause drift. For example, if it happens a lot you might end up thinking that 40 seconds passed when 60 seconds actually passed.

The other case is IRQ latency causing timer IRQ jitter, which causes precision loss. For example, if you increase a "milliseconds since the epoch" counter every 2 milliseconds exactly, then that counter could be precise to within +/- 1 millisecond. However, if IRQs are disabled (postponed) when the IRQ should happen then that counter isn't updated at exactly the right time (e.g. only updated later, when interrupts are enabled and the postponed IRQ can occur), and your counter might only be accurate to within +/- 1.5 milliseconds (e.g. if IRQs can be disabled for up to 0.5 ms at a time).

Basically, as long as you don't disable IRQs for so long that you lose IRQs and cause drift, the best case precision of your counter would be "-(time_between_IRQs/2) to +(time_between_IRQs/2 + worst_case_length_of_time_IRQs_disabled)"; which is close enough to (but not strictly equivalent to) "+/- (time_between_IRQs/2 + worst_case_length_of_time_IRQs_disabled)".

A lot of things use elapsed time (e.g. "end_time - start_time"). For this case the error caused by precision loss occurs twice. For example, you might read "start_time" immediately before the counter is incremented and get a "start_time" value with a worst case error of "+(time_between_IRQs/2 + worst_case_length_of_time_IRQs_disabled)"; and then read the "end_time" immediately after the counter is incremented and get a value with a worst case error of "-(time_between_IRQs/2)"; and then subtract these values and end up with an elapsed time that has a worst case error of "-(time_between_IRQs/2) - (time_between_IRQs/2 + worst_case_length_of_time_IRQs_disabled)" or "-(time_between_IRQs + worst_case_length_of_time_IRQs_disabled)".

In a similar way you can read "start_time" immediately after the counter is incremented and get a "start_time" value with a worst case error of "-(time_between_IRQs/2)"; and then read the "end_time" immediately before the counter is incremented and get a value with a worst case error of "+(time_between_IRQs/2 + worst_case_length_of_time_IRQs_disabled)"; and then subtract these values and end up with an elapsed time that has a worst case error of "+(time_between_IRQs/2 + worst_case_length_of_time_IRQs_disabled) - -(time_between_IRQs/2)" or "+(time_between_IRQs + worst_case_length_of_time_IRQs_disabled)".

This means that, for elapsed time, precision is actually "-(time_between_IRQs + worst_case_length_of_time_IRQs_disabled) to +(time_between_IRQs + worst_case_length_of_time_IRQs_disabled)"; which is equivalent to "+/- (time_between_IRQs + worst_case_length_of_time_IRQs_disabled)".

For example, if an IRQ is meant to occur every 1 ms and the worst case length of time IRQs are disabled is 0.25 ms; then your elapsed time calculations may be 1.25 ms too short or 1.25 ms too long.

Now; normally an OS makes guarantees, like "a delay will never be shorter than requested (but may be longer)". To provide this guarantee when your elapsed time calculation may be 1.25 ms too short, you need to add an extra 1.25 ms to the delay just in case. This means that (for this case), if software asks for a delay of 2 ms it may actually get a delay of between 2 ms and 4.5 ms (rather than the delay of between 0.75 ms and 3.25 ms that it could get without the correction).
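As a minimal sketch of how that guarantee might be implemented (assuming a millisecond tick counter; the names and constants are illustrative, not from any specific kernel):

Code: Select all

/* Incremented by the timer IRQ handler. */
extern volatile unsigned long long ticks_ms;

#define TIME_BETWEEN_IRQS_MS 1   /* assumed tick period                   */
#define WORST_CASE_CLI_MS    1   /* assumed worst-case IRQs-disabled time */
/* Worst case an elapsed-time measurement can be short (see above),
 * rounded up to whole milliseconds. */
#define ELAPSED_ERROR_MS (TIME_BETWEEN_IRQS_MS + WORST_CASE_CLI_MS)

/* Guarantee: never return before at least `ms` milliseconds have really
 * passed (but the delay may be longer). */
void sleep_at_least(unsigned long ms) {
    unsigned long long start = ticks_ms;
    while (ticks_ms - start < ms + ELAPSED_ERROR_MS)
        ;   /* busy-wait for clarity; a real kernel would block the thread */
}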

Of course an OS shouldn't disable IRQs for very long at all, and for timers like the PIT the time between IRQs is very large relative to the worst case length of time IRQs are disabled; which means that precision loss caused by IRQ latency/jitter is negligible and could be ignored (as the time between IRQs dominates precision). However; if you start looking at things like using the local APIC timer in "TSC deadline" mode (where the precision loss caused by the timer itself may be as little as 1 nanosecond for delays longer than some minimum length), the precision loss caused by IRQ latency/jitter can be several orders of magnitude higher than the precision loss caused by the timer hardware itself. What this means is that to provide extremely precise timing, interrupt latency/jitter can be extremely important.
alix wrote: Shouldn't the solution be to have a mask that disables all interrupts except IRQ0? Is there a problem with keeping IRQ0 enabled in critical sections (where we usually disable interrupts)?
This might sound confusing at first; but "CLI" causes interrupts to be postponed (e.g. postponed until you do STI) and does not cause IRQs to be ignored/lost.

If you mask an IRQ and it occurs, then it is ignored/lost (and not just postponed). Masking all IRQs except IRQ0 will (eventually) cause IRQs to be lost, which will cause devices to stop working, which will cause the OS to grind to a halt (e.g. all processes waiting for devices, and all devices waiting for their ignored/lost IRQs to be handled).

For APICs there is a "Task Priority Register" which can be used to postpone (not ignore) lower priority IRQs. Sadly this doesn't work for PIC chips.
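As a rough sketch (assuming the local APIC is mapped at the common default base; a real kernel would read IA32_APIC_BASE instead of hard-coding it), raising the TPR postpones all IRQs of a lower priority class while leaving them pending:

Code: Select all

#include <stdint.h>

#define LAPIC_BASE 0xFEE00000u   /* assumed default physical base */
#define LAPIC_TPR  0x80u         /* Task Priority Register offset */

/* Bits 7:4 of the TPR hold the priority class; interrupt vectors with a
 * class at or below it stay pending instead of being delivered. */
static inline void lapic_set_tpr(uint8_t priority_class) {
    volatile uint32_t *tpr = (volatile uint32_t *)(LAPIC_BASE + LAPIC_TPR);
    *tpr = (uint32_t)priority_class << 4;
}

/* lapic_set_tpr(0x3);  postpone vectors 0x10..0x3F          */
/* ... critical work ...                                     */
/* lapic_set_tpr(0);    allow all; pending IRQs now deliver  */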

You can also do it yourself in software. For example; design your IRQ handlers so that they test if a "postpone IRQs" flag is set and if it is they set a corresponding "IRQ was postponed" flag and exit (without running the real IRQ handler). When you clear the "postpone IRQs" flag you check all those "IRQ was postponed" flags and retry any IRQ handlers that need it. In this case you'd set your own "postpone IRQs" flag instead of doing "CLI" and might never need to use "CLI" at all.
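A minimal sketch of that scheme (all names illustrative, and ignoring the subtle races a real implementation would have to close):

Code: Select all

#define NUM_IRQS 16

static volatile int postpone_irqs;                /* set instead of CLI */
static volatile int irq_was_postponed[NUM_IRQS];

extern void real_irq_handler(int irq);            /* the actual work */

/* Common stub entered for every IRQ. */
void irq_dispatch(int irq) {
    if (postpone_irqs) {
        irq_was_postponed[irq] = 1;               /* remember it, do nothing */
        return;
    }
    real_irq_handler(irq);
}

void enter_critical(void) { postpone_irqs = 1; }

void leave_critical(void) {
    postpone_irqs = 0;
    for (int i = 0; i < NUM_IRQS; i++) {          /* retry postponed IRQs */
        if (irq_was_postponed[i]) {
            irq_was_postponed[i] = 0;
            real_irq_handler(i);
        }
    }
}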


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
alix
Posts: 12
Joined: Tue Nov 13, 2012 3:56 pm

[SOLVED] Disabling interrupt, timer ticks troubles?

Post by alix »

Thanks very much guys for the detailed explanation!
rdos
Member
Posts: 3306
Joined: Wed Oct 01, 2008 1:55 pm

Re: Disabling interrupt, timer ticks troubles?

Post by rdos »

Brendan wrote: There are two cases here. If the kernel disables IRQs for too long (more than the amount of time between timer ticks) then timer IRQs can be lost. If an IRQ is pending, then any second, third, fourth, etc. IRQs get ignored. This would cause drift. [...] For example, if an IRQ is meant to occur every 1 ms and the worst case length of time IRQs are disabled is 0.25 ms; then your elapsed time calculations may be 1.25 ms too short or 1.25 ms too long.
For high precision timing, using regular IRQs is no good: you either overload CPUs with IRQs, or get lousy precision. For delays of 1 ms or longer that don't need better than 1 ms precision, you could just as well set the maximum scheduler preemption timeout to 1 ms and update those timers every time you schedule. For high precision timing, you have to read hardware resources and keep time with better precision than 1 ms. That would typically use one free-running timer, plus one timer that has an IRQ and operates in one-shot mode. Since you don't want only a single timer slot to be available, you keep a list of waiting tasks and the times at which they should be triggered, and then program the one-shot timer with the next timeout (or the maximum period for the timer), as in the sketch below.
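A minimal sketch of that arrangement (names and the hardware hooks are hypothetical):

Code: Select all

#include <stdint.h>
#include <stddef.h>

struct timer {
    uint64_t      deadline;    /* in free-running-counter units */
    struct timer *next;
    void        (*fire)(void);
};

static struct timer *pending;  /* sorted by deadline, soonest first */

extern uint64_t read_free_running_counter(void);
extern void     arm_one_shot(uint64_t delta);  /* program the one-shot timer */

void timer_add(struct timer *t) {
    struct timer **p = &pending;
    while (*p && (*p)->deadline <= t->deadline)
        p = &(*p)->next;
    t->next = *p;
    *p = t;
    if (pending == t)  /* new soonest deadline: re-arm the hardware */
        arm_one_shot(t->deadline - read_free_running_counter());
}

/* One-shot IRQ handler: fire everything that is due, then re-arm.
 * (A real implementation must also handle deadlines already in the past.) */
void one_shot_irq(void) {
    uint64_t now = read_free_running_counter();
    while (pending && pending->deadline <= now) {
        struct timer *t = pending;
        pending = t->next;
        t->fire();
        now = read_free_running_counter();
    }
    if (pending)
        arm_one_shot(pending->deadline - now);
}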
Czernobyl
Member
Posts: 47
Joined: Sun Dec 26, 2010 1:04 pm

Re: [SOLVED] Disabling interrupt, timer ticks troubles?

Post by Czernobyl »

Brendan wrote: If you mask an IRQ and it occurs, then it is ignored/lost (and not just postponed).
Brendan
Hmmm, I beg to differ. UIAM (unless I am mistaken), interrupts which occur while their respective IMR bit is set (masked) are retained by the PIC.

Edge-triggered IRQs (which are the majority of legacy ISA interrupts) will thus remain in the IRR (potentially) forever, until unmasked by the program, and will then fire (subject to priority rules).

Similarly, a level-triggered interrupt will stay in the IRR, potentially ready to interrupt a processor when it is unmasked, unless and until the IRQ line is reset (which may happen at the initiative of the device or of the processor, in a device-dependent way).

In the case of the legacy timer (8254) interrupt, I think I remember (though I'm not certain) that it is conventionally programmed for level detection rather than edge. Hence interrupt request #0 is "auto-clearing" (going back to zero for half the period, when the 8254 is programmed to produce a square wave). Thus of course timer ticks may be lost if one masks IR0 in the first PIC for too long.

The semantics of masking are (regrettably) different with the LAPIC than with the legacy PICs, but this is another question.

Regards
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: [SOLVED] Disabling interrupt, timer ticks troubles?

Post by Combuster »

The legacy PIT has no EOI circuitry of its own. Hence if you configured it as level-triggered, it would keep re-raising the same IRQ until its duty cycle is over, which could be a few hundred times in pulse mode and a really high number in square wave mode.

The PIC should be set to all edge triggered. Actually, I even doubt you can safely make it behave otherwise on all machines.
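For reference, the standard 8259 initialisation sequence leaves ICW1's LTIM bit clear, which is what selects edge triggering (outb is an assumed port-I/O helper; a real init would also preserve the interrupt masks and add I/O delays):

Code: Select all

#include <stdint.h>

extern void outb(uint16_t port, uint8_t val);

#define PIC1_CMD 0x20
#define PIC1_DAT 0x21
#define PIC2_CMD 0xA0
#define PIC2_DAT 0xA1

void pic_init_edge(uint8_t offset1, uint8_t offset2) {
    outb(PIC1_CMD, 0x11);    /* ICW1: edge-triggered (LTIM=0), cascade, ICW4 follows */
    outb(PIC2_CMD, 0x11);
    outb(PIC1_DAT, offset1); /* ICW2: master vector offset */
    outb(PIC2_DAT, offset2); /* ICW2: slave vector offset */
    outb(PIC1_DAT, 0x04);    /* ICW3: slave attached at IRQ2 */
    outb(PIC2_DAT, 0x02);    /* ICW3: slave cascade identity */
    outb(PIC1_DAT, 0x01);    /* ICW4: 8086/88 mode */
    outb(PIC2_DAT, 0x01);
}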
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
Czernobyl
Member
Posts: 47
Joined: Sun Dec 26, 2010 1:04 pm

Re: [SOLVED] Disabling interrupt, timer ticks troubles?

Post by Czernobyl »

Concerning the PIT (legacy timer) you're right, Combuster, and I apologise for the confusion in my little gray cells. I actually meant to refer to the legacy keyboard interrupt (IRQ1), which on many boards is configured for level triggering.
Combuster wrote: The PIC should be set to all edge triggered. Actually, I even doubt you can safely make it behave otherwise on all machines.
All IRQs on the historical ISA (AT bus) were indeed edge triggered, and not configurable, but on MCA (PS/2) there was a mix of level and edge triggering.

Modern mobos, at least as long as they retain some ISA baggage (or LPC or FWH), can selectively configure edge/level triggering (and active high/low) for internal interrupts. From my limited sampling of boards, I find the keyboard controller uses level rather than edge. Of course, there has to be a means for a program-initiated reset of the IRQ signal, which in this case is simply reading from "port 60h" ...
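A tiny sketch of that program-initiated reset (inb is an assumed port-I/O helper):

Code: Select all

static inline unsigned char inb(unsigned short port) {
    unsigned char v;
    __asm__ volatile ("inb %1, %0" : "=a"(v) : "Nd"(port));
    return v;
}

void keyboard_irq_handler(void) {
    /* Reading the output buffer is what drops the IRQ1 request line. */
    unsigned char scancode = inb(0x60);
    (void)scancode;  /* a real driver would decode/queue it */
}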