HPET problem

Craze Frog · Post by **Craze Frog** » Mon Mar 30, 2009 3:23 am

The HPET triggers when the "timer" equals "comparator". So to set it to trigger some time from now, we can add a value to the timer and store it in the comparator, like this:

Code: Select all

cli
newvalue = timer + 10
comparator = newvalue
sti

Looks nice until system management mode starts running in the middle of your code:

Code: Select all

cli
newvalue = timer + 10
  --- SMM runs For more than 10 timer units here
comparator = newvalue
sti

in this case, timer is already bigger than newvalue when we set the comparator. So the timer is bigger than the comparator, and needs to run a whole cycle (wrap) before it triggers again. This makes the system stop receiving timer interrupts for about 5 minutes on a 32-bit system (the HPET runs rather fast).

How to go about solving this problem?

jal · Post by **jal** » Mon Mar 30, 2009 5:03 am

Craze Frog wrote:How to go about solving this problem?

Errr... use a >= instead of ==?

JAL

Craze Frog · Post by **Craze Frog** » Mon Mar 30, 2009 7:13 am

Unfortunately I can't go around and rewire everyones hardware. The HPET is made by computer manufacturers after Intel's specifications, which specifically says that the interrupt is to be triggered when the timer counter and the match value are exactly equal.

Brendan · Post by **Brendan** » Mon Mar 30, 2009 9:22 am

Hi,

Craze Frog wrote:Looks nice until system management mode starts running in the middle of your code:
Code: Select all
cli
newvalue = timer + 10
  --- SMM runs For more than 10 timer units here
comparator = newvalue
sti
in this case, timer is already bigger than newvalue when we set the comparator. So the timer is bigger than the comparator, and needs to run a whole cycle (wrap) before it triggers again. This makes the system stop receiving timer interrupts for about 5 minutes on a 32-bit system (the HPET runs rather fast).

How to go about solving this problem?

To avoid this race condition, you could double check. For example:

Code: Select all

    delay = ?;

    for(;;) {
        cli
        newvalue = timer + delay;
        --- SMM runs for more than "delay" timer units here
        comparator = newvalue;
        if(timer < newvalue) break;   // Comparator must have been set in time
        if(interrupt_status_flag == sent) break;   // Comparator must have been set in time
        // Comparator wasn't set in time, so retry
        delay = 1;
        sti
    }

Note: You'd need to use a level triggered interrupt for the timer, otherwise the corresponding flag in the General Interrupt Status Register won't be set and you won't be able to tell (easily) if the IRQ was sent or not.

Cheers,

Brendan

Craze Frog · Post by **Craze Frog** » Mon Mar 30, 2009 10:51 am

To avoid this race condition, you could double check.

That's what I thought, I just couldn't find a bulletproof way.

And I've got a new question now: I understand why the interrupt must be level-triggered, but how do I know if an interrupt is level-triggered?

The General Interrupt Status Register I haven't heard about, and I can't find it by searching the manual from Intel. Is it on the local APIC? Won't it be flagging even if some other interrupt is pending?

Unfortunately there are still a problem with the code:
- If the SMM code runs so long that the timer wraps around (could happen when the timer is just about to wrap), then (timer < newvalue) and we will break; even though the comparator was not set in time. I think we may need to take the difference and see if it's within allowed limits, but that doesn't sound very elegant.

Thanks for helping me so far.

jal · Post by **jal** » Mon Mar 30, 2009 1:04 pm

Craze Frog wrote:If the SMM code runs so long that the timer wraps around (could happen when the timer is just about to wrap), then (timer < newvalue) and we will break; even though the comparator was not set in time. I think we may need to take the difference and see if it's within allowed limits, but that doesn't sound very elegant.

Who says dealing with this stuff needs to be elegant? :) But seriously, if the second value is smaller than the original value, you know there's been a wrap around, so don't do newvalue = timer + 10; comperator = newvalue, but orgvalue = time; newvalue = orgvalue + 10; comperator = newvalue, and then compare timer with orgvalue: if timer < orgvalue than a wraparound occurred.

JAL

Craze Frog · Post by **Craze Frog** » Mon Mar 30, 2009 1:51 pm

But seriously, if the second value is smaller than the original value, you know there's been a wrap around

Yes, but what do I use that information for? Wrapping isn't illegal, just look at the following situation:

The timer wraps at 100. Currently it is 80. I want to wait for 50 ticks, so I set the comparator to 30. But before I could set the comparator to 30, SMM ran for 40 ticks, so the timer is 20 when the comparator is set to 30. Now everything is really OK, but it could be detected as a failure condition.

Also, there is also a race condition when checking if things were set correctly, which hasn't been considered yet. Look at your algorithm:
orgvalue = time;
newvalue = orgvalue + 10;
comparator = newvalue;
<<< What if the wraparound happens here? >>>
if timer < orgvalue then a wraparound occurred.
The wraparound would be detected even if the wraparound occured AFTER the comparator was correctly set.

The only way I can think of at the moment is to have a global variable that lets you know whether the interrupt actually has occured (let it be set before setting the timer and cleared in the interrupt handler). Now, if the seemed to be failure condition on setting the timer, enable interrupts, disable interrupts and check if the variable got cleared even though there was a failure (this would be because the overrun happened after the comparator was set). If the variable was still set, re-set the timer (or jump directly to the interrupt handler now that we're on overtime anyways).

JamesM · Post by **JamesM** » Mon Mar 30, 2009 10:12 pm

Hi,

Admittedly I haven't read anything about the HPET so please take my suggestion with a pinch of salt / flame a correction, however...

In some other systems with count/compare timers (I'm talking here Local APIC and the MIPS count/compare system) it is possible to write to the count register (i.e. set it). So instead of incrementing the compare register by $x every interrupt, and have to deal with wraparound etc., It's possible to have compare set statically, and just reset the count register to 0 on every interrupt. That way, you lose the race condition you mention.

Whether that's at all possible with the HPET timer I have no idea, I just thought I might mention it.

Cheers,

James

Craze Frog · Post by **Craze Frog** » Tue Mar 31, 2009 3:06 am

Unfortunately, it's not possible to write to the count register.

jal · Post by **jal** » Tue Mar 31, 2009 4:40 am

Craze Frog wrote:The timer wraps at 100. Currently it is 80.

The timer wraps at 2^32, or 2^64 in 64-bit mode. Since you already mentioned it takes 5 minutes for a full wrap in 32 bit mode, and since SMM code will never, ever, ever take that long, I still really don't see the problem.

JAL

Craze Frog · Post by **Craze Frog** » Tue Mar 31, 2009 8:21 am

jal wrote:
Craze Frog wrote:The timer wraps at 100. Currently it is 80.
The timer wraps at 2^32, or 2^64 in 64-bit mode. Since you already mentioned it takes 5 minutes for a full wrap in 32 bit mode, and since SMM code will never, ever, ever take that long, I still really don't see the problem.

JAL

Which problem is it that you don't see? SMM doesn't have to run for five minutes to make the counter wrap.

PS. The linux kernel developers don't see the problem either, which is why the linux kernel may randomly hang for 5 minutes under "unfortunate" workloads. http://mywiki.ncsa.uiuc.edu/wiki/VMware ... ystem_Hang

jal · Post by **jal** » Tue Mar 31, 2009 1:08 pm

Craze Frog wrote:Which problem is it that you don't see? SMM doesn't have to run for five minutes to make the counter wrap.

No, but it would need to run for five minutes to make it wrap twice. Therefore, even if you do not expect a wrap, if a wrap happened, SMM is to blame. Not too difficult to check then.

JAL

Brendan · Post by **Brendan** » Tue Mar 31, 2009 11:32 pm

Hi,

jal wrote:No, but it would need to run for five minutes to make it wrap twice. Therefore, even if you do not expect a wrap, if a wrap happened, SMM is to blame. Not too difficult to check then.

Imagine you want to setup the comparator so that an IRQ occurs in 16 ticks, and the current time is 0xFFFFFFF8. In this case simple code (like the code I posted) will fail. For example:

Code: Select all

    delay = 16;

    for(;;) {
        cli
        newvalue = timer + delay;            //newvalue = 0xFFFFFFF8 + 16 = 0x00000008
        comparator = newvalue;               //comparator = 0x00000008
        if(timer < newvalue) break;   // Comparator must have been set in time *WRONG*
        if(interrupt_status_flag == sent) break;   // Comparator must have been set in time
        // Comparator wasn't set in time, so retry
        delay = 1;
        sti
    }

To avoid this problem, I should've written something like:

Code: Select all

    delay = ?;

    for(;;) {
        cli
        starttime = timer;
        comparator = starttime + delay;
        elapsedtime = timer - starttime;
        if(esapsedtime < delay) break;   // Comparator must have been set in time
        if(interrupt_status_flag == sent) break;   // Comparator must have been set in time
        // Comparator wasn't set in time, so retry
        delay = 1;
        sti
    }

In this case the variables you use must be the same size as the counter (e.g. unsigned 32-bit or 64-bit) so that "elapsedtime = timer - starttime" will be correct despite overflows (for e.g. "0x00000001 - 0xFFFFFFF8 = 0x00000009").

Cheers,

Brendan

Craze Frog · Post by **Craze Frog** » Wed Apr 01, 2009 12:57 pm

Brendan, the problem with your second code is in this line:

Code: Select all

if(esapsedtime < delay) break;   // Comparator must have been set in time

The comment is correct, but it doesn't hold the other way. That is, the comparator could very well have been set in time even if elapsedtime > delay if SMM happened to run after the comparator was correctly set, but before the timer was read again for checkup.

I'm not taking the interrupt status thing into account here.

Maybe I'll just give up on this, it seems like a waste of time <-pun

.

Brendan · Post by **Brendan** » Thu Apr 02, 2009 12:15 am

Hi,

Craze Frog wrote:Brendan, the problem with your second code is in this line:
Code: Select all
if(esapsedtime < delay) break;   // Comparator must have been set in time
The comment is correct, but it doesn't hold the other way. That is, the comparator could very well have been set in time even if elapsedtime > delay if SMM happened to run after the comparator was correctly set, but before the timer was read again for checkup.

I'm not taking the interrupt status thing into account here.

If the comparator was set in time but "esapsedtime >= delay", then the HPET would've tried to send an IRQ and (for single CPU systems) the interrupt status flag will be set. The only problem is if the HPET sends an IRQ and the IRQ handler clears the interrupt status flag, which can't happen on a single-CPU system with interrupts disabled.

For multi-CPU the code I posted doesn't work - a different CPU (with interrupts enabled) could have serviced the interrupt and cleared the interrupt status flag before the first CPU reaches the "if(interrupt_status_flag == sent)" line. In this case you'd need something to determine if the IRQ was serviced - for e.g. make the IRQ handler increment a "HPET_IRQs_serviced" counter before it does anything to clear the interrupt status flag in the HPET, then add an extra check in the code to setup the comparator. For e.g.:

Code: Select all

    delay = ?;

    for(;;) {
        cli;
        startIRQ = HPET_IRQ_counter;
        starttime = timer;
        comparator = starttime + delay;
        elapsedtime = timer - starttime;
        if(esapsedtime < delay) break;   // Comparator must have been set in time (IRQ not sent yet)
        if(interrupt_status_flag == sent) break;   // Comparator must have been set in time (IRQ sent but not handled yet)
        if(HPET_IRQ_counter != startIRQ) break;   // Comparator must have been set in time (IRQ sent and handled)
        // Comparator wasn't set in time, so retry
        delay = 1;
        sti;
    }

It'd be easier if you didn't need the IRQ handler at all. For example, if you want to execute a function after a certain delay and could call that function directly or from inside the IRQ handler. In this case you could:

Code: Select all

    delay = ?;

        cli;
        startIRQ = HPET_IRQ_counter;
        starttime = timer;
        comparator = starttime + delay;
        elapsedtime = timer - starttime;
        if( (esapsedtime >= delay) && (interrupt_status_flag != sent) && (HPET_IRQ_counter != startIRQ) ) {
            sti;
            do_timer_thing();
        }
        sti;

Or alternatively:

Code: Select all

    delay = ?;

    if(delay < FOO) {
        starttime = timer;
        endtime = starttime + delay;
        if(endtime < starttime) {       // If we need to wait for timer to overflow
            while(timer > 0x80000000) { /* PAUSE? */ }
        }
        while(timer < endtime) { /* PAUSE? */ }
        do_timer_thing();
    } else {
        cli;
        starttime = timer;
        comparator = starttime + delay;
        sti;
    }

However, in this last case "FOO" would need to be large enough to avoid problems with SMM, but small enough to avoid wasting too much time polling. Probably the best code for performance would be a mixture:

Code: Select all

    delay = ?;

    if(delay < FOO) {
        starttime = timer;
        endtime = starttime + delay;
        if(endtime < starttime) {       // If we need to wait for timer to overflow
            while(timer > 0x80000000) { /* PAUSE? */ }
        }
        while(timer < endtime) { /* PAUSE? */ }
        do_timer_thing();
    } else {
        cli;
        startIRQ = HPET_IRQ_counter;
        starttime = timer;
        comparator = starttime + delay;
        elapsedtime = timer - starttime;
        if( (esapsedtime >= delay) && (interrupt_status_flag != sent) && (HPET_IRQ_counter != startIRQ) ) {
            sti;
            do_timer_thing();
        }
        sti;
    }

In this case FOO can be very small (set FOO so that time spent polling is aways less than the overhead of handling an IRQ).

Also, you'd want to look into quantizing effects - for e.g. if you ask for a 450 ns delay and HPET's main counter is incremented every 100 ns, then (normally) you'd want to make sure that the delay is at least 450 nS and you'd have to settle for a delay that's actually between 500 ns and 600 ns (because you don't want a delay that's actually between 400 ns and 500 ns, as you can't guarantee that the delay is long enough). Note: for HPET the main clock is 10 MHz or faster, so each "tick" is 100 ns or less.

Cheers,

Brendan

OSDev.org

HPET problem

HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem

Re: HPET problem