Best way to manage PIT
- ScropTheOSAdventurer
- Member
- Posts: 86
- Joined: Sun Aug 25, 2013 5:47 pm
- Location: Nebraska, USA
Hello! I added PIT support (shortly after getting interrupts online), and then built a higher-level timing system on top of it that lets me have multiple virtual timers running at any given time. It works like this: I keep a linked list of timers, and whenever the PIT interrupt fires (about every 10 milliseconds, since I set it to 100 Hz), every timer is checked (as long as one isn't being added or removed) to see whether its "tick" is at or above the "limit" set for that specific timer. If it is, the callback specified for that timer is fired and the tick is reset to 0. Either way, the tick is then incremented afterwards.
My question is simple: when I start using the PIT for serious work, such as task switching (in terms of importance), should I go with a less fragile approach, such as an array of timers where each timer has a unique ID, or a hardwired approach where my PIT code just calls a single callback that does everything I want? Or is it just a matter of taste?
"Procrastination is the art of keeping up with yesterday."
Re: Best way to manage PIT
Hi,
ScropTheOSAdventurer wrote:Hello! I added PIT support (shortly after getting interrupts online), and then built a higher-level timing system on top of it that lets me have multiple virtual timers running at any given time. It works like this: I keep a linked list of timers, and whenever the PIT interrupt fires (about every 10 milliseconds, since I set it to 100 Hz), every timer is checked (as long as one isn't being added or removed) to see whether its "tick" is at or above the "limit" set for that specific timer. If it is, the callback specified for that timer is fired and the tick is reset to 0. Either way, the tick is then incremented afterwards.
My question is simple: when I start using the PIT for serious work, such as task switching (in terms of importance), should I go with a less fragile approach, such as an array of timers where each timer has a unique ID, or a hardwired approach where my PIT code just calls a single callback that does everything I want? Or is it just a matter of taste?
Typically the goal isn't to emulate many separate timers (e.g. each with their own count, frequency, etc), but to use the same timer for any number of independent delays, such that when any of the delays expire you call its callback and remove that delay completely (and don't reset its count).
Also, you want a suitable abstraction so that the software using it doesn't need to care if the kernel is actually using the PIT, or HPET, or local APIC timer or something else.
Now; imagine the kernel keeps track of "number of nanoseconds since boot" using a 64-bit integer (so that it won't roll over for 584 years). In that case software could say "call this callback when number of nanoseconds since boot is greater than or equal to expiryTime" without caring what type of hardware is being used. Software can "expiryTime = now + delayLength"; but you can also have a function to convert a "calendar time" into nanoseconds since boot - e.g. maybe you want to call a function at 1:23 pm on the 4th of May in 2016, so you convert that into a "nanoseconds since boot" and create a delay that expires then. In the (rare) cases where something does want a callback that's called regularly they can just create a new delay when their old one expires, like this:
Code: Select all
#define DELAY 12345678  // Call the callback every 12345678 nanoseconds

uint64_t expiryTime;

void init(void) {
    expiryTime = getNanosecondSinceBoot() + DELAY;
    createDelay(myFunction, expiryTime);
}

void myFunction(void) {
    expiryTime += DELAY;
    createDelay(myFunction, expiryTime);
}
For "call this callback when number of nanoseconds since boot is greater than or equal to expiryTime"; you want to make sure the code that handles the timer can quickly find the delay/s that will expire the earliest (you don't want to be searching a large list within the timer's IRQ handler). This makes it easier/faster for your PIT IRQ handler to determine which delays have expired. It also makes it easy to use "one shot" mode in the PIT, where (after expired delays are handled) the IRQ handler determines how long until the earliest delay will expire and sets the PIT's count, so that the next IRQ will occur when the next delay expires (and so that there's no unnecessary IRQs in between). This method also means you can get about 1 us precision out of the PIT (instead of 10 ms precision).
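As a very rough sketch (pit_set_ns(), sendEOI() and PIT_MAX_NS are made-up helper names for this example - only getNanosecondSinceBoot() and the sorted-list idea come from above), the timer IRQ handler might look like:
Code: Select all
#include <stdint.h>

/* Assumed to exist elsewhere (hypothetical names, except the first one) */
extern uint64_t getNanosecondSinceBoot(void);
extern void pit_set_ns(uint64_t ns);   /* convert ns into a PIT count and program it */
extern void sendEOI(void);             /* acknowledge the IRQ at the PIC */

typedef struct delay {
    struct delay *next;                /* list kept sorted: earliest expiry first */
    uint64_t expiryTime;               /* nanoseconds since boot */
    void (*callback)(void);
} delay_t;

static delay_t *delayList;             /* head = the delay that expires first */

void timerIRQHandler(void) {
    uint64_t now = getNanosecondSinceBoot();

    /* Run and remove every delay that has already expired */
    while (delayList != NULL && delayList->expiryTime <= now) {
        delay_t *expired = delayList;
        delayList = delayList->next;
        expired->callback();           /* the delay is removed, not reset */
    }

    /* Program the next one-shot count so the next IRQ arrives when the
       earliest remaining delay expires (no unnecessary IRQs in between) */
    if (delayList != NULL) {
        pit_set_ns(delayList->expiryTime - now);
    } else {
        pit_set_ns(PIT_MAX_NS);        /* nothing pending; just use the maximum count */
    }

    sendEOI();
}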
Finally; for some things (e.g. lots of time-outs used in networking) you don't care much how precise the delay is and you can combine delay expiry times to improve efficiency. For example, if the scheduler has asked for a delay that expires in exactly 12345 ns and there's a TCP/IP delay that expires in approximately 10000 ns, then you can combine them so that CPU is interrupted once for both (after exactly 12345 ns) rather than being interrupted twice (once after 10000 ns and then again after 12345 ns).
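One (hypothetical) way to expose that is to let callers say how much imprecision they can tolerate - findExpiryInRange() below is an assumed helper that searches the pending delays, and createDelay() is the same call used in the earlier example (assumed signature):
Code: Select all
#include <stdint.h>

extern void createDelay(void (*callback)(void), uint64_t expiryTime);       /* assumed signature */
extern int  findExpiryInRange(uint64_t min, uint64_t max, uint64_t *found); /* assumed helper */

/* "Call this any time between expiryTime and expiryTime + tolerance" - lets an
   imprecise time-out piggyback on an IRQ that's already going to happen anyway. */
void createDelayWithTolerance(void (*callback)(void),
                              uint64_t expiryTime, uint64_t tolerance) {
    uint64_t existing;

    if (findExpiryInRange(expiryTime, expiryTime + tolerance, &existing)) {
        createDelay(callback, existing);    /* share the existing IRQ */
    } else {
        createDelay(callback, expiryTime);  /* nothing nearby; expire on time */
    }
}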
To do all of this I'd use buckets. For an example, imagine if you've got 256 buckets and do "bucketNumber = (expiryTime / 100000) & 0xFF;" to determine which bucket a delay should be put into. Then every 100000 ns you'd get the next bucket, find the delays in that bucket that expire in the next 100000 ns and remove them, then sort them to create a sorted list that your timer IRQ handler can use for the next 100000 ns. This means that you only sort a small number of the total delays at a time, and only do the sorting when it's necessary. It also means that if delay/s are cancelled before they get close to expiring then they get removed before you spend CPU time sorting them and not after; which is important because (on a system with decent networking load) a lot of the delays are networking time-outs and almost all of them are cancelled before they expire.
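Continuing the earlier sketch (same delay_t and delayList; insertSorted() is an assumed helper that inserts a node into the sorted list), the bucket side might look like:
Code: Select all
#define BUCKET_NS    100000ULL              /* each bucket covers 100000 ns */
#define BUCKET_COUNT 256

static delay_t *bucket[BUCKET_COUNT];       /* unsorted per-bucket lists */

extern void insertSorted(delay_t **list, delay_t *d);  /* assumed helper */

/* Adding a delay is cheap: work out its bucket and push it on the front */
void addDelay(delay_t *d) {
    uint64_t bucketNumber = (d->expiryTime / BUCKET_NS) & (BUCKET_COUNT - 1);
    d->next = bucket[bucketNumber];
    bucket[bucketNumber] = d;
}

/* Called once every 100000 ns: pull out the delays that expire during the
   upcoming period and sort them into the IRQ handler's list; anything that
   wrapped around (expires 256 or more periods later) stays in its bucket. */
void prepareNextPeriod(uint64_t periodStart) {
    uint64_t bucketNumber = (periodStart / BUCKET_NS) & (BUCKET_COUNT - 1);
    delay_t **link = &bucket[bucketNumber];

    while (*link != NULL) {
        delay_t *d = *link;
        if (d->expiryTime < periodStart + BUCKET_NS) {
            *link = d->next;                /* unlink it from the bucket... */
            insertSorted(&delayList, d);    /* ...and into the sorted list */
        } else {
            link = &d->next;                /* leave it for a later lap */
        }
    }
}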
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
- ScropTheOSAdventurer
- Member
- Posts: 86
- Joined: Sun Aug 25, 2013 5:47 pm
- Location: Nebraska, USA
Re: Best way to manage PIT
In terms of abstraction, in my HAL, I have "Drivers" and "Interactors." The "Drivers" do the dirty work of dealing with the hardware. The "Interactors" provide an abstraction interface between the kernel and the "Drivers." I created a PIT "Driver" and then added my timer "Interactor" for it, with the latter managing the actual timer system I spoke of. So, I could initialize a timer in my kernel like so:
Code: Select all
Timer_init(); /* Initializes the PIT and gets the system ready. However, this is done in my HAL initialization. */
Timer thetimer;
thetimer.tick = 0; /* Play it safe and initialize the tick to zero. */
thetimer.limit = 100; /* Roll over and call the callback once a second or so (one tick = about 10 milliseconds). */
thetimer.func = TestCallback;
Timer_Register_Timer(&thetimer); /* Enroll it in the timing system */
/* Some time later..... */
Timer_DeRegister_Timer(&thetimer); /* Take it out */
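Roughly, the split could be declared something like this (simplified for illustration - the Timer fields match the code above, but the TimerDriver shape and setTickHandler() are just a sketch):
Code: Select all
#include <stdint.h>

/* The Timer objects the kernel hands to the "Interactor" (field names taken
   from the example above; the exact struct layout is only a sketch) */
typedef struct Timer {
    uint32_t tick;                   /* incremented on every PIT IRQ */
    uint32_t limit;                  /* call func when tick reaches this */
    void (*func)(void);
    struct Timer *next;              /* linked into the Interactor's list */
} Timer;

/* "Driver": the only layer that knows the hardware is really a PIT */
typedef struct {
    void (*init)(uint32_t frequency);              /* program the hardware */
    void (*setTickHandler)(void (*onTick)(void));  /* hook the timer IRQ */
} TimerDriver;

/* "Interactor": what the rest of the kernel calls; Timer_init() picks a
   TimerDriver, hooks onTick, and walks the Timer list on every tick */
void Timer_init(void);
void Timer_Register_Timer(Timer *t);
void Timer_DeRegister_Timer(Timer *t);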
The idea you floated (especially with enabling "one shot" mode) sounds enticing to me. I'll take a look at it! Thanks!
"Procrastination is the art of keeping up with yesterday."
Re: Best way to manage PIT
Hi,
ScropTheOSAdventurer wrote:In terms of abstraction, in my HAL, I have "Drivers" and "Interactors." The "Drivers" do the dirty work of dealing with the hardware. The "Interactors" provide an abstraction interface between the kernel and the "Drivers." I created a PIT "Driver" and then added my timer "Interactor" for it, with the latter managing the actual timer system I spoke of. So, I could initialize a timer in my kernel like so:
Yes; but it can be tricky to design an abstraction that suits many different devices. For example; for recent 80x86 CPUs the local APIC timer has a "TSC deadline" mode (and the TSC runs at a fixed frequency that's typically 1 GHz or faster). This is the best possible timing hardware on 80x86 - e.g. it has low overhead, extremely good (1 ns or better) precision, and (for multi-CPU) it solves scalability problems because there's a local APIC timer in each CPU. However, it's very much like "one shot" mode - when the timer's IRQ occurs you set the next deadline.
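As a rough sketch of what driving TSC-deadline mode involves (lapicWrite() and wrmsr() are assumed helpers; the LVT Timer offset and the MSR number are the standard ones from Intel's manuals; check CPUID.01H:ECX bit 24 for support before relying on it):
Code: Select all
#include <stdint.h>

extern void lapicWrite(uint32_t reg, uint32_t value);  /* assumed local APIC MMIO helper */
extern void wrmsr(uint32_t msr, uint64_t value);       /* assumed MSR write helper */

#define IA32_TSC_DEADLINE 0x6E0      /* MSR holding the target TSC value */
#define LVT_TIMER         0x320      /* local APIC LVT Timer register offset */
#define LVT_TSC_DEADLINE  (2 << 17)  /* timer mode field = TSC-deadline */

/* Done once per CPU: route the timer IRQ to 'vector' in TSC-deadline mode */
void lapicTimerInitDeadline(uint8_t vector) {
    lapicWrite(LVT_TIMER, LVT_TSC_DEADLINE | (uint32_t)vector);
}

/* Done from the timer IRQ handler: arm the next IRQ; it fires as soon as the
   TSC passes the written value (writing 0 disarms the timer instead) */
void lapicTimerSetDeadline(uint64_t tscDeadline) {
    wrmsr(IA32_TSC_DEADLINE, tscDeadline);
}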
Now, if you use timers in periodic ("fixed frequency") mode, then it ends up being a compromise between precision and overhead - to improve precision you have to make overhead worse (increase the number of IRQs per second) and to improve overhead you have to make precision worse (decrease the number of IRQs per second) - you can't have both. For example, you could set up the local APIC to generate 1 million IRQs per second (thousands of times more overhead than "TSC deadline mode") and you'd get 1 us precision (a thousand times less precision than "TSC deadline mode").
ScropTheOSAdventurer wrote:The idea you floated (especially with enabling "one shot" mode) sounds enticing to me. I'll take a look at it! Thanks!
There's one thing I probably should mention. For PIT, in "one shot" mode you have to set a new count every time the IRQ occurs, which means doing (relatively slow) IO port writes. If the PIT is in the "low byte then high byte" access mode you'd need to do 2 IO port writes to set the count every IRQ, and (for overhead) this isn't so nice. Basically, if you do 3 IO port writes (2 for PIT count and one for PIC chip EOI) and they cost a total of 3 us of overhead, then the minimum time between IRQs would be 3 us.
Instead, you can put the PIT into "low byte only" access mode or "high byte only" access mode so that you only need to do one IO port write to set the new count. In this case you end up with 2 IO port writes (one to set PIT count and one for PIC chip EOI) and end up with a minimum time between IRQs of 2 us.
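Concretely, the "low byte only" setup might look something like this (outb() is an assumed port-IO helper; the ports and control word follow the standard 8253/8254 and 8259 layout, using mode 0 for the one-shot behaviour):
Code: Select all
#include <stdint.h>

extern void outb(uint16_t port, uint8_t value);   /* assumed port-IO helper */

#define PIT_CMD  0x43   /* PIT mode/command register */
#define PIT_CH0  0x40   /* PIT channel 0 data port */
#define PIC1_CMD 0x20   /* master PIC command port */
#define PIC_EOI  0x20   /* end-of-interrupt command */

/* Done once: channel 0, "low byte only" access, mode 0 (interrupt on terminal
   count), binary - the access mode sticks, so every later reload is a single
   write to the data port. */
void pitInitOneShotLowByte(void) {
    outb(PIT_CMD, 0x10);            /* 00 01 000 0 */
}

/* Done on every timer IRQ: the 2 IO port writes mentioned above */
void pitRearm(uint8_t count) {      /* count in ~838 ns PIT ticks, max 0xFF */
    outb(PIT_CH0, count);           /* write #1: new one-shot count */
    outb(PIC1_CMD, PIC_EOI);        /* write #2: acknowledge the IRQ */
}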
For the "low byte only" access mode you get about 838 ns precision out of it, but the maximum count becomes 0x00FF so if no IRQ is needed you'd set the count to max. and end with a maximum time between IRQs of 213 us. This means you get a minimum of 4479 IRQs per second (whether you want the IRQs or not), even though you're using "one shot" mode. For 2 us of overhead every IRQ with a minimum of 4479 IRQs per second, you get a best case of 8958 us of overhead per second. Of course the worst case overhead is an IRQ (2 us of overhead) every 2 us, or 100%.
For the "high byte only" access mode you'd get about 213 us precision out of it, and the maximum count becomes 0xFF00, the maximum time between IRQs is about 54.71 ms. This means you get a minimum of 18.2 IRQs per second (whether you want the IRQs or not), even though you're using "one shot" mode. For 2 us of overhead every IRQ with a minimum of 18.2 IRQs per second, you get a best case of 36.4 us of overhead per second. Because the precision is so much less, the minimum time between IRQs becomes 213 us too; and in that case the worst case overhead is an IRQ (2 us of overhead) every 213 us, or 0.94%.
Basically, the choices are:
- low byte then high byte - 838 ns precision, minimum of 18.2 IRQs per second, 3 us minimum time between IRQs, best case of 54.6 us of overhead per second (0.00546%), worst case overhead is 100% (minimum time between IRQs equal to overhead per IRQ)
- low byte only - 838 ns precision, minimum of about 4679 IRQs per second, 2 us minimum time between IRQs, best case of about 9358 us of overhead per second (0.94%), worst case overhead is 100%
- high byte only - 213 us precision, minimum of 18.2 IRQs per second, 213 us minimum time between IRQs, best case of 36.4 us of overhead per second (0.00364%), worst case overhead is 0.94% (minimum time between IRQs much larger than overhead per IRQ)
- 100 Hz fixed frequency - 10 ms precision, 100 us of overhead per second (0.01%)
- 1000 Hz fixed frequency - 1 ms precision, 1000 us of overhead per second (0.1%)
- 4679 Hz fixed frequency - 213 us precision, 4679 us of overhead per second (0.468%)
- 596591 Hz fixed frequency - 1.676 us precision, 596591 us of overhead per second (59.6%)
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.