Theory on clocks in hardware and software
Posted: Mon Dec 26, 2022 6:45 am
Hello everyone, and belated merry Christmas.
I recently looked over the list of software clocks implemented in Linux and wondered how they managed to bugger up the concept of "monotonic time" that badly. Looking at the clock_gettime manpage, I see no fewer than five clocks that ought to be the same as the monotonic clock, but aren't, because the abstractions keep leaking.
The objective of a program looking at the clock has always been one of two things: figuring out how much time has passed since the last time it looked at the clock, or figuring out what time to display to the user. Therefore it simply does not make sense to me to have a monotonic clock that doesn't count the time the system was suspended. My OS will therefore have only two clocks for telling the time, a monotonic clock and a realtime clock. This is the software interface presented to programs.
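To make that concrete, here is a rough sketch of the userspace-facing side in C. Nothing here is final; the names deliberately echo the POSIX ones, but they are just placeholders.
Code: Select all
#include <stdint.h>

/* The only two clocks programs ever see. */
enum clock_id {
    CLOCK_MONOTONIC,  /* nanoseconds since boot, including time spent suspended */
    CLOCK_REALTIME,   /* the monotonic clock plus a settable wall-clock offset */
};

/* Returns the requested clock's current value in nanoseconds. */
uint64_t clock_get(enum clock_id which);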
My OS kernel will have a class of device drivers called hardware clocks. These drivers will know the frequency of the clock, the precision of the clock, some quirks of the clock, and of course a method to read the clock. That method returns a non-overflowing 64-bit counter (non-overflowing, of course, because a 64-bit counter starting at zero and running at a frequency of less than ca. 10 GHz will not overflow in any sensible human time frame). I would have to look into sensible implementations of that driver interface on the PC, but the TSC, the HPET, and the RTC all seem like they could serve as backends. If all else fails, the driver can count timer interrupts, but I really don't want to make that the norm.
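A rough sketch of that driver interface; the field and function names are placeholders, and quirk handling is reduced to a bitmask plus a flag for the sake of the example.
Code: Select all
#include <stdint.h>
#include <stdbool.h>

/* One instance per hardware timer the kernel knows about. */
struct hw_clock {
    const char *name;             /* e.g. "tsc", "hpet", "rtc" */
    uint64_t    frequency_hz;     /* nominal tick rate of the counter */
    uint64_t    precision_ns;     /* smallest time step it can actually resolve */
    uint32_t    quirks;           /* bitmask of known quirks */
    bool        runs_in_suspend;  /* keeps counting while the machine sleeps */
    /* Returns a non-overflowing 64-bit tick count, starting at zero. */
    uint64_t  (*read)(struct hw_clock *clk);
};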
At boot time the kernel will then select the best available hardware timer (probably the one with the best precision) and present as monotonic time the value of that timer after passing it through a linear function. Since the output is supposed to be the time in nanoseconds, the initial factor will be the period of the timer (the reciprocal of its frequency) in nanoseconds, and the initial offset will be zero.
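Putting the last two paragraphs together, the monotonic clock is then little more than the following sketch. It builds on the hw_clock struct above; I use a 32.32 fixed-point factor rather than floating point so that sub-nanosecond periods don't collapse to zero, but that is an implementation detail, not something I'm committed to.
Code: Select all
#include <stdint.h>

/* Monotonic time: y = factor * x + offset_ns, where x is the raw tick count. */
struct mono_clock {
    struct hw_clock *hw;
    uint64_t factor_fp32;  /* timer period in ns, 32.32 fixed point */
    int64_t  offset_ns;    /* added after scaling; starts at zero */
};

static void mono_init(struct mono_clock *mc, struct hw_clock *hw)
{
    mc->hw = hw;
    /* period = 1e9 / frequency, shifted left 32 bits for the fixed-point factor */
    mc->factor_fp32 = (uint64_t)(((__uint128_t)1000000000u << 32) / hw->frequency_hz);
    mc->offset_ns = 0;
}

static uint64_t mono_now_ns(struct mono_clock *mc)
{
    uint64_t ticks = mc->hw->read(mc->hw);
    return (uint64_t)(((__uint128_t)ticks * mc->factor_fp32) >> 32) + mc->offset_ns;
}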
An NTP client might want to adjust the tick rate. I consider the frequency of the hardware timer to be fixed, but what can be done is to change the factor of the linear function. However, if the factor is simply lowered, the next value seen by userspace could be lower than one it has already seen. That must not happen; two reads of the monotonic clock that are ordered by some other means must return increasing values. Therefore, when the factor is changed to a new period T', I will read the current value of the hardware timer into x0 and the current monotonic time into y0, and set the function henceforth to:
Code: Select all
y = T' (x - x0) + y0 = T' x - T' x0 + y0
Oh, look at that. The new function is also a linear function, with the new period T' as factor and y0 - T' x0 as offset.
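As a sketch, continuing with the names from the snippets above (locking against concurrent readers is left out):
Code: Select all
/* Change the period of the monotonic clock to new_factor_fp32 (ns per tick,
 * 32.32 fixed point) without ever letting its value jump backwards: re-anchor
 * the linear function at the current point (x0, y0). */
static void mono_set_factor(struct mono_clock *mc, uint64_t new_factor_fp32)
{
    uint64_t x0 = mc->hw->read(mc->hw);  /* raw tick count right now */
    uint64_t y0 = (uint64_t)(((__uint128_t)x0 * mc->factor_fp32) >> 32) + mc->offset_ns;

    /* y = T' (x - x0) + y0  =  T' x + (y0 - T' x0) */
    mc->factor_fp32 = new_factor_fp32;
    mc->offset_ns   = (int64_t)(y0 - (uint64_t)(((__uint128_t)x0 * new_factor_fp32) >> 32));
}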
As long as T' is positive, the function will always increase this way. I may implement a floor on T', so that it cannot be set nonsensically low; we'll see.
The realtime clock presented to userspace is simply a constant offset from the monotonic clock; setting the realtime clock means setting a new offset. Leap seconds I really don't want to bother with too much, so the NTP client can simply set the realtime clock back by one second at some point during the leap second. That is the most any application should ever need.
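In the same sketch, the realtime clock is one extra offset on top of the monotonic clock:
Code: Select all
/* Wall-clock time is the monotonic clock plus a single settable offset. */
static int64_t realtime_offset_ns;  /* e.g. initialised from the RTC at boot */

static uint64_t realtime_now_ns(struct mono_clock *mc)
{
    return mono_now_ns(mc) + realtime_offset_ns;
}

/* Setting the realtime clock (including stepping it back one second during a
 * leap second) only ever touches this offset, never the monotonic clock. */
static void realtime_set_ns(struct mono_clock *mc, uint64_t wall_ns)
{
    realtime_offset_ns = (int64_t)(wall_ns - mono_now_ns(mc));
}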
As for interrupting timers: I thought I would simply have a priority queue of all the things that need doing at some point, ordered by deadline. Whenever something changes about the priority queue, the timer interrupt is set for the next deadline, and all timers with expired deadlines are run and removed from the queue. This includes things like scheduling the next task: I would register a timer for that when (re-)starting a user task while other user tasks are available for the CPU (otherwise no time limit is needed, right?).
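Roughly like this; the min-heap itself and arm_timer_interrupt() are assumed helpers, and in this sketch I run the expired timers first and then re-arm for the earliest remaining deadline, which amounts to the same thing.
Code: Select all
#include <stdint.h>
#include <stddef.h>

struct timer {
    uint64_t deadline_ns;               /* absolute monotonic time */
    void   (*callback)(struct timer *t);
};

/* Assumed helpers: a min-heap of timers keyed on deadline_ns, plus a routine
 * that programs the hardware timer interrupt for an absolute deadline. */
struct timer *timer_queue_peek(void);
struct timer *timer_queue_pop(void);
void          arm_timer_interrupt(uint64_t deadline_ns);

/* Called whenever the queue changes, and from the timer interrupt handler. */
static void timer_queue_update(uint64_t now_ns)
{
    struct timer *t;

    /* Run and remove everything whose deadline has already passed... */
    while ((t = timer_queue_peek()) != NULL && t->deadline_ns <= now_ns) {
        timer_queue_pop();
        t->callback(t);
    }
    /* ...then arm the interrupt for the earliest remaining deadline, if any. */
    if ((t = timer_queue_peek()) != NULL)
        arm_timer_interrupt(t->deadline_ns);
}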
Oh yeah, suspend mode. When suspending the machine, if the main hardware timer does not keep running in suspend, I would look for another hardware timer that does; for that purpose, its precision doesn't matter. If no such timer exists, then obviously I cannot gauge the time spent suspended. But if I have such a timer, I can read it before suspending and again after resuming, and increase the offset of the monotonic clock function by the time spent in suspend mode.
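Sketched out, with find_suspend_capable_clock() as an assumed helper on top of the hw_clock struct from above:
Code: Select all
/* A timer that keeps counting across suspend, if one was found, and its value
 * just before the machine went to sleep. */
static struct hw_clock *suspend_clk;
static uint64_t         suspend_ticks;

struct hw_clock *find_suspend_capable_clock(void);  /* assumed helper */

static void clock_prepare_suspend(struct mono_clock *mc)
{
    suspend_clk = NULL;
    if (!mc->hw->runs_in_suspend) {
        suspend_clk = find_suspend_capable_clock();  /* precision is irrelevant here */
        if (suspend_clk)
            suspend_ticks = suspend_clk->read(suspend_clk);
    }
}

static void clock_resume(struct mono_clock *mc)
{
    if (!suspend_clk)
        return;  /* either the main timer kept running, or we simply can't tell */

    uint64_t elapsed = suspend_clk->read(suspend_clk) - suspend_ticks;
    uint64_t elapsed_ns =
        (uint64_t)(((__uint128_t)elapsed * 1000000000u) / suspend_clk->frequency_hz);

    /* Credit the suspended time to the monotonic clock. */
    mc->offset_ns += (int64_t)elapsed_ns;
}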
OK, did I forget anything?