Processor P-states and power management

rdos · Post by **rdos** » Mon Jan 09, 2012 7:47 am

Brendan wrote:If they want to run something like this in the background, can they do it without your OS taking them to P0 permanently?

It won't and shouldn't run on RDOS anyway, so what's the big deal? I don't want a Windoze or Linux clone that runs lots of things I've not told it to run. I want the machine to run exactly what I tell it to run, nothing else.

Brendan wrote:
rdos wrote:I don't predict load, I measure it.
Wow - I wish I could predict the load of unknown processes at unknown times on unknown hardware.

As I wrote, I measure load and adjust the P-state according to the current load, so there is no prediction involved.

Brendan wrote:
rdos wrote:No, because you could do the same job at the same performance with lower power, less temperature and noise just by selecting optimal P-states. If you want longer battery life or less noise you could just tweak the parameters and select lower P-states that don't generate the same performance. It's not a major redesign, just some parameter changes. However, the default would be to keep performance at the lowest possible power.
You can't do the same job at the same performance with lower power. You can only make a compromise between performance and power. For high priority tasks you want high performance, for low priority tasks you want lower power. Surely you can see that for medium priority tasks you want something in between?

Wrong. Because the relationship between frequency / performance and power consumption is not linear, I can do just that.

Example (from AMD Athlon):
P0 runs at 3000MHz, and consumes 125W
P5 runs at 2000MHz, and consumes 60W

If I have 30% load at P0, that corresponds to 3000/2000 * 30% = 45% load at P5. Now let's say that when the system is idle, it consumes negligible power. In the P0 state, you consume 125W during 30% of the time = 37.5W. In P5, you consume 60W during 45% of the time = 27W. Since the clocks are at a higher frequency when the processor is idle, and the idle time is longer at P0, it will consume more power in P0 also during idle time, so the 10.5W difference is a minimum value.
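
The arithmetic above can be checked with a short sketch. It makes the same assumptions as the post (idle power is negligible, and the time needed for a fixed amount of work scales inversely with clock frequency). The function names are purely illustrative and are not part of RDOS.

```python
def scaled_load(load_at_ref, ref_mhz, target_mhz):
    """Load fraction at target_mhz for work that causes load_at_ref at ref_mhz.

    Assumes execution time scales inversely with clock frequency.
    """
    return load_at_ref * ref_mhz / target_mhz


def average_power(load, active_watts, idle_watts=0.0):
    """Time-averaged power for a core that is busy `load` of the time."""
    return load * active_watts + (1.0 - load) * idle_watts


# Figures quoted in the post (AMD Athlon example).
p0_load = 0.30
p5_load = scaled_load(p0_load, ref_mhz=3000, target_mhz=2000)

p0_watts = average_power(p0_load, active_watts=125)
p5_watts = average_power(p5_load, active_watts=60)

print(f"P5 load: {p5_load:.0%}")
print(f"P0: {p0_watts:.1f} W, P5: {p5_watts:.1f} W, "
      f"difference: {p0_watts - p5_watts:.1f} W")
```

Passing a non-zero `idle_watts` shows how the comparison shifts once idle power is no longer treated as negligible.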

Brendan wrote:Users don't need to select priorities (although it'd be nice if they could if/when they want to). Software should tell the scheduler what it wants. A thread that's responsible for updating the user interface should be relatively high priority, a thread that does spell checking while the user types could be medium priority, a thread that regenerates search indexes could be low priority. Whoever wrote the code can use reasonable defaults.

Realistically, everyone wants their code to perform well, so you will probably see an inflation in priorities.

Solar · Post by **Solar** » Mon Jan 09, 2012 8:46 am

rdos wrote:Realistically, everyone wants their code to perform well, so you will probably see an inflation in priorities.

Have you been out of your office in the last thirty years or so?

Virtually every system out there allows setting the process priority. Some (like Linux) only allow decreasing the priority. Others (like AmigaOS) also allow increasing the priority.(*) Generally speaking, evolution takes place. People writing calculation-intensive apps that don't reduce their priority and bog down the system into unresponsiveness will find that people would rather use other apps. Tasks with lots of user interaction that reduce their priority so they feel sluggish will find that people would rather use other apps.

It works, it works well, and it worked back in the '70s already.

You were the one calling for applications to behave cooperatively. But you don't believe in them handling priorities responsibly?

Your statements of system policy are so full of self-contradiction that it's hard to take them seriously. You give the impression of someone who makes it up as he goes along, with no thought for concept, design, or future.

(*): Yes, you could arbitrarily set your task's priority. Few people ever made the mistake of using a priority >5 twice: The GUI ran at that priority, and going beyond that value meant you more or less locked down your system. Virtually every app around did settle for the default prio (0), and some (like rc5client or seti@home) went for -127. -128 was the idle thread. No inflation to be seen.

rdos · Post by **rdos** » Mon Jan 09, 2012 9:23 am

I know very well about priorities in GUIs like Windows, and that this has been around for a while. However, I'm not building a Windows-clone GUI, a Linux-clone GUI or anything like that so I don't see the relevance. I don't expect RDOS to ever run a standard GUI for a desktop PC. I expect it to run a few predefined "embedded" programs that are well-behaved, that use a micro-type GUI which isn't message-driven. It will definitely not run SETI, spy-programs, third-party plugins or anything like that. There is no sense in priorities in this setup. If something needs "realtime" performance, just start a thread with above normal priority. Above-normal priority in my system is not for better performing GUIs, it is for time-critical code.

Solar · Post by **Solar** » Mon Jan 09, 2012 9:39 am

rdos wrote:...all applications have the same priority.

rdos wrote:If something needs "realtime" performance, just start a thread with above normal priority.

rdos wrote:There is no sense in priorities in this setup.

Self-contradiction q.e.d.

rdos wrote:I know very well about priorities in GUIs like Windows, and that this has been around for a while.

Priorities have been around since punch-card times, i.e. they predate even text terminals. I mentioned anything GUI related simply for the sake of example.

rdos · Post by **rdos** » Mon Jan 09, 2012 10:04 am

Solar wrote:Priorities have been around since punch-card times, i.e. they predate even text terminals. I mentioned anything GUI related simply for the sake of example.

I HAVE priorities like the ones you had at "punch-card times", but I just don't see the point in applications being able to manipulate those. Windows has special "GUI priorities" in order for the GUI to perform well, which I see no reason to support since I have no Windows-like GUI. As I already indicated several times before in this thread, I only use priorities for time-critical code, not to boost user experience in a GUI, or to allow SETI to run in the background.

Brendan · Post by **Brendan** » Mon Jan 09, 2012 10:46 am

Hi,

rdos wrote:
Brendan wrote:If they want to run something like this in the background, can they do it without your OS taking them to P0 permanently?
It won't and shouldn't run on RDOS anyway, so what's the big deal? I don't want a Windoze or Linux clone that runs lots of things I've not told it to run. I want the machine to run exactly what I tell it to run, nothing else.

Oh, sorry - I forgot that your OS is only ever going to be useful for running ATM code that you write, due to too many excuses for intentionally poor design.

rdos wrote:
Brendan wrote:
rdos wrote:I don't predict load, I measure it.
Wow - I wish I could predict the load of unknown processes at unknown times on unknown hardware.
As I wrote, I measure load and adjust the P-state according to the current load, so there is no prediction involved.

Just to clarify, you're telling me that you can accurately measure the load of processes that haven't been written yet, even when the OS is running on CPUs that haven't been invented? You're telling me that you've already done all these measurements and discovered that none of the processes that will ever run on RDOS involve one or more medium priority tasks that run for a reasonably lengthy amount of time? If this is the case, I'd strongly recommend patenting whatever method you use to take measurements from the future.

rdos wrote:
Brendan wrote:
rdos wrote:No, because you could do the same job at the same performance with lower power, less temperature and noise just by selecting optimal P-states. If you want longer battery life or less noise you could just tweak the parameters and select lower P-states that don't generate the same performance. It's not a major redesign, just some parameter changes. However, the default would be to keep performance at the lowest possible power.
You can't do the same job at the same performance with lower power. You can only make a compromise between performance and power. For high priority tasks you want high performance, for low priority tasks you want lower power. Surely you can see that for medium priority tasks you want something in between?
Wrong. Because the relationship between frequency / performance and power consumption is not linear, I can do just that.

Example (from AMD Athlon):
P0 runs at 3000MHz, and consumes 125W
P5 runs at 2000MHz, and consumes 60W

If I have 30% load at P0, that corresponds to 3000/2000 * 30% = 45% load at P5. Now let's say that when the system is idle, it consumes negligible power. In the P0 state, you consume 125W during 30% of the time = 37.5W. In P5, you consume 60W during 45% of the time = 27W. Since the clocks are at a higher frequency when the processor is idle, and the idle time is longer at P0, it will consume more power in P0 also during idle time, so the 10.5W difference is a minimum value.

For the way you measure load, if you have 30% load at P0 then you've got one or more tasks that are constantly blocking/unblocking (e.g. IO bound tasks). Let's assume it's some sort of network service - it blocks until a packet arrives, processes the packet, then sends a reply packet and blocks again. Let's say it's handling 1000 packets per second and (at P0) takes 300 us to handle each packet. Call that "300 us of latency". You switch to P5, and it's still handling 1000 packets per second, but now it takes 450 us to handle each packet. Call that "50% higher latency". Is the performance the same?
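
As a rough illustration of the numbers in this example, here is a minimal sketch. It assumes per-packet processing time scales inversely with clock frequency, ignores queueing, and borrows the 3000 MHz and 2000 MHz figures from the Athlon example quoted above. All names are hypothetical.

```python
def service_time_us(base_us, base_mhz, target_mhz):
    """Per-packet processing time after a frequency change.

    Assumes the work is CPU bound, so time scales inversely with frequency.
    """
    return base_us * base_mhz / target_mhz


packets_per_second = 1000
p0_us = 300.0                                     # per-packet time at P0
p5_us = service_time_us(p0_us, base_mhz=3000, target_mhz=2000)

# Throughput is unchanged as long as the CPU keeps up (load below 100%).
p0_load = packets_per_second * p0_us / 1_000_000
p5_load = packets_per_second * p5_us / 1_000_000

latency_increase = p5_us / p0_us - 1.0

print(f"P0: {p0_us:.0f} us per packet, load {p0_load:.0%}")
print(f"P5: {p5_us:.0f} us per packet, load {p5_load:.0%}")
print(f"Latency increase at equal throughput: {latency_increase:.0%}")
```

Throughput stays at 1000 packets per second in both cases; only the per-packet time changes, which is the distinction the question is getting at.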

rdos wrote:
Brendan wrote:Users don't need to select priorities (although it'd be nice if they could if/when they want to). Software should tell the scheduler what it wants. A thread that's responsible for updating the user interface should be relatively high priority, a thread that does spell checking while the user types could be medium priority, a thread that regenerates search indexes could be low priority. Whoever wrote the code can use reasonable defaults.
Realistically, everyone wants their code to perform well, so you will probably see an inflation in priorities.

Realistically, most people don't want less important work screwing up their interactive threads. For a simple example, you might have a word processor with a medium priority thread doing spell checking, a low priority thread that saves an automatic backup every 5 seconds, and a higher priority thread for the user interface. When the user presses a key you want to switch to the higher priority user interface thread immediately - you don't want to wait for both the automatic backup thread and the spell checker to finish their time slice before the user interface thread gets any CPU time.

For a test, you should be able to have 2000 low priority tasks all doing heavy calculations (or just wasting CPU time in a loop if you like - sooner or later everyone writes a "dummy load" process), and the user shouldn't be able to notice because the GUI and all the applications should be just as fast and responsive as they are when there's nothing else going on. For a bad system (e.g. round robin scheduler with no task priorities) 2000 tasks will cripple the entire OS.
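
A back-of-the-envelope sketch of why this test separates the two designs. It assumes a single CPU, a hypothetical 1 ms time slice, and the worst case in which every ready task ahead of the interactive thread uses its full slice. It is not a model of any particular scheduler.

```python
def worst_case_wait_ms(tasks_ahead, time_slice_ms):
    """Upper bound on the delay before a newly woken thread gets the CPU,
    if each task ahead of it runs for one full time slice."""
    return tasks_ahead * time_slice_ms


BACKGROUND_TASKS = 2000
TIME_SLICE_MS = 1.0  # hypothetical value, chosen only for illustration

# Round robin with no priorities: the UI thread queues behind every task.
round_robin_ms = worst_case_wait_ms(BACKGROUND_TASKS, TIME_SLICE_MS)

# Preemptive priorities: lower priority tasks never run ahead of the UI
# thread, so none of the 2000 background tasks count.
with_priorities_ms = worst_case_wait_ms(0, TIME_SLICE_MS)

print(f"Round robin, no priorities: up to {round_robin_ms:.0f} ms")
print(f"Preemptive priorities:      up to {with_priorities_ms:.0f} ms")
```

Shortening the time slice reduces the round robin figure proportionally, but it still grows linearly with the number of background tasks.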

To prevent "priority inflation", I give each process a maximum limit. When one process starts another process, it sets the new process' maximum limit to anything equal to or lower than its own limit. The OS might have a "virtual screen" process that is limited to "max. priority = 0x30", which starts a GUI process that is limited to "max. priority = 0x40", which starts a text editor which is limited to "max. priority = 0x48". The text editor can spawn a low priority thread (e.g. with "priority = 0xC0") or a relatively high priority thread (e.g. "priority = 0x50"); but if the text editor tries to spawn a very high priority thread (e.g. "priority = 0x20") then the kernel limits the new thread's priority and the thread ends up being "priority = max. priority for process = 0x48". Of course something like this won't work well if the OS only has 3 priorities (high, medium or low) - you'd run out of priorities and end up with all your applications running as "low priority".
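
The limit scheme described above can be sketched roughly as follows. This is only an illustration, not the actual kernel code: the `Process` class and its methods are hypothetical, and it assumes (as the example values suggest) that numerically lower values mean higher priority.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    max_priority: int  # most urgent (numerically lowest) priority allowed

    def spawn_process(self, name, requested_limit):
        """Start a child whose limit is equal to or lower than the parent's."""
        # max() keeps the less urgent of the two values.
        return Process(name, max(requested_limit, self.max_priority))

    def thread_priority(self, requested):
        """Priority a new thread actually gets, clamped to the process limit."""
        return max(requested, self.max_priority)


virtual_screen = Process("virtual screen", 0x30)
gui = virtual_screen.spawn_process("GUI", 0x40)
editor = gui.spawn_process("text editor", 0x48)

for requested in (0xC0, 0x50, 0x20):
    granted = editor.thread_priority(requested)
    print(f"requested 0x{requested:02X} -> granted 0x{granted:02X}")
```

With the limits configured here, the requests for 0xC0 and 0x50 are granted unchanged, while the request for 0x20 is clamped to the text editor's own limit of 0x48.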

Note: It doesn't matter much what the threads are. Each process always has some sort of "user" (even if the "user" is another process or another computer on the network, and not a human), and as soon as you start looking at multi-threaded processes (necessary for processes that take advantage of multi-CPU) you start wanting to use different priorities for threads that do different work in the same process.

Cheers,

Brendan

rdos · Post by **rdos** » Mon Jan 09, 2012 1:12 pm

Brendan wrote:Just to clarify, you're telling me that you can accurately measure the load of processes that haven't been written yet, even when the OS is running on CPUs that haven't been invented? You're telling me that you've already done all these measurements and discovered that none of the processes that will ever run on RDOS involve one or more medium priority tasks that run for a reasonably lengthy amount of time? If this is the case, I'd strongly recommend patenting whatever method you use to take measurements from the future.

Measure as in measure in realtime. I don't calculate anything in advance; I measure load as 100 * (1 - ticks in null thread / total ticks for core). I do this between 4 and 10 times (not yet decided) per second, and adjust P-state one step up or down based on load. As we have already discussed a page back or so.
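
A minimal sketch of that measurement loop, for illustration only. The threshold values (60% and 40%), the function names, and the sample data are assumptions and not taken from RDOS. P0 is treated as the fastest state, with higher indexes being slower.

```python
def load_percent(null_ticks, total_ticks):
    """Load for one core over a sampling interval, as a percentage."""
    if total_ticks <= 0:
        raise ValueError("total_ticks must be positive")
    return 100.0 * (1.0 - null_ticks / total_ticks)


def next_pstate(current, load, slowest, raise_above=60.0, lower_below=40.0):
    """Move at most one P-state per sample; P0 is fastest, `slowest` the floor."""
    if load > raise_above and current > 0:
        return current - 1          # busy: step towards P0
    if load < lower_below and current < slowest:
        return current + 1          # mostly idle: step towards lower power
    return current


# Hypothetical (null thread ticks, total ticks) samples, one per interval.
samples = [(100, 1000), (200, 1000), (500, 1000), (900, 1000)]

pstate = 5
for null_ticks, total_ticks in samples:
    load = load_percent(null_ticks, total_ticks)
    pstate = next_pstate(pstate, load, slowest=5)
    print(f"load {load:5.1f}% -> P{pstate}")
```

The gap between the two thresholds acts as hysteresis, so a load hovering near the target does not make the P-state flip on every sample.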

Brendan wrote:For the way you measure load, if you have 30% load at P0 then you've got one or more tasks that are constantly blocking/unblocking (e.g. IO bound tasks). Let's assume it's some sort of network service - it blocks until a packet arrives, processes the packet, then sends a reply packet and blocks again. Let's say it's handling 1000 packets per second and (at P0) takes 300 us to handle each packet. Call that "300 us of latency". You switch to P5, and it's still handling 1000 packets per second, but now it takes 450 us to handle each packet. Call that "50% higher latency". Is the performance the same?

The network server thread is an excellent example. As long as the network server thread can serve the packets at the rate they are coming in, there is no problem. Another example could be the animation thread. It draws a picture and pauses until a predefined time. As long as it doesn't take longer to draw the picture than the time the whole process (drawing + wait) should take, it performs as it should. A similar scenario is a thread that loads audio data to a speaker. As long as it can provide the data in real time, it is ok. When these threads start to become overloaded (and thus malfunction), processor load starts increasing towards 100%. That's why the policy is to keep load around 50%, so you have some room for sudden, extra load.

Brendan wrote:Realistically, most people don't want less important work screwing up their interactive threads. For a simple example, you might have a word processor with a medium priority thread doing spell checking, a low priority thread that saves an automatic backup every 5 seconds, and a higher priority thread for the user interface. When the user presses a key you want to switch to the higher priority user interface thread immediately - you don't want to wait for both the automatic backup thread and the spell checker to finish their time slice before the user interface thread gets any CPU time.

That's only an issue with the typical time-slice length of operating systems like Windows. You shouldn't need to fiddle with priority in order to get threads to run as they should. I have several test programs that take up 100% of the processor time, and if I start one or two of these in the background, it is not noticeable in an interactive program other than the processor seemingly running at a lower performance level than it should. If you need to change priorities dynamically in order to achieve this, you have chosen the wrong time-slice length.

Besides, it is perfectly possible for the guy writing the word processor to stop the spell checker and / or backup thread if user-experience gets better if these don't run at full speed as the GUI is redrawn. Changing their priorities is just the lazy way out.

Brendan wrote:For a test, you should be able to have 2000 low priority tasks all doing heavy calculations (or just wasting CPU time in a loop if you like - sooner or later everyone writes a "dummy load" process), and the user shouldn't be able to notice because the GUI and all the applications should be just as fast and responsive as they are when there's nothing else going on. For a bad system (e.g. round robin scheduler with no task priorities) 2000 tasks will cripple the entire OS.

I don't support creating 2000 tasks.

Brendan · Post by **Brendan** » Mon Jan 09, 2012 8:08 pm

Hi,

rdos wrote:
Brendan wrote:For the way you measure load, if you have 30% load at P0 then you've got one or more tasks that are constantly blocking/unblocking (e.g. IO bound tasks). Let's assume it's some sort of network service - it blocks until a packet arrives, processes the packet, then sends a reply packet and blocks again. Let's say it's handling 1000 packets per second and (at P0) takes 300 us to handle each packet. Call that "300 us of latency". You switch to P5, and it's still handling 1000 packets per second, but now it takes 450 us to handle each packet. Call that "50% higher latency". Is the performance the same?
The network server thread is an excellent example. As long as the network server thread can serve the packets at the rate they are coming in, there is no problem.

That depends on the protocol. For protocols like HTTP, TFTP, Telnet, etc., higher latency means slower.

In general, "performance" means 2 things - amount of work done in a given length of time (throughput) and the amount of time to get a response (latency). Throughput is more important than latency for non-interactive loads (e.g. a compiler), and latency is much more important than throughput for interactive loads (e.g. web server, user interfaces). You can increase latency by 50% and say that throughput is the same, or that the increase in latency is an acceptable compromise; but you can't increase latency by 50% and say that performance is the same.

rdos wrote:
Brendan wrote:Realistically, most people don't want less important work screwing up their interactive threads. For a simple example, you might have a word processor with a medium priority thread doing spell checking, a low priority thread that saves an automatic backup every 5 seconds, and a higher priority thread for the user interface. When the user presses a key you want to switch to the higher priority user interface thread immediately - you don't want to wait for both the automatic backup thread and the spell checker to finish their time slice before the user interface thread gets any CPU time.
That's only an issue with the typical time-slice length of operating systems like Windows. You shouldn't need to fiddle with priority in order to get threads to run as they should. I have several test programs that take up 100% of the processor time, and if I start one or two of these in the background, it is not noticeable in an interactive program other than the processor seemingly running at a lower performance level than it should. If you need to change priorities dynamically in order to achieve this, you have chosen the wrong time-slice length.

If you start one or 2 of these dummy load tasks, it's "not noticeable" except "the processor seems to be running at a lower performance level than it should"? If you smash your testicles with a hammer, it won't hurt at all (except for the excruciating pain).

rdos wrote:Besides, it is perfectly possible for the guy writing the word processor to stop the spell checker and / or backup thread if user-experience gets better if these don't run at full speed as the GUI is redrawn. Changing their priorities is just the lazy way out.

Don't be silly. The guy writing the word processor decides to stop the spell checker thread to improve the GUI's speed, but doesn't know that the spell checker thread was running on a different CPU and doing no harm, or that the main problem with his GUI is a thread that belongs to a completely different process that is hogging all the CPU time.

Unless the scheduler knows the relative importance of different threads, it can't make sure that the most important work is being done. Nothing else can make sure of that either (and even if a process could, it shouldn't have to monitor the activity of every other process just to work around the OS's design flaws). The OS ends up doing less important work when there's more important work that it should be doing.

rdos wrote:
Brendan wrote:For a test, you should be able to have 2000 low priority tasks all doing heavy calculations (or just wasting CPU time in a loop if you like - sooner or later everyone writes a "dummy load" process), and the user shouldn't be able to notice because the GUI and all the applications should be just as fast and responsive as they are when there's nothing else going on. For a bad system (e.g. round robin scheduler with no task priorities) 2000 tasks will cripple the entire OS.
I don't support creating 2000 tasks.

Why? Is it another case of "not enough confidence to dare doing anything that might expand the range of uses the OS has"?

Cheers,

Brendan

rdos · Post by **rdos** » Tue Jan 10, 2012 2:12 am

Brendan wrote:That depends on the protocol. For protocols like HTTP, TFTP, Telnet, etc higher latency means slower.

To some degree this can be true. It depends. If it is some automatic transfer, nobody cares as long as it completes in "reasonable time".

Brendan wrote:In general, "performance" means 2 things - amount of work done in a given length of time (throughput) and the amount of time to get a response (latency). Throughput is more important than latency for non-interactive loads (e.g. a compiler), and latency is much more important than throughput for interactive loads (e.g. web server, user interfaces). You can increase latency by 50% and say that throughput is the same, or that the increase in latency is an acceptable compromise; but you can't increase latency by 50% and say that performance is the same.

In general, performance in a typical embedded system means that things get done in the timeframe expected, and that the user perceives the system as responsive. Often, neither throughput nor latency means anything, as in the examples I gave earlier with animations and sound. The only thing that means anything is whether these things are complete in the expected timeframe. It is no advantage whatsoever if they complete before they are expected to complete, as this only adds idle time to the processor. Many things in an embedded application / event-driven system are like this.

Combuster · Post by **Combuster** » Tue Jan 10, 2012 7:46 am

rdos wrote:the user perceives the system as responsive (...) latency doesn't mean anything

Can't you just read and GTFO for once? This forum can not use people like you that claim they are better yet make the most errors and blatantly refuse to accept criticism.

Worse, you're setting bad examples for all people who are not aware of the truth and we keep cleaning up after you. I expect that your posts are 90% noise for that very reason.

rdos · Post by **rdos** » Tue Jan 10, 2012 8:40 am

Truth? What truth? Do you really believe there is a truth about how to design an OS?

Solar · Post by **Solar** » Tue Jan 10, 2012 8:52 am

There is a truth in the usage of the term "performance"...

Combuster · Post by **Combuster** » Tue Jan 10, 2012 9:17 am

I do, and I have a university degree that says I have mastered design theory.

I suppose you're not a believer?

rdos · Post by **rdos** » Tue Jan 10, 2012 9:42 am

Combuster wrote:I do, and I have a university degree that says I have mastered design theory.

OS design theory? Then you should have learned that there is almost always more than one way to do everything. Or was the degree sponsored by Microsoft?

rdos · Post by **rdos** » Tue Jan 10, 2012 9:48 am

Solar wrote:There is a truth in the usage of the term "performance"...

There is? You don't suppose that performance can be measured in another way than in instructions per second then?

OSDev.org
