True scheduler

Selenic · Post by **Selenic** » Sat Mar 20, 2010 6:27 am

@Brendan: The 'fair' scheduling you're talking about is a degenerate case of how I'd implement a scheduler myself - have multiple priority levels where a level is given time according to some policy, but within each level it is run in round-robin fashion. The policy could be 'only run a level if there is nothing above it'*, but there are other more complex ones that would probably work better.

* Which is potentially a Bad Idea because any non-IO-bound high-priority thread will break that system, which is why I'd implement the priorities given by the user or process as a guide, but lowering the priority of non-IO-bound threads (and increasing that of IO-bound and interactive threads) anyway.

wolfram · Post by **wolfram** » Sat Mar 20, 2010 3:24 pm

Love4Boobies wrote:as far as disks are concerned, I'd go for "amount written" instead of speed.

I guess it would be very useless way. How counting of "amount written" can help to provide fair schedulling? Nobody cares about how many a task has read/written from/to disk. How often task reads — that is important.

geomagas wrote:But not "What? You want the CPU? Well, let me see... You have already read 100G from disk and written 20G, so... no CPU for you!". What if the process changed behaviour and now wants to do an intensive calculation on the 100G it fetched from disk? And what if... the process waiting list is empty?

Of course it is absurd. My idea is about counting average reading/writing speed.

AVERAGE SPEED.I don't understand why majority has thought about size of read/wrote data.

aeritharcanum wrote:About your Windows assertion: I have never seen this information on processes' I/O usage while using Windows.It may or may not be there.

I'm more than sure you have never used ACLs in Windows. But it doesn't mean that they are not exist.
Just open Task Manager, open "Processes" tab, go to View menu, click "Choose columns" and choose what you want to see.(I've got russian version of Windows, that's why I could translate some names wrong).

aeritharcanum wrote:Regardless, since when has Windows been a paradigm for good design?

Sinse the win32 subsystem has been based on NT kernel, which based on VMS ideas.

Idea of I/O speed based scheduller can be more complex. Why not to enter concept of "guaranteed speed", for example?
We'r giving guaranteed I/O speed to some task on some disk. If the task don't write to the disk, all other tasks can use the disk freely. When the task want to write to the disk, disk dirver(or disk server if to speak about ukernels) looks to his queue. If there is another requests for writing, it looks, if the task has guaranteed speed and guarantees are satisfied. If not — disk driver doesn't pay attention on other requests and try to satisfy guarantees of the task. The other tasks haven't access to the disk, while guarantees will not be satisfied.
Of course the task can exceed its guaranteed speed, if there are not other requests in the queue.

aeritharcanum · Post by **aeritharcanum** » Sat Mar 20, 2010 5:22 pm

wolfram wrote:
aeritharcanum wrote:About your Windows assertion: I have never seen this information on processes' I/O usage while using Windows.It may or may not be there.
I'm more than sure you have never used ACLs in Windows. But it doesn't mean that they are not exist.
Just open Task Manager, open "Processes" tab, go to View menu, click "Choose columns" and choose what you want to see.(I've got russian version of Windows, that's why I could translate some names wrong).

aeritharcanum wrote:Regardless, since when has Windows been a paradigm for good design?
Sinse the win32 subsystem has been based on NT kernel, which based on VMS ideas.

That was random...

Selenic · Post by **Selenic** » Sun Mar 21, 2010 6:30 am

wolfram wrote:My idea is about counting average reading/writing speed.

AVERAGE SPEED.I don't understand why majority has thought about size of read/wrote data.

Because (bytes/sec)*sec = bytes?
In other words, because over a fixed time period, average and total are (by definition) proportional to each other. So anything that writes a lot of data will have a higher average *and total* than anything that doesn't and hasn't just been started.

wolfram wrote:Idea of I/O speed based scheduller can be more complex. Why not to enter concept of "guaranteed speed", for example?

I think the reason for this is that, without a real-time CPU scheduler as well, it's largely useless. For example, media servers will have that, but many of them run special real-time OSes which are designed to deal with the type of loads they handle.

Even then, I think a variation of what you said is more common: "I need this data before this deadline" (so you can use EDF on the disk as well as the processor), which is the guarantee which is actually needed by such programs.

geomagas · Post by **geomagas** » Mon Mar 22, 2010 3:17 am

Hi,

Brendan wrote:The main idea is that important things don't wait for unimportant things to complete

Right. I couldn't agree more. If I had the authority I would name this "The first law of scheduling".

However, the hard part is deciding what is important and what is not. Do you let the programmer (the maker) decide? The user (the one who executes it)? Or the system? (the pros/cons of each approach are obvious) And if the latter, based on what info? If we're talking about managed code, the system could have one or two (or several) hints of what a thread would do, before it is run. But in precompiled situations, the only way is monitoring the behaviour, maybe even keeping historical data and punishing/rewarding threads accordingly. Also, this applies to priorities on non-realtime systems (ie in absence of deadlines). In realtime applications, only the application designer can decide.

Selenic wrote:
wolfram wrote:My idea is about counting average reading/writing speed.

AVERAGE SPEED.I don't understand why majority has thought about size of read/wrote data.
Because (bytes/sec)*sec = bytes?

I was about to answer the same thing, both because average is proportional to total and because bytes is proportional to speed(given a constant-speed device). But then I thought that this wouldn't always be the case.

Consider a system with a filesystem that mounts several devices. Two threads constantly access the f/s with equal frequency, asking for the same amount of data each time. The only difference is that thread A constantly asks for the same file that resides on the same device over and over, while thread B asks for different data each time, that could reside on any of the devices. Because different devices have different access speeds, the two threads could have different average I/O speeds while having the same average (and total) bytes read. Moreover, if every device maintains a cache, thread A will potentially be doing much faster I/O even if the file it ts asking is on a very slow device.

So he might have a point there (distinguishing avreage speed from total bytes, i mean).

However, if one tried to apply Brendan's law to the above example, they just couldn't. Using either speed or amount of data, no conclusion of importance can be extracted and thus no scheduling decision (for the CPU) can be made.

So @wolfram...

wolfram wrote:
geomagas wrote:But not "What? You want the CPU? Well, let me see... You have already read 100G from disk and written 20G, so... no CPU for you!". What if the process changed behaviour and now wants to do an intensive calculation on the 100G it fetched from disk? And what if... the process waiting list is empty?
Of course it is absurd. My idea is about counting average reading/writing speed.

You surely understand that replacing amounts_of_data above with my_average_io_speed_metric doesn't make it less absurd.

Regards

Craze Frog · Post by **Craze Frog** » Wed Mar 24, 2010 1:01 pm

What I'm suggesting here is that "fair" is stupid

You're right. The whole point of schedulers it to not be fair. If the world was really fair we would run tasks to completion on a first-come-first-serve basis.

Brendan · Post by **Brendan** » Wed Mar 24, 2010 3:22 pm

Hi,

Craze Frog wrote:
What I'm suggesting here is that "fair" is stupid
You're right. The whole point of schedulers it to not be fair. If the world was really fair we would run tasks to completion on a first-come-first-serve basis.

For certain types of tasks (anything that doesn't involved networking or users?), running tasks until completion does make a lot more sense (less overhead and less "half finished" jobs, and a lot better "average time until job completion").

It's an idea I played with a while ago, that's still rolling around in the back of my mind.

Cheers,

Brendan

OSDev.org

True scheduler

Re: True scheduler

Re: True scheduler

Re: True scheduler

Re: True scheduler

Re: True scheduler

Re: True scheduler

Re: True scheduler