True scheduler

Discussions on more advanced topics such as monolithic vs micro-kernels, transactional memory models, and paging vs segmentation should go here. Use this forum to expand and improve the wiki!
Selenic
Member
Member
Posts: 123
Joined: Sat Jan 23, 2010 2:56 pm

Re: True scheduler

Post by Selenic »

@Brendan: The 'fair' scheduling you're talking about is a degenerate case of how I'd implement a scheduler myself - have multiple priority levels where a level is given time according to some policy, but within each level it is run in round-robin fashion. The policy could be 'only run a level if there is nothing above it'*, but there are other more complex ones that would probably work better.

* Which is potentially a Bad Idea because any non-IO-bound high-priority thread will break that system, which is why I'd implement the priorities given by the user or process as a guide, but lowering the priority of non-IO-bound threads (and increasing that of IO-bound and interactive threads) anyway.
wolfram
Posts: 19
Joined: Wed Aug 19, 2009 11:37 am
Location: Saint-Petersburg, Russia

Re: True scheduler

Post by wolfram »

Love4Boobies wrote:as far as disks are concerned, I'd go for "amount written" instead of speed.
I guess it would be very useless way. How counting of "amount written" can help to provide fair schedulling? Nobody cares about how many a task has read/written from/to disk. How often task reads — that is important.
geomagas wrote:But not "What? You want the CPU? Well, let me see... You have already read 100G from disk and written 20G, so... no CPU for you!". What if the process changed behaviour and now wants to do an intensive calculation on the 100G it fetched from disk? And what if... the process waiting list is empty?
Of course it is absurd. My idea is about counting average reading/writing speed.

AVERAGE SPEED.I don't understand why majority has thought about size of read/wrote data.
aeritharcanum wrote:About your Windows assertion: I have never seen this information on processes' I/O usage while using Windows.It may or may not be there.
I'm more than sure you have never used ACLs in Windows. But it doesn't mean that they are not exist.
Just open Task Manager, open "Processes" tab, go to View menu, click "Choose columns" and choose what you want to see.(I've got russian version of Windows, that's why I could translate some names wrong).
aeritharcanum wrote:Regardless, since when has Windows been a paradigm for good design?
Sinse the win32 subsystem has been based on NT kernel, which based on VMS ideas.


Idea of I/O speed based scheduller can be more complex. Why not to enter concept of "guaranteed speed", for example?
We'r giving guaranteed I/O speed to some task on some disk. If the task don't write to the disk, all other tasks can use the disk freely. When the task want to write to the disk, disk dirver(or disk server if to speak about ukernels) looks to his queue. If there is another requests for writing, it looks, if the task has guaranteed speed and guarantees are satisfied. If not — disk driver doesn't pay attention on other requests and try to satisfy guarantees of the task. The other tasks haven't access to the disk, while guarantees will not be satisfied.
Of course the task can exceed its guaranteed speed, if there are not other requests in the queue.
aeritharcanum
Posts: 13
Joined: Sun Mar 07, 2010 3:17 pm

Re: True scheduler

Post by aeritharcanum »

wolfram wrote:
aeritharcanum wrote:About your Windows assertion: I have never seen this information on processes' I/O usage while using Windows.It may or may not be there.
I'm more than sure you have never used ACLs in Windows. But it doesn't mean that they are not exist.
Just open Task Manager, open "Processes" tab, go to View menu, click "Choose columns" and choose what you want to see.(I've got russian version of Windows, that's why I could translate some names wrong).
aeritharcanum wrote:Regardless, since when has Windows been a paradigm for good design?
Sinse the win32 subsystem has been based on NT kernel, which based on VMS ideas.
That was random...
Selenic
Member
Member
Posts: 123
Joined: Sat Jan 23, 2010 2:56 pm

Re: True scheduler

Post by Selenic »

wolfram wrote:My idea is about counting average reading/writing speed.

AVERAGE SPEED.I don't understand why majority has thought about size of read/wrote data.
Because (bytes/sec)*sec = bytes?
In other words, because over a fixed time period, average and total are (by definition) proportional to each other. So anything that writes a lot of data will have a higher average *and total* than anything that doesn't and hasn't just been started.
wolfram wrote:Idea of I/O speed based scheduller can be more complex. Why not to enter concept of "guaranteed speed", for example?
I think the reason for this is that, without a real-time CPU scheduler as well, it's largely useless. For example, media servers will have that, but many of them run special real-time OSes which are designed to deal with the type of loads they handle.

Even then, I think a variation of what you said is more common: "I need this data before this deadline" (so you can use EDF on the disk as well as the processor), which is the guarantee which is actually needed by such programs.
User avatar
geomagas
Posts: 12
Joined: Wed Feb 27, 2008 6:01 am
Location: Iraklion,Crete,Greece
Contact:

Re: True scheduler

Post by geomagas »

Hi,
Brendan wrote:The main idea is that important things don't wait for unimportant things to complete
Right. I couldn't agree more. If I had the authority I would name this "The first law of scheduling". :D

However, the hard part is deciding what is important and what is not. Do you let the programmer (the maker) decide? The user (the one who executes it)? Or the system? (the pros/cons of each approach are obvious) And if the latter, based on what info? If we're talking about managed code, the system could have one or two (or several) hints of what a thread would do, before it is run. But in precompiled situations, the only way is monitoring the behaviour, maybe even keeping historical data and punishing/rewarding threads accordingly. Also, this applies to priorities on non-realtime systems (ie in absence of deadlines). In realtime applications, only the application designer can decide.
Selenic wrote:
wolfram wrote:My idea is about counting average reading/writing speed.

AVERAGE SPEED.I don't understand why majority has thought about size of read/wrote data.
Because (bytes/sec)*sec = bytes?
I was about to answer the same thing, both because average is proportional to total and because bytes is proportional to speed(given a constant-speed device). But then I thought that this wouldn't always be the case.

Consider a system with a filesystem that mounts several devices. Two threads constantly access the f/s with equal frequency, asking for the same amount of data each time. The only difference is that thread A constantly asks for the same file that resides on the same device over and over, while thread B asks for different data each time, that could reside on any of the devices. Because different devices have different access speeds, the two threads could have different average I/O speeds while having the same average (and total) bytes read. Moreover, if every device maintains a cache, thread A will potentially be doing much faster I/O even if the file it ts asking is on a very slow device.

So he might have a point there (distinguishing avreage speed from total bytes, i mean).

However, if one tried to apply Brendan's law to the above example, they just couldn't. Using either speed or amount of data, no conclusion of importance can be extracted and thus no scheduling decision (for the CPU) can be made.

So @wolfram...
wolfram wrote:
geomagas wrote:But not "What? You want the CPU? Well, let me see... You have already read 100G from disk and written 20G, so... no CPU for you!". What if the process changed behaviour and now wants to do an intensive calculation on the 100G it fetched from disk? And what if... the process waiting list is empty?
Of course it is absurd. My idea is about counting average reading/writing speed.
You surely understand that replacing amounts_of_data above with my_average_io_speed_metric doesn't make it less absurd.

Regards
I was... delayed... - Gandalf
Craze Frog
Member
Member
Posts: 368
Joined: Sun Sep 23, 2007 4:52 am

Re: True scheduler

Post by Craze Frog »

What I'm suggesting here is that "fair" is stupid
You're right. The whole point of schedulers it to not be fair. If the world was really fair we would run tasks to completion on a first-come-first-serve basis.
User avatar
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: True scheduler

Post by Brendan »

Hi,
Craze Frog wrote:
What I'm suggesting here is that "fair" is stupid
You're right. The whole point of schedulers it to not be fair. If the world was really fair we would run tasks to completion on a first-come-first-serve basis.
For certain types of tasks (anything that doesn't involved networking or users?), running tasks until completion does make a lot more sense (less overhead and less "half finished" jobs, and a lot better "average time until job completion").

It's an idea I played with a while ago, that's still rolling around in the back of my mind. ;)


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Post Reply