Batch Processing?
Hi,
I've been thinking about scheduling again, and had a strange thought....
If it takes you 1 hour to paint a fence, how long would it take to paint 3 similar fences?
If you do one fence at a time, then you'd finish the first fence after 1 hour, the second fence after 2 hours and the third fence after 3 hours. The average amount of time between knowing the fences need painting and completing a fence works out to 2 hours.
Imagine if you did 10% of the first fence, 10% of the next fence, 10% of the third fence, then went back to the first fence and did the next 10% (and so on until all fences were painted). In this case the first fence would be done after 2 hours and 48 minutes, the second fence would be done after 2 hours and 54 minutes and the third fence would be done after 3 hours. The average amount of time between knowing the fences need painting and completing a fence works out to 2 hours and 54 minutes, but...
It'd actually take longer. Switching from one fence to another costs time (cleaning your brushes, travelling time, etc). If it costs 1 minute to switch between fences, then it'd take 29 minutes longer to complete all fences. The first fence would be done after 3 hours and 15 minutes, the second fence would be done after 3 hours and 22 minutes and the third fence would be done after 3 hours and 29 minutes. The average amount of time between knowing the fences need painting and completing a fence works out to 3 hours and 22 minutes - it's 82 minutes slower than doing the fences one at a time.
Then there's resources. If the painter does one fence at a time he can buy the paint for one fence, paint the fence, then get his money back from the customer before buying the paint for the next fence. If he was doing small parts of each fence in turn, then he'd need enough money to buy all the paint for all the fences - for 3 fences he'd need 3 times as much money to do it.
Doing 10% of each fence increases the average time before a fence is completed and uses more resources.
Modern schedulers pretend to run multiple threads at the same time by giving threads time slices and switching to a different thread if the entire time slice is used. This is perfect for some things, but (just like the painter doing 10% of each fence) this would increase the average time a task takes to complete and use more resources.
If you're compiling 10 large files, is it better to use 100 MB of memory and after 11 seconds have 10 half compiled files, or is it better to use 20 MB of memory and after 10 seconds have 5 compiled files and 5 files that haven't been started yet?
I'm thinking of having a special "batch processing" thread type, which tries to get jobs that take a finite amount of time completed in the least average time.
Basically (for threads of this type at the same priority) the scheduler would have a FIFO queue. The first thread on the queue that can run will run until it completes or blocks, and if a thread closer to the start of the queue unblocks it pre-empts other threads in the queue.
Any ideas or comments?
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
The difference with the painter is that his fences don't require much attention - it's fine to leave one for a moment and come back.
But on a computer, if you just run one thread and wait until it finishes, every other thread is affected because they don't get to run.
What you're proposing sounds a lot like cooperative multitasking, where a thread runs until it's done, then turns control over to the next thread.
The problem with this (as seen in the days of Windows 3.1) is that a thread can decide it wants to do a lot and simply never give control back to the other threads.
This could also happen by accident - ie a program enters an infinite loop - what then? Under your system, it would simply continue to run without giving other threads any time to run.
That's why pre-emptive multitasking was invented.
Daedalus wrote:
The difference with the painter is that his fences don't require much attention - it's fine to leave one for a moment and come back. But on a computer, if you just run one thread and wait until it finishes, every other thread is affected because they don't get to run.

Consider it like a normal/standard scheduler with the addition of a "batch processing" thread type, rather than a scheduler that has a "batch processing" thread type and nothing else.
Basically, batch processing would be an option that threads could use if appropriate.
The scheme I'm considering is to have several different scheduling policies, something like this:
The scheduler would always try to run threads from the highest policy (i.e. if there are any threads in Policy 0 then threads in other policies won't get CPU time, and threads in Policy 3 won't get CPU time unless there are no threads in other policies).
- Policy 0 - Co-operative scheduling for device drivers (e.g. IRQ handlers), where the highest priority task in this policy gets all CPU time (but where the scheduler monitors time used and terminates the thread if it uses more than N ms).
- Policy 1a - High priority variable time slice for normal high priority threads, where all threads within this policy get N ms (where N depends on the thread's priority) and threads take turns.
- Policy 1b - High priority batch processing, where the oldest thread that can run is run and gets all CPU time (but where the scheduler monitors time used and terminates a thread if it uses a total of more than N ms).
- Policy 2a - Normal priority variable time slice for normal threads, where all threads within this policy get N ms (where N depends on the thread's priority) and threads take turns.
- Policy 2b - Normal priority batch processing, where the oldest thread that can run is run and gets all CPU time (but where the scheduler monitors time used and terminates a thread if it uses a total of more than N ms).
- Policy 3a - Low priority variable time slice, where all threads within this policy get N ms (where N depends on the thread's priority) and threads take turns.
- Policy 3b - Low priority batch processing, where the oldest thread that can run is run and gets all CPU time (and the scheduler doesn't monitor time used).
Policy 0 would have 256 priorities. For other policies there'd be 255 priorities, and batch processing would only be done if there's no variable time slicing threads to run.
Time slice lengths would be different for different policies. For example, in policy 1a the threads might get "(256 - priority) * 100 µs" of time (priority 0 threads would get 25.6 ms while priority 255 threads would only get 0.1 ms). For policy 3a the time slices would be larger, and the threads might get "(256 - priority) * 10 ms" of time (priority 0 threads would get 2560 ms while priority 255 threads would only get 10 ms).
The same applies to batch processing threads. For policy 1b a thread might be terminated if it uses more than 50 ms, while for policy 2b a thread might be terminated if it uses more than 3 seconds.
Cheers,
Brendan
Brendan wrote:
If you're compiling 10 large files, is it better to use 100 MB of memory and after 11 seconds have 10 half compiled files, or is it better to use 20 MB of memory and after 10 seconds have 5 compiled files and 5 files that haven't been started yet?

Good point.
Is it better (from the end user's point of view) for a web server to deliver half a web page to ten users or a whole web page to five users and nothing to another five?
You can't do a lot with half a web page.
Regards,
John.
Brendan wrote:
Consider it like a normal/standard scheduler with the addition of a "batch processing" thread type, rather than a scheduler that has a "batch processing" thread type and nothing else. [...]

From what I've understood, you suggest that the current batch process is like a real-time process (it doesn't get disturbed), but you don't leave it until it finishes - only until its time runs out.
The point is that the fence is not a perfect analogy for a process. The user doesn't care how the system works; he only cares whether his computer can finish his requests (in a short time). When you give a task real-time treatment for an unbounded time, the user suffers. When you give too small a time slice, it is equivalent to normal task switching. You should consider this when implementing your idea.
And now about the fence: in a single second, so many things are happening in the system that the user can't keep track of them. He only sees the general picture, not the details - and I'm talking about one second. It is not like the fence, because the fence will be done after three hours. And the processor executes so many instructions in a second... you can't compare the times.
What I was trying to say is that I think the user will not see a difference - or if he does see a difference, it would be when he's doing something complex.
I'm talking from the user's point of view, because I don't have the knowledge and experience to talk from the osdev point of view. Sorry if I didn't understand the topic.
I think I have problems with Bochs. The biggest one: Bochs hates me!
Combuster:
Candy wrote:
Isn't that just EDF with RR?

I can't really consider it a form of earliest deadline scheduling: that would imply scheduling the shorter tasks first. Instead they are processed FIFO.
Brendan wrote:
...batch processing, where the oldest thread that can run is run and gets all CPU time (but where the scheduler monitors time used and terminates a thread if it uses a total of more than N ms).

Regarding the concept, I find it creative. I do have some sidenotes, though (I don't know enough of your design to know all of the contributing factors - actually, I think you'd have considered them already):
1: How do you plan to have threads identify themselves as batch or concurrent? Doing so in code would be best, but requires changes when porting existing programs (binutils, anyone?); doing so in the running program might result in marking threads as batch that are going to exceed the execution time, or in incorrect categorizing in general.
2: How do you intend to schedule single large operations that will exceed the execution limits (things like running computer vision algorithms, or computing the poster rendering of the Linux kernel)? And, as in #1, where do you want to regulate that?
3: What if... the painter brought his friends? Are you going to give each painter a fence or do you have them work together on one?
Brendan wrote:
Policy 0 - Co-operative scheduling for device drivers (e.g. IRQ handlers), where the highest priority task in this policy gets all CPU time (but where the scheduler monitors time used and terminates the thread if it uses more than N ms).

I think some care should be taken here. Depending on the situation, a crashed driver can cause serious trouble. Would a lock-up result in a panic, or do you have something more intelligent in mind?
You will still have the problem of processes hogging resources, as with fully cooperative multitasking - unless you restrict this scheme to trusted processes. If a program can switch itself to this scheme programmatically, that could be exploited by trojans.
@INF1n1t: What Brendan is saying is that it takes time to switch between tasks, and of course the end user is going to choose the system that is reliable (well, a sane user will) and takes the least time to achieve their task.
Microsoft: "let everyone run after us. We'll just INNOV~1"
Colonel Kernel:
I don't see what real-world problem this scheme is solving. Either a process is important at any given time, or it isn't. A background task that gets preempted by a higher priority task is no longer the most important thing going on in the system (this is what priority means, after all). If you pay some context switching overhead as a result, then so be it.
As for the compilation example, I think Combuster asked the right question -- what if the painter had friends? If you want a background task to run faster, find ways to parallelize it. We'll be drowning in cores in a few years.
Top three reasons why my OS project died:
- Too much overtime at work
- Got married
- My brain got stuck in an infinite loop while trying to design the memory manager
Note: when I say process below, I really mean a thread, since you might want several threads with different types of scheduling in the same program.
I think there's essentially four types of processes: real-time (audio, video, animations), interactive (like a spreadsheet or web-browser), service (http server, database), and batch (search cache update, virus scan, mp3 compression). Now I'd make it a goal to try to give each of these sane scheduling, while keeping the system responsive, and avoiding excessive manual configuration, since nobody wants to mess with priorities unless they have to.
I think the scheduling rules should mostly be: real-time always has priority, unless a task is detected to be using all the CPU, in which case it can't possibly be real-time. You want smooth audio and video; if an animation effect needs to wait for some calculation that isn't directly visible on the screen, it makes it look like the system is overloaded.
With real-time stuff running smoothly, interactive should be given CPU burst after user interaction, and focused window should have some preference over the rest (since that's what user most likely cares the most about). This way the system would supposedly feel more responsive to the user. This obviously needs a feed-back path from GUI management to scheduler, though at least Linux attempts to detect interactive processes with heuristics instead.
Then there's services, which are probably best handled by using longer time-shares, since most of them are IO bound anyway, and if all the users are on the other side of a slow network anyway, bandwidth is more important than latency, at least most of the time. So the target here would be to attempt to keep all services in IO wait as much as possible. I guess these can be statically detected, either by having support for "services" like in Windows, or just by having them tell "hey, I'm a server, gimme service-scheduling." Heuristics like "doesn't have windows on screen, and listens to network connections" would also probably work..
Batch jobs you could handle by having just a few of them compete for CPU time at the same time, and they should probably only run when everything else is idle (or at least get only a little CPU if we're running at 100%). I think this is kinda like services: either the process or the user should tell that it's a batch job. These could have a priority system completely separate from scheduling priorities: just select in which order they should be started. I'm also going to claim that if you want to give a batch process more priority than an interactive (or even service) then it's probably not a true batch process.
Now the most interesting problem I find is how to automatically figure out what processes need real-time scheduling, and what processes would prefer interactive scheduling. You could have a "hard" real-time class which is set by a user (or the process) explicitly, but you might want to detect stuff like animations and audio automatically, and schedule those as "soft" (as in, prefer the explicit "hard" real-times over "soft", but "soft" over "interactive") but otherwise using real-time rules, while giving interactives varying timeshares under the "soft" class. If a "soft" class process attempts to use 100% CPU, it's doing more than the real-time stuff and should be demoted to interactive, of course.
For autodetecting, I can think of the following scheme for dealing with multimedia: provide a feedback path from audio/video drivers to the scheduler. Then, if a program opens the audio driver, give it the "soft" class, unless it seems like it's going to miss audio buffers all the time anyway. With video it's more complicated, but detecting heavy amounts of periodic redraw would give a good indication of animation, which should be given the "soft" class as well. Of course, this kind of categorization must be done on the fly, such that a program drops back to normal interactive after it stops doing animation.
The idea with interactives then is that if they use more than a certain burst of CPU after user interaction, they should gradually drop back to equality with other interactives, since they aren't being responsive anyway.
That's my thoughts anyway...
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.
One more thought: try compiling (on a uniprocessor) some large project (which has clean file dependencies) with make -j 2 (or even larger) and measure the time it takes (with, say, the time command), then compare with a normal (single make) build.
Especially if you artificially kill disk-caches (so files really need to be read from disk) you'll notice that running several makes is going to be faster. Reason is simple: when one process is waiting for IO, the other can compile and then the other way around. So you definitely want some amount of multi-processing for any batch job that is even partially IO bound.. and I'd say that's almost anything except pure mathematics (ray-tracing, calculating hexadecimals of pi, etc).
But yeah, might not make sense to launch 100 batch processes at once. I don't think you need dynamic priorities though. Just limit the number of processes (or start a new one each time you find yourself in idle too much). Then run them with more or less equal priorities, but order the queue from which they are started. As for CPU pre-emption, I'd say that longish timeshares (say second or two) make sense, at least if you pre-empt purely CPU bound processes when an IO bound process comes out of IO...
Kevin McGuire:
Take a look at schedtool for the ck branch of the 2.6 Linux kernels. Also when you compile a Linux kernel you get to choose between different timing mechanisms which are supposed to produce:
1. more latency; more throughput ; server ; less thread switching
2. less latency; less throughput ; desktop ; more thread switching
I agree with you.
Yep. I see it right now in the schedtool for the ck branch. It must be a working idea. Not trying to spoil your fun, but your actual idea might still be different.

Basically (for threads of this type at the same priority) the scheduler would have a FIFO queue. The first thread on the queue that can run will run until it completes or blocks, and if a thread closer to the start of the queue unblocks it pre-empts other threads in the queue.
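A minimal sketch of that FIFO policy in Python (hypothetical names; it ignores priorities, the time-used limit, and how blocking is actually detected):

```python
from collections import deque

class Thread:
    def __init__(self, name):
        self.name = name
        self.blocked = False   # True while waiting on IO etc.

class BatchQueue:
    """FIFO queue of same-priority batch threads, oldest first."""
    def __init__(self):
        self.queue = deque()

    def add(self, thread):
        self.queue.append(thread)      # new work always joins the tail

    def pick_next(self):
        # Scan from the head: the oldest runnable thread wins, so an
        # older thread that just unblocked automatically pre-empts any
        # younger thread that was running.
        for t in self.queue:
            if not t.blocked:
                return t
        return None                    # every thread is waiting

q = BatchQueue()
a, b = Thread("A"), Thread("B")
q.add(a); q.add(b)
a.blocked = True
print(q.pick_next().name)   # B runs while A waits
a.blocked = False
print(q.pick_next().name)   # A pre-empts B as soon as it unblocks
```

The pick-next scan is O(n), which is fine for a sketch; a real scheduler would keep separate runnable/blocked lists.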
Hi,
Combuster wrote: Regarding the concept, I find it creative. I do have some sidenotes though (I don't know enough of your design to know all of the contributing factors; actually I think you'd have considered them already).

Brendan wrote: ...batch processing, where the oldest thread that can run is run and gets all CPU time (but where the scheduler monitors time used and terminates a thread if it uses a total of more than N ms).

Combuster wrote: 1: How do you plan to have threads identify themselves as batch or concurrent? Doing so in code would be best but requires changes when porting existing programs (binutils, anyone?); doing so in the running program might result in marking threads as batch that are going to exceed execution time, or incorrect categorizing in general.

For my system the scheduling policy and priority is explicitly set by the code that spawns a thread (and by the code that creates a new process, for the first thread of a process). A thread can also change its policy and priority at any time. Each process has a maximum policy/priority that its threads are allowed to use. For example, the device driver for a network card might be able to use the highest policy/priority scheduling (Policy 0) while an application would be limited so that it can't.
Combuster wrote: 2: How do you intend to schedule single large operations that will be exceeding execution limits (things like running computer vision algorithms or computing the poster rendering of the Linux kernel), and, as in #1, where do you want to regulate that?

Threads that do a large amount of processing but complete in a finite amount of CPU time would use the lowest batch processing policy (e.g. Policy 3b).
Threads that do a large amount of processing but don't complete in a finite amount of CPU time would use the lowest normal processing policy (e.g. Policy 3a). Batch processing isn't suitable for something without a definite end (like computer vision algorithms, if they repeatedly process images until the user does something to stop them).
Combuster wrote: 3: What if... the painter brought his friends? Are you going to give each painter a fence or do you have them work together on one?

It's impossible to get more than one painter to paint the same fence at the same time - they wouldn't know which painter is responsible for which part of the fence, and the resulting conflict can seriously damage the fence. In this case the fence builder would need to build 3 smaller fences instead of one large fence. Unfortunately for some types of fences there are practical difficulties that prevent builders from splitting a large fence into several smaller fences (typically where each piece of the fence relies on previous pieces).
Or to put it another way, it's impossible to get more than one CPU to run the same thread at the same time - programmers would need to split the work into several separate threads (if they can).
Brendan wrote: Policy 0 - Co-operative scheduling for device drivers (e.g. IRQ handlers), where the highest priority task in this policy gets all CPU time (but where the scheduler monitors time used and terminates the thread if it uses more than N ms).

Combuster wrote: I think some care should be taken here. Depending on the situation a crashed driver can cause serious trouble. Would a lock-up result in a panic, or do you have something more intelligent in mind?

For device drivers and system code I'll (hopefully) have built-in recovery systems, notifications for administrators, and (optional) automated emails containing diagnostic information sent to developers. Fault recovery systems in micro-kernels are possibly beyond the scope of this forum topic though...
![Wink ;)](./images/smilies/icon_wink.gif)
Colonel Kernel wrote: I don't see what real-world problem this scheme is solving. Either a process is important at any given time, or it isn't. A background task that gets preempted by a higher priority task is no longer the most important thing going on in the system (this is what priority means, after all). If you pay some context switching overhead as a result, then so be it.

The problem is if you need to do several things that are equally important, where the average time something takes to be done is what matters.
Consider an application that displays an icon and the words "Hello World". The application sends a request to a font renderer to get the unicode string "Hello World" converted to a bitmap, and sends another request to an icon renderer to get the raw icon data converted into a bitmap (where the size of both bitmaps depends on the screen resolution or window size that the application happens to be using). The application can't completely draw its screen until it gets the result from both renderers (services).
Is it better to have one of these things completed in 100 ms and the other completed after 200 ms, or is it better to have both things completed after 250 ms?
What if the application updated its display as the results are received? It could display a blank screen with 2 grey boxes (one where the icon would go and the other where the text would go). Then if the icon completes first it could display the icon and one grey box where the text should be, and when the text finally arrives it could draw everything.
For a single CPU system with batch processing you might see a blank screen for 100 ms, then a partially drawn screen for 100 ms until you get the fully drawn screen (after 200 ms). Without batch processing you might see a blank screen for 225 ms, then a partially drawn screen for 25 ms until you get the fully drawn screen (after 250 ms).
Of course it's not that simple.
On a single CPU system with 2 separate users who both try to use this application at the same time, one user might see a partially drawn screen after 100 ms and a completed screen after 300 ms; and the other user might see a partially drawn screen after 200 ms and a completed screen after 400 ms. Without batch processing one user might see a partially drawn screen after 200 ms and a completed screen after 225 ms; and the other user might see a partially drawn screen after 525 ms and a completed screen after 550 ms. In this case, with batch processing users see a partial screen after an average of 150 ms (instead of 362.5 ms) and a completed screen after an average of 350 ms (instead of 387.5 ms).
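The averages in that two-user example are easy to check (times in ms; `avg` is just a helper name I'm using here):

```python
# Per-user completion times from the two-user example above:
# "partial" = first partially drawn screen, "full" = fully drawn screen.
batch    = {"partial": [100, 200], "full": [300, 400]}
no_batch = {"partial": [200, 525], "full": [225, 550]}

def avg(xs):
    # Plain arithmetic mean over the users.
    return sum(xs) / len(xs)

print(avg(batch["partial"]), avg(no_batch["partial"]))  # 150.0 vs 362.5
print(avg(batch["full"]), avg(no_batch["full"]))        # 350.0 vs 387.5
```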
On a multi-CPU system the font renderer and icon renderer might be running on different CPUs and you might see a fully drawn screen after 100 ms and batch processing won't make a difference.
Of course it's still not that simple.
You might have many more services instead of just 2. Requests might be started at different times (so they're overlapping). The user might have virtual desktops where several applications all try to update their screens at the same time when the user switches between virtual desktops. On a multi-CPU system other CPUs might be doing more important things.
Despite all of the things that complicate this in practice, the same general principle still applies - "batch processing" improves the average time until completion for services at the same priority.
This is probably more important for my OS, where everything is implemented as asynchronous services.
You could assign priorities to each service and do all higher priority work before doing lower priority work (e.g. using a prioritized co-operative scheduler). If you ignore the problem of deciding which services are more important, it still won't work as well as "batch processing" would.
For the example above, if we assume the font renderer is more important than the icon renderer, what happens if the CPU is almost finished rendering an icon for Application A when the font renderer receives a request to convert a large amount of text into a bitmap for Application B? Even though the icon is almost ready to use, the icon renderer gets preempted by the font renderer and Application A needs to wait.
Using the fence painter analogy again, if you were a customer getting your fence painted blue and the painter was almost finished your fence, how would you feel if the painter left because someone wanted a fence painted green, and came back to finish your fence 3 hours later? Would it help if the painter told you that green fences are higher priority than blue fences, and that you should consider yourself lucky because some other customer with a red fence has been waiting for 6 hours for the last 5 minutes of work to be done on their fence?
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
- Contact:
Brendan wrote: It's impossible to get more than one painter to paint the same fence at the same time - they wouldn't know which painter is responsible for which part of the fence, and the resulting conflict can seriously damage the fence. In this case the fence builder would need to build 3 smaller fences instead of one large fence. Unfortunately for some types of fences there are practical difficulties that prevent builders from splitting a large fence into several smaller fences (typically where each piece of the fence relies on previous pieces).

Not all pairs of painters are called "Bush" and "Kerry".
The point of having two painters is that they provide an extra pair of hands: the majority of the time they will both be painting, but occasionally the second painter will go collect and prepare a new can of paint while the first painter continues painting. The first painter will be able to work at peak efficiency as he is always doing the same thing, while the second painter gets more accustomed to getting up and refilling.
The key to the performance improvement is that because more paint is used in the same time, larger cans can be used without the risk of drying out in the Californian sun, reducing the time spent refilling.
Consider hyperthreading: running one thread is poor since a lot of execution resources will go unused, running two different processes will cause the caches to be separated over the two processes, slightly impairing both processes (but potentially faster as more resources are actually used), while having two painters on one fence would be optimal: the fence (CPU cache) is shared while the work (register contents) is distributed.
Brendan wrote: Using the fence painter analogy again, if you were a customer getting your fence painted blue and the painter was almost finished your fence, how would you feel if the painter left because someone wanted a fence painted green, and came back to finish your fence 3 hours later? Would it help if the painter told you that green fences are higher priority than blue fences, and that you should consider yourself lucky because some other customer with a red fence has been waiting for 6 hours for the last 5 minutes of work to be done on their fence?

IMO, wrong argument. People just hate to see their fence partially painted in two distinct colors. It doesn't take away that having to drive between fences is more time consuming than taking on one fence at a time (but that's the argument we've seen already). I doubt processes have such an obvious negative effect (apart from said execution resources) on surrounding processes when they are not completed yet.
Brendan wrote: Threads that do a large amount of processing but complete in a finite amount of CPU time would use the lowest batch processing policy (e.g. Policy 3b).

This isn't really a satisfactory explanation: how would the shell know it's launching a huge task, or do you intend to launch all threads as low priority interactive (3a) and expect them to switch to batch if that's the preferred approach?
Hi,
Brendan wrote: It's impossible to get more than one painter to paint the same fence at the same time - they wouldn't know which painter is responsible for which part of the fence, and the resulting conflict can seriously damage the fence. In this case the fence builder would need to build 3 smaller fences instead of one large fence. Unfortunately for some types of fences there are practical difficulties that prevent builders from splitting a large fence into several smaller fences (typically where each piece of the fence relies on previous pieces).

Combuster wrote: Not all pairs of painters are called "Bush" and "Kerry". The point of having two painters is that they provide an extra pair of hands: the majority of the time they will both be painting, but occasionally the second painter will go collect and prepare a new can of paint while the first painter continues painting. The first painter will be able to work at peak efficiency as he is always doing the same thing, while the second painter gets more accustomed to getting up and refilling. The key to the performance improvement is that because more paint is used in the same time, larger cans can be used without the risk of drying out in the Californian sun, reducing the time spent refilling. Consider hyperthreading: running one thread is poor since a lot of execution resources will go unused, running two different processes will cause the caches to be separated over the two processes, slightly impairing both processes (but potentially faster as more resources are actually used), while having two painters on one fence would be optimal: the fence (CPU cache) is shared while the work (register contents) is distributed.

For the painter analogy (where a painter is a CPU and a fence is a thread), you can't have multiple painters per fence at the same time (or you can't have more than one CPU executing the same thread at the same time).

If a service uses multiple threads to complete a request, then the scheduler sees separate threads and schedules each independently. If a service uses one thread per CPU, then (hopefully) this results in all CPUs doing the request, but the scheduler still sees separate batch processing threads.
Brendan wrote: Using the fence painter analogy again, if you were a customer getting your fence painted blue and the painter was almost finished your fence, how would you feel if the painter left because someone wanted a fence painted green, and came back to finish your fence 3 hours later? Would it help if the painter told you that green fences are higher priority than blue fences, and that you should consider yourself lucky because some other customer with a red fence has been waiting for 6 hours for the last 5 minutes of work to be done on their fence?

Combuster wrote: IMO, wrong argument. People just hate to see their fence partially painted in two distinct colors. It doesn't take away that having to drive between fences is more time consuming than taking on one fence at a time (but that's the argument we've seen already). I doubt processes have such an obvious negative effect (apart from said execution resources) on surrounding processes when they are not completed yet.

The key phrase is "average time until completion" - or more specifically, the average amount of time between a service receiving a request and when that request is completed.
Ignoring multi-CPU (which is identical anyway, if you replace "requests" with "requests per CPU"), let me compare different scheduling methods.
For simplicity let's assume all requests take 100 seconds, there's only 2 requests (request A and request B) and that request B starts 50 seconds after request A starts. I'll also ignore overhead caused by task switching.
For round robin scheduling (which is equivalent to variable time slice scheduling if tasks/requests are at the same priority), for the first 50 seconds the CPU is processing Request A. Then Request B is started and for the next 100 seconds the CPU does both requests (until Request A completes). For the last 50 seconds the CPU does Request B. Request A took 150 seconds, request B took 150 seconds, and the average time until completion is 150 seconds.
For prioritized scheduling where Request A is higher priority than Request B, for the first 50 seconds the CPU is processing Request A. Then Request B is started but doesn't use any CPU time because it's lower priority, and for the next 50 seconds the CPU continues doing request A (until Request A completes). For the last 100 seconds the CPU does Request B. Request A took 100 seconds, request B took 150 seconds, and the average time until completion is 125 seconds.
For prioritized scheduling where Request A is lower priority than Request B, for the first 50 seconds the CPU is processing Request A. Then Request B is started and preempts Request A, and for the next 100 seconds the CPU does request B (until Request B completes). For the last 50 seconds the CPU does Request A. Request A took 200 seconds, request B took 100 seconds, and the average time until completion is 150 seconds.
For batch processing, for the first 50 seconds the CPU is processing Request A. Then Request B is started but doesn't use any CPU time because it's the least recently started request, and for the next 50 seconds the CPU continues doing request A (until Request A completes). For the last 100 seconds the CPU does Request B. Request A took 100 seconds, request B took 150 seconds, and the average time until completion is 125 seconds.
Now for prioritized scheduling you can't guarantee that requests that are in progress will be highest priority - half the time they will be and half the time they won't be. Without knowing which priority the requests are you'd have to combine the average times for both cases, and end up with an average time until completion of 137.5 seconds.
For all cases, batch processing always gives the best average time until request completion. If you're lucky other techniques might give the same average times, but only if you're lucky.
For more complex cases you get similar results. If there's 100 requests that all take a different amount of CPU time, where each request is started at different times, and there's 8 CPUs where requests are scheduled on different CPUs, then batch processing still reduces the average time requests take to complete.
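The four cases above can be checked with a few lines of Python (times in seconds; `summarize` is a helper name I'm using here, and "time taken" is measured from each request's own start time):

```python
# Two requests, each needing 100 s of CPU; request B starts 50 s after
# request A; one CPU; task switching overhead ignored.
A_START, B_START, WORK = 0, 50, 100

def summarize(a_finish, b_finish):
    # Returns (time A took, time B took, average time until completion),
    # each measured from that request's own start time.
    a_taken = a_finish - A_START
    b_taken = b_finish - B_START
    return a_taken, b_taken, (a_taken + b_taken) / 2

# Round robin: A alone 0-50, A and B share 50-150 (A done), B alone 150-200.
print(summarize(150, 200))   # (150, 150, 150.0)

# Priority, A higher: A runs to completion at t=100, then B runs 100-200.
print(summarize(100, 200))   # (100, 150, 125.0)

# Priority, B higher: B preempts A at t=50, runs 50-150; A resumes 150-200.
print(summarize(200, 150))   # (200, 100, 150.0)

# Batch (FIFO, run to completion): same numbers as "A higher" - but
# guaranteed, because the older request always finishes first.
print(summarize(100, 200))   # (100, 150, 125.0)
```

Averaging the two priority cases gives (125.0 + 150.0) / 2 = 137.5 seconds, matching the "don't know which priority" figure above.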
Brendan wrote: Threads that do a large amount of processing but complete in a finite amount of CPU time would use the lowest batch processing policy (e.g. Policy 3b).

Combuster wrote: This isn't really a satisfactory explanation: how would the shell know it's launching a huge task, or do you intend to launch all threads as low priority interactive (3a) and expect them to switch to batch if that's the preferred approach?

A thread that starts a new process decides what policy/priority the initial thread of the new process uses (and its maximum policy/priority). When a thread belonging to an existing process spawns a new thread it decides what policy/priority the new thread initially uses. Any thread can change its policy/priority (up to its maximum policy/priority).
A shell would probably start all new processes so that the initial thread uses Policy 2A with a priority of 128, and the new process' maximum might be Policy 1A with priority 128.
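As a rough sketch of that rule (hypothetical names - policy labels like "2A" are simplified to plain numbers here, 0 being the most privileged, and I'm assuming larger priority numbers mean higher priority):

```python
# Clamp a requested (policy, priority) pair to a process' maximum.
# Lower policy number = more privileged policy; a thread may never be
# scheduled above the maximum its process was created with.

def clamp_policy(requested, process_max):
    req_policy, req_prio = requested
    max_policy, max_prio = process_max
    if req_policy < max_policy:
        return process_max                      # asked for a more privileged
                                                # policy than allowed: clamp
    if req_policy == max_policy:
        return (req_policy, min(req_prio, max_prio))
    return requested                            # less privileged: fine as-is

# Shell defaults from the text: initial thread at Policy 2, priority 128,
# with a process maximum of Policy 1, priority 128.
print(clamp_policy((2, 128), (1, 128)))   # (2, 128) - within the cap
print(clamp_policy((0, 255), (1, 128)))   # (1, 128) - clamped to the max
```

The same check would run whenever an existing thread tries to raise its own policy/priority, not just at spawn time.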
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.