Hi,
mystran wrote:Brendan, what are you actually trying to do? I mean, what is the fundamental goal of this all?
The main goal is scalability.
In general, as a system becomes more scalable it tends to use more tasks that each do less work, which means the scheduler also has to deal with a larger number of smaller tasks.
Traditional schedulers are designed for a smaller number of threads that each do more work, and therefore additional scheduling algorithms like "batch processing" would be less effective for those systems.
For an over-simplified example, consider this:
Code: Select all
int find_sum(int max) {
    int i;
    int result = 0;

    for(i = 0; i < max; i++) {
        result += i;
    }
    return result;
}
This is standard single-threaded code that doesn't scale at all. To make this more scalable, you could try something like:
Code: Select all
int find_sum(int max) {
    int workers;
    int range;
    int i;
    threadID *threadList;
    int result = 0;

    /* Use one worker thread per spare CPU, but don't bother spawning
       workers for very small ranges */
    workers = get_number_of_CPUs() - 1;
    if(max / 4 < workers) {
        workers = max / 4;
    }
    if(workers == 0) {
        for(i = 0; i < max; i++) {
            result += i;
        }
        return result;
    }

    /* Spawn the workers and send each one its share of the range */
    range = max / (workers + 1);
    threadList = malloc( sizeof(threadID) * workers);
    for(i = 0; i < workers; i++) {
        threadList[i] = spawnThread( &workerThread );
    }
    for(i = 0; i < workers; i++) {
        sendMessage(threadList[i], SUM, i * range, (i + 1) * range);
    }

    /* Do the left-over part of the range in this thread */
    for(i = workers * range; i < max; i++) {
        result += i;
    }

    /* Collect the partial results from the workers */
    while(workers > 0) {
        getMessage();
        if(message.type == WORKER_RESULT) {
            result += message.data1;
            workers--;
        }
    }
    free(threadList);
    return result;
}
void workerThread(void) {
    threadID client;
    int start = 0;
    int end = 0;
    int i;
    int result = 0;

    /* Wait for the SUM message that tells us which range to add up */
    while(end == 0) {
        getMessage();
        if(message.type == SUM) {
            client = message.senderID;
            start = message.data1;
            end = message.data2;
        }
    }
    for(i = start; i < end; i++) {
        result += i;
    }
    sendMessage(client, WORKER_RESULT, result);
    terminateThread();
}
So now you've got code that scales to the number of CPUs in the computer. This is good, but what about other computers?
For a distributed system, you could create a (multi-threaded) service that calculates the sum of a range of numbers, and then run this service on each computer. In this case you could have a main thread that sends a range of numbers to each service/computer and combines the replies into a final result. Of course then you could add redundancy - send each range of numbers to 2 different computers and check to make sure both computers return the same result.
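For example, the main thread for that distributed version might look something like the rough sketch below. It reuses the same hypothetical messaging primitives as the code above (sendMessage(), getMessage(), the global "message" structure), and the findSumServices() call and serviceID type are invented purely for illustration:
Code: Select all
int find_sum_distributed(int max) {
    int computers;
    int range;
    int outstanding;
    int i;
    int result = 0;
    serviceID *serviceList;

    /* Ask for the list of "sum of range" services, one per computer.
       findSumServices() and serviceID are made up for this sketch. */
    computers = findSumServices(&serviceList);
    if(computers == 0) {
        for(i = 0; i < max; i++) {
            result += i;
        }
        return result;
    }

    /* Send each computer its share of the range; the last computer
       also gets the left-over part (redundancy is omitted here) */
    range = max / computers;
    for(i = 0; i < computers; i++) {
        int start = i * range;
        int end = (i == computers - 1) ? max : (i + 1) * range;
        sendMessage(serviceList[i], SUM, start, end);
    }

    /* Combine the replies into the final result */
    outstanding = computers;
    while(outstanding > 0) {
        getMessage();
        if(message.type == WORKER_RESULT) {
            result += message.data1;
            outstanding--;
        }
    }
    return result;
}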
Note: I know this is a stupid example. Rather than using brute force, a little thinking can go a long way (e.g. "result = max * (max - 1) / 2").
os64dev wrote:Brendan wrote:Policy 1b and Policy 2B make more sense once you understand that "terminate" isn't the same as "preempt".
Ok, i thought that you would put the batch back into the batch queue after the time-slice is finished but it seems that the time you give to the batch is actually its life time. So after that time the batch dies whether it finished its work or not. How do you handle the early death situation then? Because the batch has to be submitted again i guess.
I use a system for "critical errors".
When code does something wrong and the kernel can't just return an error code, it causes a critical error. This includes exceptions, but also other detectable problems: read or write errors on memory-mapped files, read errors on swap space, and exceeding time limits set by the scheduler.
When a critical error occurs the kernel adds some details in the kernel log and may do some optional things (create a core dump, send an email to the developers, etc). In any case the thread itself is always terminated. Each thread has a "has recovery" flag. If this flag is set for the terminated thread and the process has a valid "recovery thread", then the kernel sends details of the critical error to the recovery thread (and the recovery thread is responsible for determining what happened, and either recovering from the mess left behind or terminating the rest of the process). Otherwise (if the flag is clear for the terminated thread or the process doesn't have a valid recovery thread), the kernel terminates the entire process.
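As a rough sketch of that decision (all of the helper names and structures here are invented for illustration, not my kernel's real API):
Code: Select all
void handle_critical_error(thread_t *thread, error_info_t *info) {
    process_t *process = thread->owner;

    log_critical_error(info);              /* always record it in the kernel log */
    do_optional_actions(process, info);    /* core dump, email to developers, ... */
    terminate_thread(thread);              /* the faulting thread always dies */

    if(thread->has_recovery && process->recovery_thread != NULL) {
        /* Let the process's recovery thread decide whether to clean up
           the mess or terminate the rest of the process itself */
        send_error_details(process->recovery_thread, info);
    } else {
        /* No recovery possible - take down the whole process */
        terminate_process(process);
    }
}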
Other threads (belonging to other processes) that may have been waiting for results can elect to receive "obituaries" (where, if any thread or process is terminated, the kernel sends a message saying so), and use these obituaries to detect when something they're waiting for has been terminated. Alternatively, other threads can use time-outs (i.e. if a thread hasn't received a reply in N seconds it can retry). This sort of thing is mostly required anyway, as something like unplugging a network cable can cause undeliverable messages.
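For example, a client thread might combine both approaches, something like this sketch (the message types and the getMessageWithTimeout() call are invented for illustration):
Code: Select all
int wait_for_reply(threadID server, int timeout, int *reply) {
    for(;;) {
        /* Wait for the next message, but give up after "timeout" */
        if(getMessageWithTimeout(timeout) == TIMED_OUT) {
            return ERROR_TIMEOUT;                    /* caller can retry or give up */
        }
        if(message.type == REPLY && message.senderID == server) {
            *reply = message.data1;
            return OK;
        }
        if(message.type == OBITUARY && message.data1 == server) {
            return ERROR_SERVER_DIED;                /* server thread was terminated */
        }
    }
}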
Cheers,
Brendan