Page 1 of 3
BCOS: Threads
Posted: Thu Jan 19, 2006 1:15 am
by Kevin McGuire
Hey,
Brendan wrote:
For example, your webserver could have 4 classes, "main", "connection", "cache" and "log". The main class and the connection class would be at the same priority, the cache class would be at higher priority and the log class would be at a lower priority. On a single CPU computer this works out to 3 threads, one for each priority.
On a computer with 2 CPUs, you only need one thread for the log class as it's not that important and there's only one instance of it. The cache class can be split into 2 threads where each thread caches roughly one half of the data (put the file name through a hash function to determine which file is cached by which thread). For the remaining classes I'd have one thread for the main class (only one instance) and half of the connections and another thread for the other half of the connections (with some sort of load balancing to determine which thread handles each new connection).
For a computer with 4 CPUs, you still only need one thread for the log class, the cache class would be split across 4 threads, and I'd probably go for one thread for the main class (only one instance) and 3 threads for the connections.
This example probably isn't too good as I'm not that familiar with the internals of a web server, and I do realise standard high level languages might have trouble with it, but it should illustrate the point.
You are very brilliant person with extraordinary creativity to originate such a unparagoned idea. I was curious about more details, as to how this works, and some examples of advantages or disadvantages? It sounds like a intriguing concept. How do these classes communicate with one another? And, if one was running on two threads how would it share data - I would enjoy a example, but I would love a exemplification of a standard or protocol for the OOP on you're system by you. Since, as you stated not everyone who writes something for you're system does not have to follow this architecture.
Re:BCOS: Threads
Posted: Thu Jan 19, 2006 5:42 am
by Brendan
Hi,
First, think of the "operating system" as a system of threads connected via. IPC. For the IPC, let's use messaging where the function to send a message looks like this:
Code: Select all
int send_message(THREADID receiver, U32 function, void *data, U32 data_size);
And there'd be a function to get a message:
Code: Select all
int get_message(THREADID *sender, U32 *function, void *buffer, U32 *data_size);
Both of these functions return an error code directly to the caller. These messaging functions aren't the same as those I use, but they are close enough.
In the middle of each thread would be a message handler, for example:
Code: Select all
my_thread() {
THREADID *sender;
U32 *function
U32 *data_size
void *buffer;
buffer = malloc(MESSAGE_BUFFER_SIZE);
for(;;) {
if( get_message(&sender, &function, buffer, &data_size) == OK) {
switch(function) {
case FOO:
break;
case BAR:
break;
default:
}
}
}
}
Now, imagine that every object (or each instance of a class) runs as a seperate thread. There'd be an "object identifier" that is the same as the thread ID, and methods would be "called" via. IPC. When a new object is created, you'd just spawn a thread. In this case, the class (or the code for the thread) becomes something like:
Code: Select all
my_class() {
THREADID *sender;
U32 *function
U32 *data_size
void *buffer;
OBJECT *object_data;
buffer = malloc(MESSAGE_BUFFER_SIZE);
object_data = malloc( sizeof(OBJECT) );
for(;;) {
if( get_message(&sender, &function, buffer, &data_size) == OK) {
switch(function) {
case MY_CLASS_METHOD1:
break;
case MY_CLASS_METHOD2:
break;
case MY_CLASS_DELETE_OBJECT:
free(object_data);
terminate_thread();
break;
default:
}
}
}
}
The next step would be to allow a single thread to handle all objects of the same class. In this case you couldn't allocate memory for the object's data during initialization, and the "object identifier" would initially be created by the thread's code and specified as part of the message data. The class (or the code for the thread) becomes something like:
Code: Select all
my_class() {
THREADID *sender;
U32 *function
U32 *data_size
void *buffer;
OBJECT *object_list[MAX_OBJECTS];
U32 highest_object_ID = 0;
U32 current_object_ID;
buffer = malloc(MESSAGE_BUFFER_SIZE);
for(;;) {
if( get_message(&sender, &function, buffer, &data_size) == OK) {
switch(function) {
case MY_CLASS_METHOD1:
current_object_ID = (U32 *)buffer;
break;
case MY_CLASS_METHOD2:
current_object_ID = (U32 *)buffer;
break;
case MY_CLASS_NEW_OBJECT:
highest_object_ID++;
current_object_ID = highest_object_ID;
object_list[current_object_ID] = malloc( sizeof(OBJECT) );
(U32 *)buffer = current_object_ID;
send_message(sender, OBJECT_CREATED, void *buffer, 4);
break;
case MY_CLASS_DELETE_OBJECT:
current_object_ID = (U32 *)buffer;
free( object_list[current_object_ID] );
break;
default:
}
}
}
}
For this, to create a new object you'd do something like:
Code: Select all
send_message(thread_ID_for_class, MY_CLASS_NEW_OBJECT, NULL, 0);
[continued]
Re:BCOS: Threads
Posted: Thu Jan 19, 2006 5:45 am
by Brendan
[continued]
What if there's 2 or more CPUs and you want to spread the load over all of them? In this case you'd want 2 or more threads for the same class, with objects spread roughly evenly between them. The "object identifier" will need to keep track of the thread ID and the object number within the thread. To make it a little more realistic, let's make it a "person" class where each object represents a person and the person's name is used for load sharing, and an "location" class where each object represents a location.
Code: Select all
#define PERSON_CLASS_PRIORITY 123
#define LOCATION_CLASS_PRIORITY 123
typedef struct {
THREADID *thread;
U32 object_number;
} OBJECT_IDENTIFIER;
typedef struct {
U8 age;
char name[255];
} PERSON_DATA;
typedef struct {
char name[255];
} LOCATION_DATA;
U32 threads_per_class;
THREADID person_threads[MAX_CPUS];
THREADID location_threads[MAX_CPUS];
person_class() {
THREADID *sender;
U32 *function
U32 *data_size
void *buffer;
PERSON_DATA *object_list[MAX_OBJECTS];
U32 highest_object_number = 0;
U32 current_object_number;
PERSON_DATA *person;
buffer = malloc(MESSAGE_BUFFER_SIZE);
for(;;) {
if( get_message(&sender, &function, buffer, &data_size) == OK) {
switch(function) {
case PERSON_CLASS_METHOD1:
current_object_number = (U32 *)buffer;
person = object_list[current_object_number];
break;
case PERSON_CLASS_METHOD2:
current_object_number = (U32 *)buffer;
person = object_list[current_object_number];
break;
case PERSON_CLASS_NEW_OBJECT:
highest_object_number++;
current_object_number = highest_object_number;
person = malloc( sizeof(PERSON_DATA) );
object_list[current_object_number] = person;
person->age = (char *)(buffer);
strcpy(person->name, (buffer + 1) );
(U32 *)buffer = current_object_number;
send_message(sender, OBJECT_CREATED, void *buffer, 4);
break;
case PERSON_CLASS_DELETE_OBJECT:
current_object_number = (U32 *)buffer;
free( object_list[current_object_number] );
break;
default:
}
}
}
}
location_class() {
THREADID *sender;
U32 *function
U32 *data_size
void *buffer;
LOCATION_DATA *object_list[MAX_OBJECTS];
U32 highest_object_number = 0;
U32 current_object_number;
LOCATION_DATA *location;
buffer = malloc(MESSAGE_BUFFER_SIZE);
for(;;) {
if( get_message(&sender, &function, buffer, &data_size) == OK) {
switch(function) {
case LOCATION_CLASS_METHOD1:
current_object_number = (U32 *)buffer;
location = object_list[current_object_number];
break;
case LOCATION_CLASS_METHOD2:
current_object_number = (U32 *)buffer;
location = object_list[current_object_number];
break;
case LOCATION_CLASS_NEW_OBJECT:
highest_object_number++;
current_object_number = highest_object_number;
location = malloc( sizeof(LOCATION_DATA) );
object_list[current_object_number] = location;
strcpy(location->name, buffer);
(U32 *)buffer = current_object_number;
send_message(sender, OBJECT_CREATED, void *buffer, 4);
break;
case LOCATION_CLASS_DELETE_OBJECT:
current_object_number = (U32 *)buffer;
free( object_list[current_object_number] );
break;
default:
}
}
}
}
[continued]
Re:BCOS: Threads
Posted: Thu Jan 19, 2006 5:46 am
by Brendan
[continued]
Code: Select all
U32 get_hash(char *name) {
U32 hash = 0;
U32 i = 0;
while(name[i] != 0) {
hash = hash << 4 + name[i];
}
hash = hash % threads_per_class;
return hash;
}
create_person(char *name, U8 age, void *message_buffer) {
THREADID class_thread;
class_thread = person_threads[get_hash(name)];
(char *)message_buffer = age;
strcpy( (message_buffer + 1), name);
send_message(class_thread, PERSON_CLASS_NEW_OBJECT, message_buffer, (strlen(name) + 2) );
}
create_location(char *name, void *message_buffer) {
THREADID class_thread;
class_thread = location_threads[get_hash(name)];
strcpy(message_buffer, name);
send_message(class_thread, LOCATION_CLASS_NEW_OBJECT, message_buffer, (strlen(name) + 1) );
}
main() {
void *buffer;
U32 i;
buffer = malloc(MESSAGE_BUFFER_SIZE);
threads_per_class = get_CPUs();
for(i = 0; i < threads_per_class; i++) {
person_threads[i] = spawn_thread(&person_class, PERSON_CLASS_PRIORITY);
location_threads[i] = spawn_thread(&location_class, LOCATION_CLASS_PRIORITY);
}
create_person("Fred Flinstone", 33, buffer);
create_person("Barney Rubble", 32, buffer);
create_location("Quarry", buffer);
create_location("First House", buffer);
}
[continued]
Re:BCOS: Threads
Posted: Thu Jan 19, 2006 5:47 am
by Brendan
[continued]
Finally, what if both of these classes are at the same priority? It doesn't make sense to use 2 threads per CPU when you only really need one. Combining the code for both classes isn't too hard....
Code: Select all
#define COMBINED_CLASS_PRIORITY 123
typedef struct {
THREADID *thread;
U32 object_number;
} OBJECT_IDENTIFIER;
typedef struct {
U8 age;
char name[255];
} PERSON_DATA;
typedef struct {
char name[255];
} LOCATION_DATA;
U32 threads_per_class;
THREADID combined_threads[MAX_CPUS];
combined_class() {
THREADID *sender;
U32 *function
U32 *data_size
void *buffer;
PERSON_DATA *person_object_list[MAX_OBJECTS];
U32 highest_person_object_number = 0;
U32 current_person_object_number;
PERSON_DATA *person;
LOCATION_DATA *location_object_list[MAX_OBJECTS];
U32 highest_location_object_number = 0;
U32 current_location_object_number;
LOCATION_DATA *location;
buffer = malloc(MESSAGE_BUFFER_SIZE);
for(;;) {
if( get_message(&sender, &function, buffer, &data_size) == OK) {
switch(function) {
case PERSON_CLASS_METHOD1:
current_person_object_number = (U32 *)buffer;
person = person_object_list[current_person_object_number];
break;
case PERSON_CLASS_METHOD2:
current_person_object_number = (U32 *)buffer;
person = person_object_list[current_person_object_number];
break;
case PERSON_CLASS_NEW_OBJECT:
highest_person_object_number++;
current_person_object_number = highest_person_object_number;
person = malloc( sizeof(PERSON_DATA) );
person_object_list[current_person_object_number] = person;
person->age = (char *)(buffer);
strcpy(person->name, (buffer + 1) );
(U32 *)buffer = current_person_object_number;
send_message(sender, OBJECT_CREATED, void *buffer, 4);
break;
case PERSON_CLASS_DELETE_OBJECT:
current_person_object_number = (U32 *)buffer;
free( person_object_list[current_person_object_number] );
break;
case LOCATION_CLASS_METHOD1:
current_location_object_number = (U32 *)buffer;
location = location_object_list[current_location_object_number];
break;
case LOCATION_CLASS_METHOD2:
current_location_object_number = (U32 *)buffer;
location = location_object_list[current_location_object_number];
break;
case LOCATION_CLASS_NEW_OBJECT:
highest_location_object_number++;
current_location_object_number = highest_location_object_number;
location = malloc( sizeof(LOCATION_DATA) );
location_object_list[current_location_object_number] = location;
strcpy(location->name, buffer);
(U32 *)buffer = current_location_object_number;
send_message(sender, OBJECT_CREATED, void *buffer, 4);
break;
case LOCATION_CLASS_DELETE_OBJECT:
current_location_object_number = (U32 *)buffer;
free( location_object_list[current_location_object_number] );
break;
default:
}
}
}
}
U32 get_hash(char *name) {
U32 hash = 0;
U32 i = 0;
while(name[i] != 0) {
hash = hash << 4 + name[i];
}
hash = hash % threads_per_class;
return hash;
}
create_person(char *name, U8 age, void *message_buffer) {
THREADID class_thread;
class_thread = combined_threads[get_hash(name)];
(char *)message_buffer = age;
strcpy( (message_buffer + 1), name);
send_message(class_thread, PERSON_CLASS_NEW_OBJECT, message_buffer, (strlen(name) + 2) );
}
create_location(char *name, void *message_buffer) {
THREADID class_thread;
class_thread = combined_threads[get_hash(name)];
strcpy(message_buffer, name);
send_message(class_thread, LOCATION_CLASS_NEW_OBJECT, message_buffer, (strlen(name) + 1) );
}
main() {
void *buffer;
U32 i;
buffer = malloc(MESSAGE_BUFFER_SIZE);
threads_per_class = get_CPUs();
for(i = 0; i < threads_per_class; i++) {
combined_threads[i] = spawn_thread(&person_class, PERSON_CLASS_PRIORITY);
}
create_person("Fred Flinstone", 33, buffer);
create_person("Barney Rubble", 32, buffer);
create_location("Quarry", buffer);
create_location("First House", buffer);
}
[continued]
Re:BCOS: Threads
Posted: Thu Jan 19, 2006 5:50 am
by Brendan
[continued]
Now, consider what you'd do if the number of objects becomes too large. On a conventional system you'd be in trouble, but for this sort of thing all you need to do is to spawn more threads:
Code: Select all
main() {
void *buffer;
U32 i;
buffer = malloc(MESSAGE_BUFFER_SIZE);
threads_per_class = get_CPUs() * 200;
for(i = 0; i < threads_per_class; i++) {
combined_threads[i] = spawn_thread(&person_class, PERSON_CLASS_PRIORITY);
}
create_person("Fred Flinstone", 33, buffer);
create_person("Barney Rubble", 32, buffer);
create_location("Quarry", buffer);
create_location("First House", buffer);
}
By adding 5 characters to the source code, I just expanded this to handle 500 GB of data on single CPU computers, 1000 GB on dual CPU computers and 4000 GB on computers with 8 CPUs. Not too hard considering that it'd be running on a 32 bit kernel where the entire linear address space is limited to 4 GB.
Next, let's say we want to know how many people's names start with an 'F' and end with an 'e'. First you'd need to add something like this to the "combined class" code:
Code: Select all
case PERSON_SEARCH_F_AND_E:
count = 0;
for(i = 0; i < highest_person_object_number; i++) {
person = person_object_list[i];
if(person != NULL) {
if( person->name[0] == 'F' ) {
if( person->name[strlen(person->name) - 1] == 'e' ) {
count++;
}
}
}
}
(U32 *)buffer = count;
send_message(sender, SEARCH_RESULTS, void *buffer, 4);
break;
Then you'd add a search function:
Code: Select all
U32 search_for_people_F_and_E(void *message_buffer) {
THREADID class_thread;
U32 count = 0;
U32 replies = 0;
U32 i;
for(i = 0; i < threads_per_class; i++) {
class_thread = combined_threads[i];
send_message(class_thread, PERSON_SEARCH_F_AND_E, NULL, 0);
}
while(results < threads_per_class) {
if( get_message(&sender, &function, buffer, &data_size) == OK) {
if(function == SEARCH_RESULTS) {
replies++;
count += (U32 *)message_buffer;
}
}
}
return count;
}
That wasn't too hard, but now we've got any number of CPUs doing searches in parallel.
Now consider what happens if this is run on a distributed OS, and you've got a pool of 20 dual core computers. With some small changes you could spawn a pair of threads on each computer and do this search on 40 CPUs at the same time. If "COMBINED_CLASS_PRIORITY" is set to "low priority" then people using these computers for normal office work probably won't even notice when a search is happening.
Of course all of this is just demonstration code - it doesn't actually do much and there's a lot missing (error detection, security, etc). It's also harder to write than "normal" code, but the OS will handle normal code and all of the things above...
Cheers,
Brendan
Re:BCOS: Threads
Posted: Thu Jan 19, 2006 5:55 am
by kataklinger
WOW!
Too much text for five posts
Why didn't you post one file with a lot of comments
Re:BCOS: Threads
Posted: Thu Jan 19, 2006 6:10 am
by Candy
Brendan wrote:
[continued]
Ehm... I know I can write large posts, but I've never gone over 3. Maybe we should send a petition to DF to upgrade his software to some form of software that does support /LARGE/ messages?
Re:BCOS: Threads
Posted: Thu Jan 19, 2006 12:15 pm
by Kevin McGuire
Like I said, "You are very brilliant person with extraordinary creativity to originate such a unparagoned idea".
That was the best exemplification of you're architecture you could have given. I only loath it, because I did not think of it.
I absolutely luxuriate in the design, and it serves being a adventitious platform for attainable applicability in today's world of technology.
Thanks.
Re:BCOS: Threads
Posted: Thu Jan 19, 2006 3:13 pm
by Colonel Kernel
This scheme reminds me of "Apartment threading" in COM...
Re:BCOS: Threads
Posted: Fri Jan 20, 2006 1:53 am
by Pype.Clicker
Colonel Kernel wrote:
This scheme reminds me of "Apartment threading" in COM...
Yep. It's a technique that can be encountered in several object-oriented distributed software engines ... though the specific fact that you spawn one thread per CPU is something i couldn't tell whether i saw already or not.
What still remains a mystery to me is that "hashing" stuff. Does it actually "binds" a new object to a given thread/cpu as i think it does ?
Re:BCOS: Threads
Posted: Fri Jan 20, 2006 2:27 am
by Kevin McGuire
[edit] Pype does not need a new member like me repeating what he says... Let me put this at the top.
Pype, you are right. I feel it works the same way.
[edit] I need to delete this whole thing.. and go to bed.
It translates the text name into a index in the array:
THREADID combined_threads[MAX_CPUS];
He uses the MOD operator, which will never return a result larger than the number of threads created for the object.
Thats smart, I like it. It just returns a index into the array, like a fast way to load balance the threads randomly.
I still figured you could just load balance them by:
Code: Select all
cur_thread_to_send_to++;
if(cur_thread_to_send_to > total_threads)
cur_thread_to_send_to = 0;
That would seem to get better results, in case you had a buncha names that hashed it to the same thread. He must have had a reason for demostrating it.
Pype, I swear I am not trying to explain this to you. You been doing this stuff way longer than me.. Now im lost as to have even posted this.. hehe.. Im going to leave it just to acknowledge I actually read his code.
And, if you old hackers want to poke fun at me thats fine. Im expecting to see candy popup and say something some time soon.
Re:BCOS: Threads
Posted: Fri Jan 20, 2006 2:41 am
by Candy
kmcguire wrote:
Im expecting to see candy popup and say something some time soon.
Ok then. I've been reading through Brendan's megapost just now (didn't have the state of mind before) and all I can get out of it is the same basic idea as Rational Rose Realtime, plus a slight bit of using one thread for more than one object. He also codes "object-oriented classes" in C, which I consider to be kind of wasteful, plus there are default ways of doing RMI. You should read up on those things if this design appeals to you.
I consider it to be not the best design. If any of the objects wanted to do an operation with the system, it'd have to be converted to nonblocking or async operations, no object can have a big calculation loop or the rest suffers, gang scheduling is almost a requirement for acceptable performance and why reduce the amount of threads (usually for reducing the overhead of thread switching) when you're going to switch like hell because of this IPC and near-continual waits?
My idea about multithreading is to make the thread switch a very light operation, allowing threads to be put wherever you'd like them. More to the point, you can use a thread for each client if that makes your design a lot clearer. You can make a thread per file you analyse, letting them store data into a fifo that's by design multi-thread compatible, plus using a fully multithread compliant STL. That way, multithreaded design is a trivial thing to add, easier than it's now in Unix due to all standard things kind of assuming you don't multithread and a whole lot easier than in Windows, where the entire subsystem can't handle multithreading properly (so they create apartments).
Re:BCOS: Threads
Posted: Fri Jan 20, 2006 3:24 am
by Brendan
Hi,
kmcguire wrote:I still figured you could just load balance them by:
Code: Select all
cur_thread_to_send_to++;
if(cur_thread_to_send_to > total_threads)
cur_thread_to_send_to = 0;
That would seem to get better results, in case you had a buncha names that hashed it to the same thread. He must have had a reason for demostrating it.
That would distribute load more evenly and it'd also be faster. For the example code, the hash stuff isn't really necessary.
The reason I did the hash thing is that it allows for lookups by name (or by the "Primary Key" in database terminology).
For example, imagine you want a function to find a person's age, like this:
Code: Select all
int get_age_of_person(char *name);
The name itself may have been typed in by the user or something, and you don't know what the correct "object identifier" is, or if the object actually exists.
In this case you can use the hash function to find out which thread has the correct object (if it exists), and then ask that one thread to find the age of the person instead of asking all threads.
To be honest, I was thinking about caching in a web server when I started writing the example - minimizing the overhead of finding the correct cache entry from a URL.
Cheers,
Brendan
Re:BCOS: Threads
Posted: Fri Jan 20, 2006 3:30 am
by Kevin McGuire
Yeah. It makes sense not to have to switch CR3 as much as possible. It wastes CPU cycles. And, for a web-server you would want to thread it up so that clients do not sit waiting in a line, instead they start getting something right away, and using the CR3 switch is going to be a performance drain when it could have been made simplier by using the process and thread model. Even for the example he uses above, it would be smarter to not switch CR3, it is no reason to do so expect to protect threads from each other. He will definity need to put that thread to sleep or release its time-slice, it is going to waste alot of CPU just spinning waiting for data with that IPC.
But, He did say alot of things we not finished so except for that crazy thread stuff its looks like a interesting design.