Distributed shared memory is definitely possible, assuming your threads have separate localities, so they don't request each other's pages all the time. Even then it's possible, but it makes no sense.
In fact, 'distributed shared memory' might be a pretty good term to try on Google. I remember some old research operating systems doing something similar to what you are asking for, and there are surely a lot of younger projects.
In fact, try 'distributed operating systems' too.
I'm not going to give any particular names here, because (1) I don't remember what system does what, and (2) I'm too lazy to check anything right now.
The following chunk of random text, however, explains how it works. It's written for moderately technical people who can't immediately think of a solution. Skip it if you don't care (or don't expect to understand, whatever).
---
On a regular multiprocessor system, where each processor has its own memory cache, a processor caches the physical memory it accesses. When two processors attempt to access the same piece of memory, a cache coherency protocol is used, so that the processors can't be writing to different cached copies of the same memory at the same time.
One simple way to do this is to let any processor read any memory, provided that no processor is writing to that same memory, and to let only one processor at a time write to a piece of memory. So at any given moment there is either any number of read-only copies OR a single read-write copy.
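To make that concrete, here's a tiny bookkeeping sketch for one page under that policy. All the names (page_state, acquire_read and friends) are invented for illustration, and a real protocol would invalidate the remote copies instead of just telling the caller to wait:

```c
/* Multiple-readers / single-writer bookkeeping for one page (sketch). */
#include <assert.h>
#include <stdio.h>

enum state { INVALID, SHARED, EXCLUSIVE };

struct page_state {
    enum state st;
    int readers;        /* number of read-only copies handed out */
};

/* Grant a read-only copy unless someone holds the page for writing. */
static int acquire_read(struct page_state *p)
{
    if (p->st == EXCLUSIVE)
        return 0;                /* caller must wait */
    p->st = SHARED;
    p->readers++;
    return 1;
}

/* Grant the single read-write copy only when no other copies are out. */
static int acquire_write(struct page_state *p)
{
    if (p->st != INVALID)
        return 0;                /* outstanding copies must go away first */
    p->st = EXCLUSIVE;
    return 1;
}

static void release_read(struct page_state *p)
{
    if (--p->readers == 0)
        p->st = INVALID;
}

static void release_write(struct page_state *p)
{
    p->st = INVALID;
}

int main(void)
{
    struct page_state p = { INVALID, 0 };

    assert(acquire_read(&p));    /* two readers at once are fine */
    assert(acquire_read(&p));
    assert(!acquire_write(&p));  /* writer must wait for the readers */
    release_read(&p);
    release_read(&p);
    assert(acquire_write(&p));   /* now the single writer gets in */
    assert(!acquire_read(&p));   /* and readers are locked out */
    release_write(&p);

    puts("ok");
    return 0;
}
```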
In virtual memory environments, pages of physical memory are used as a cache for pages of virtual memory kept in some secondary storage like a hard disk. When processors use the same physical memory, it is a single cache and we don't need to care, since the processors have a built-in system (the processor caches) to deal with such stuff at the next level.
But since standard virtual memory hardware and software support allows us to add and remove pages to and from processes at will, without the processes noticing (unless they guess it from how much time has elapsed), we can use (for example) the multiple-readers/single-writer protocol between the virtual storage and the physical memory.
So now, within the cluster, any number of machines can be reading the same page, if nobody is writing to it. If somebody is writing to it, then the others can't have it present, and any process touching it there needs to wait.
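If you want to play with the mechanism on a normal Linux box, the usual user-level trick is to map the "shared" region with no access rights and catch the resulting SIGSEGV. Below is a minimal sketch of that, with no actual network in it: the "fetch from the owner" is faked with a memset, and error handling and signal-safety subtleties are glossed over.

```c
/* User-level page-fault trick a DSM layer could use (Linux-only sketch). */
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static size_t page_size;

static void fault_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    char *page = (char *)((uintptr_t)si->si_addr & ~(uintptr_t)(page_size - 1));

    /* A real DSM system would now ask the page's current owner for its
     * contents over the network; here we just fill the page locally. */
    mprotect(page, page_size, PROT_READ | PROT_WRITE);
    memset(page, 'A', page_size);
}

int main(void)
{
    page_size = (size_t)sysconf(_SC_PAGESIZE);

    /* The "shared" region starts with no access rights: any touch traps
     * into fault_handler, exactly like a page that isn't present. */
    char *region = mmap(NULL, page_size, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    /* The first access faults, the handler "fetches" the page, and the
     * load is transparently retried by the hardware. */
    printf("first byte after fetch: %c\n", region[0]);
    return 0;
}
```

The process never sees anything but a slightly slow memory access, which is the whole point.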
The main complication is that you need to implement the coherency protocol on top of slow (annoying) and unreliable (problematic) network links. But if we accept the (potentially prohibitive) performance penalty for writing heavily accessed parts of memory, and assume that no machine will crash and no network link will break, then this is pretty trivial to implement. In theory at least.
Now we can have the same virtual memory area visible on multiple machines, without the code using it really having to care. Because the code using it need not care, it need not care which machine it runs on either. And since a process is mainly an area of virtual memory plus some threads, we can have the processes distributed, and the threads jump from machine to machine as the operating system sees fit.
Notice that we only need to make a copy over the network when we want to read (or write to) a piece of memory that some other process has written to since we last copied it (if ever).
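One cheap way to track that is a per-page version number: whoever manages the page bumps it on every write-back, and a machine only fetches when the version it remembers is stale. The names here (directory_entry, node_copy, need_fetch) are made up for the sketch:

```c
/* Only fetch a page over the network if it changed since our last copy. */
#include <stdbool.h>
#include <stdio.h>

struct directory_entry {      /* kept by whoever manages the page */
    unsigned version;         /* bumped on every write-back */
};

struct node_copy {            /* kept by each machine holding a copy */
    unsigned version_seen;    /* version of our cached copy, 0 = none */
};

static bool need_fetch(const struct directory_entry *dir,
                       const struct node_copy *local)
{
    return local->version_seen != dir->version;
}

int main(void)
{
    struct directory_entry dir = { .version = 1 };
    struct node_copy us = { .version_seen = 0 };

    printf("fetch? %d\n", need_fetch(&dir, &us));  /* 1: never copied */
    us.version_seen = dir.version;                 /* copy it over */
    printf("fetch? %d\n", need_fetch(&dir, &us));  /* 0: still current */
    dir.version++;                                 /* somebody wrote it */
    printf("fetch? %d\n", need_fetch(&dir, &us));  /* 1: our copy is stale */
    return 0;
}
```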
Of course you don't need to make it transparent, but in any case it's possible, and it can actually be practical too.
Also, the above text assumes the multiple-readers/single-writer policy, which is easy to understand, simple to implement, and provides total coherency. If we are willing to increase complexity and/or relax the coherency constraints, then we can use policies which need even less synchronization.
Example:
Relax the policy so that updates need not be visible elsewhere immediately. Now we only need "at most one writer" synchronization. Readers can continue reading the old data. When the writer is happy, or before giving write access to some other process, we replace all copies with the new data. Now you might not see the latest contents, but for some purposes that isn't important.
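A stand-alone sketch of that relaxed scheme, with the network replaced by two local buffers (published is what readers see, scratch is the writer's private copy; both names are invented):

```c
/* Relaxed scheme: writer edits a private copy, readers keep the old one,
 * and the new contents are published only when the writer releases. */
#include <stdio.h>
#include <string.h>

#define PAGE 16

static char published[PAGE] = "old contents";   /* what readers see */
static char scratch[PAGE];                      /* writer's private copy */
static int  writer_active;

static int begin_write(void)
{
    if (writer_active)
        return 0;                   /* at most one writer at a time */
    writer_active = 1;
    memcpy(scratch, published, PAGE);
    return 1;
}

static void end_write(void)         /* "the writer is happy": publish */
{
    memcpy(published, scratch, PAGE);
    writer_active = 0;
}

int main(void)
{
    begin_write();
    strcpy(scratch, "new contents");
    printf("reader sees: %s\n", published);  /* still "old contents" */
    end_write();
    printf("reader sees: %s\n", published);  /* now "new contents" */
    return 0;
}
```

With real pages and real machines, end_write is where the replacement copies (or invalidations) would go out over the network.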
---
So basically, "distributed shared memory" == "threads of the same process running on different machines".
And if it's made totally transparent, then you can run normal multithreading code distributed over a network.