Distributed shared memory is definitely possible, assuming your threads have separate localities, so they don't request each other's pages all the time. Even then it's possible, but it makes no sense.
In fact, 'distributed shared memory' might be a pretty good term to try on Google. I remember some old research operating systems doing something similar to what you are asking for, and there are surely a lot of younger projects.
In fact, try 'distributed operating systems' too.
I'm not going to give any particular names here, because (1) I don't remember what system does what, and (2) I'm too lazy to check anything right now.
The following chunk of random text, however, explains how it works. It's written for moderately technical people who can't immediately think of a solution. Skip it if you don't care (or don't expect to understand, whatever).
---
On a regular multiprocessor system, where each processor has its own memory cache, a processor caches the physical memory it accesses. When two processors attempt to access the same piece of memory, a cache coherency protocol is used, so that the processors can't be writing to different cached copies of the same memory at the same time.
One simple way to do this is to let any processor read any memory, provided that no processor is writing to that same memory, and to let only one processor at a time write to a piece of memory. So at any given moment there is either any number of read-only copies OR a single read-write copy.
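To make that concrete, here's a tiny bookkeeping sketch for one page under that policy. All the names (page_state, acquire_read and friends) are invented for illustration, and a real protocol would invalidate the remote copies instead of just telling the caller to wait:

```c
/* Multiple-readers / single-writer bookkeeping for one page (sketch). */
#include <assert.h>
#include <stdio.h>

enum state { INVALID, SHARED, EXCLUSIVE };

struct page_state {
    enum state st;
    int readers;        /* number of read-only copies handed out */
};

/* Grant a read-only copy unless someone holds the page for writing. */
static int acquire_read(struct page_state *p)
{
    if (p->st == EXCLUSIVE)
        return 0;                /* caller must wait */
    p->st = SHARED;
    p->readers++;
    return 1;
}

/* Grant the single read-write copy only when no other copies are out. */
static int acquire_write(struct page_state *p)
{
    if (p->st != INVALID)
        return 0;                /* outstanding copies must go away first */
    p->st = EXCLUSIVE;
    return 1;
}

static void release_read(struct page_state *p)
{
    if (--p->readers == 0)
        p->st = INVALID;
}

static void release_write(struct page_state *p)
{
    p->st = INVALID;
}

int main(void)
{
    struct page_state p = { INVALID, 0 };

    assert(acquire_read(&p));    /* two readers at once are fine */
    assert(acquire_read(&p));
    assert(!acquire_write(&p));  /* writer must wait for the readers */
    release_read(&p);
    release_read(&p);
    assert(acquire_write(&p));   /* now the single writer gets in */
    assert(!acquire_read(&p));   /* and readers are locked out */
    release_write(&p);

    puts("ok");
    return 0;
}
```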
In virtual memory environments, pages of physical memory are used as a cache for pages of virtual memory kept in some secondary storage like a hard disk. When processors use the same physical memory, it is a single cache and we don't need to care, since the processors have a built-in system (the processor caches) to deal with such stuff at the next level.
But since standard virtual memory hardware and software support allows us to add and remove pages to and from processes at will, without the processes noticing (unless they guess it from how much time has elapsed), we can use (for example) the multiple-readers/single-writer protocol between the virtual storage and the physical memory.
So now, within the cluster, any number of machines can be reading the same page, if nobody is writing to it. If somebody is writing to it, then the others can't have it present, and any process touching it there needs to wait.
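If you want to play with the mechanism on a normal Linux box, the usual user-level trick is to map the "shared" region with no access rights and catch the resulting SIGSEGV. Below is a minimal sketch of that, with no actual network in it: the "fetch from the owner" is faked with a memset, and error handling and signal-safety subtleties are glossed over.

```c
/* User-level page-fault trick a DSM layer could use (Linux-only sketch). */
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static size_t page_size;

static void fault_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    char *page = (char *)((uintptr_t)si->si_addr & ~(uintptr_t)(page_size - 1));

    /* A real DSM system would now ask the page's current owner for its
     * contents over the network; here we just fill the page locally. */
    mprotect(page, page_size, PROT_READ | PROT_WRITE);
    memset(page, 'A', page_size);
}

int main(void)
{
    page_size = (size_t)sysconf(_SC_PAGESIZE);

    /* The "shared" region starts with no access rights: any touch traps
     * into fault_handler, exactly like a page that isn't present. */
    char *region = mmap(NULL, page_size, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    /* The first access faults, the handler "fetches" the page, and the
     * load is transparently retried by the hardware. */
    printf("first byte after fetch: %c\n", region[0]);
    return 0;
}
```

The process never sees anything but a slightly slow memory access, which is the whole point.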
The main complication is that you need to implement the coherency protocol on top of slow (annoying) and unreliable (problematic) network links. But if we accept the (potentially prohibitive) performance penalty for writing heavily accessed parts of memory, and assume that no machine will crash and no network link will break, then this is pretty trivial to implement. In theory at least.
Now we can have the same virtual memory area visible on multiple machines, without the code using it really having to care. Because the code using it need not care, it need not care which machine it runs on either. And since a process is mainly an area of virtual memory plus some threads, we can have the processes distributed, and the threads jump from machine to machine as the operating system sees fit.
Notice that we only need to make a copy over the network when we want to read (or write to) a piece of memory that some other process has written to since we last copied it (if ever).
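One cheap way to track that is a per-page version number: whoever manages the page bumps it on every write-back, and a machine only fetches when the version it remembers is stale. The names here (directory_entry, node_copy, need_fetch) are made up for the sketch:

```c
/* Only fetch a page over the network if it changed since our last copy. */
#include <stdbool.h>
#include <stdio.h>

struct directory_entry {      /* kept by whoever manages the page */
    unsigned version;         /* bumped on every write-back */
};

struct node_copy {            /* kept by each machine holding a copy */
    unsigned version_seen;    /* version of our cached copy, 0 = none */
};

static bool need_fetch(const struct directory_entry *dir,
                       const struct node_copy *local)
{
    return local->version_seen != dir->version;
}

int main(void)
{
    struct directory_entry dir = { .version = 1 };
    struct node_copy us = { .version_seen = 0 };

    printf("fetch? %d\n", need_fetch(&dir, &us));  /* 1: never copied */
    us.version_seen = dir.version;                 /* copy it over */
    printf("fetch? %d\n", need_fetch(&dir, &us));  /* 0: still current */
    dir.version++;                                 /* somebody wrote it */
    printf("fetch? %d\n", need_fetch(&dir, &us));  /* 1: our copy is stale */
    return 0;
}
```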
Of course you don't need to make it transparent, but in any case it's possible, and it can actually be practical too.
Also, the above text assumes the multiple-readers/single-writer policy, which is easy to understand, simple to implement, and provides total coherency. If we are willing to increase complexity and/or relax the coherency constraints, then we can use policies which need even less synchronization.
Example:
Relax the policy so that updates need not be visible elsewhere immediately. Now we only need "at most one writer" synchronization. Readers can continue reading the old data. When the writer is happy, or before giving write access to some other process, we replace all copies with the new data. Now you might not see the latest contents, but for some purposes that isn't important.
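A stand-alone sketch of that relaxed scheme, with the network replaced by two local buffers (published is what readers see, scratch is the writer's private copy; both names are invented):

```c
/* Relaxed scheme: writer edits a private copy, readers keep the old one,
 * and the new contents are published only when the writer releases. */
#include <stdio.h>
#include <string.h>

#define PAGE 16

static char published[PAGE] = "old contents";   /* what readers see */
static char scratch[PAGE];                      /* writer's private copy */
static int  writer_active;

static int begin_write(void)
{
    if (writer_active)
        return 0;                   /* at most one writer at a time */
    writer_active = 1;
    memcpy(scratch, published, PAGE);
    return 1;
}

static void end_write(void)         /* "the writer is happy": publish */
{
    memcpy(published, scratch, PAGE);
    writer_active = 0;
}

int main(void)
{
    begin_write();
    strcpy(scratch, "new contents");
    printf("reader sees: %s\n", published);  /* still "old contents" */
    end_write();
    printf("reader sees: %s\n", published);  /* now "new contents" */
    return 0;
}
```

With real pages and real machines, end_write is where the replacement copies (or invalidations) would go out over the network.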
---
So basically, "distributed shared memory" == "threads of the same process running on different machines".
And if it's made totally transparent, then you can run normal multithreading code distributed over a network.