Hi,
Brendan wrote:
Hyperdrive wrote: Locking in distributed systems isn't that easy.
For message passing, with careful design no locks are needed. Basically, each small/simple piece has exclusive access to its own data.
For example, imagine something like a telephone directory (a database of names, with addresses and phone numbers). You could have one thread for each letter of the alphabet, so that all the names that begin with the letter 'A' are only ever handled by thread #0, names beginning with 'B' are handled by thread #1, etc. Then (if necessary) you have a main thread that receives requests and forwards them to the appropriate worker thread. For example, you might want to add an entry for "Fred", so the main thread forwards this request to worker thread #5, and worker thread #5 creates the new entry with no locking at all, because thread #5 is the only thread that has access to the data.
[snipped]
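For concreteness, what you describe boils down to something like this (a minimal sketch with made-up names; POSIX pipes stand in for whatever mailbox mechanism the messaging layer actually provides):

[code]
#include <ctype.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define WORKERS 26

/* One request; small enough that a pipe write is atomic. */
struct request {
    char name[32];
    char phone[16];
};

static int pipes[WORKERS][2];       /* [0] = read end, [1] = write end */

/* Each worker exclusively owns its slice of the directory, so it
 * needs no locks: no other thread ever touches that data. */
static void *worker(void *arg)
{
    long id = (long)arg;
    struct request req;
    while (read(pipes[id][0], &req, sizeof req) == (ssize_t)sizeof req)
        printf("worker %ld stores %s -> %s\n", id, req.name, req.phone);
    return NULL;
}

/* The "main thread": route each request to the worker that owns
 * the first letter of the name. */
static void dispatch(const struct request *req)
{
    int id = toupper((unsigned char)req->name[0]) - 'A';
    write(pipes[id][1], req, sizeof *req);
}

int main(void)
{
    pthread_t tid[WORKERS];
    for (long i = 0; i < WORKERS; i++) {
        pipe(pipes[i]);
        pthread_create(&tid[i], NULL, worker, (void *)i);
    }
    struct request fred = { "Fred", "555-0123" };
    dispatch(&fred);                /* lands at worker #5 */
    sleep(1);                       /* crude: give the worker time to print */
    return 0;
}
[/code]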
Here you have switched the approach. That's the very mistake that is often made: the propensity to centralize things. Your main thread is a central point, the coordinator. There's no true distribution. The positive effects you gain come from having a central coordinator, not from doing message passing smartly.
Also, your approach lacks the ability to have data "as close as possible". If someone requests the phone number for "Fred", they ask your main thread, wait, and get the data. Fine. But you can't cache the data, because Fred could have moved and now have a different phone number. Every time someone wants to call Fred (or whomever), they have to ask the database.
My approach is that any node could have any data at any time: replication and migration are both allowed. Your approach allows neither.
Brendan wrote:
This should scale fairly well until the single main thread becomes a bottleneck. If that's likely you'd just have more main threads. In this case the main threads would need to synchronize things with each other (e.g. the atomic operations and load balancing would need to be coordinated, but normal requests could be done without any coordination).
Ah, here we go... The moment you decentralize, you get into trouble. Now imagine you have n main threads in a system with n nodes, each node running one main thread...
Brendan wrote:
Also, all of this could be implemented one step at a time. For example, start with the simple worker threads, then build on that. This is made easier because the main thread/s don't need to care how the worker threads operate, and the worker threads don't need to care how the main thread/s operate. The only thing that matters is that they talk the same protocol. Maybe this is just a simple example, but then maybe it's just a small piece of something larger, where the "something larger" only needs to talk the same protocol as the main thread/s and doesn't need to know anything about how the main thread/s operate (or that the worker threads even exist).
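That last point is worth making concrete: the entire coupling between the main thread/s and the workers could be as small as a single message format, e.g. (a sketch; all names are made up):

[code]
/* The only thing both sides have to agree on: */
enum msg_type { MSG_ADD, MSG_LOOKUP, MSG_REPLY, MSG_NOT_FOUND };

struct msg {
    enum msg_type type;
    char          name[32];
    char          phone[16];    /* used by MSG_ADD and MSG_REPLY */
};
[/code]

As long as both sides keep speaking this format, either one can be rewritten, split, or replicated without the other ever noticing.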
Brendan wrote:
For the same thing with distributed transactional shared memory, if each computer shares the same data then you end up with false sharing,
Admittedly, false sharing is a severe problem.
Brendan wrote: and even if you made everything 4 KiB large to avoid that
That won't fix it in all cases, but that's another story...
Brendan wrote: then you'd still be constantly waiting for pages of data to be transferred over the network.
Why? You can replicate (i.e. cache) data. Only if the page has been invalidated do you have to request it again. Writes happen less often than reads. And more often than not, only a few nodes write at all. So collisions are rare and the data is mostly served from the cache.
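In code, the common case I'm describing looks roughly like this (a sketch; fetch_from_owner and invalidate_other_replicas are hypothetical stand-ins for the DSM runtime's network operations):

[code]
#include <stddef.h>

enum page_state { INVALID, SHARED, EXCLUSIVE };  /* no copy / cached read-only / writable */

struct page {
    enum page_state state;
    unsigned char   data[4096];
};

/* Hypothetical stubs for the runtime's network traffic. */
static void fetch_from_owner(struct page *pg)          { pg->state = SHARED; }
static void invalidate_other_replicas(struct page *pg) { (void)pg; }

/* Reads are served from the local copy unless it was invalidated. */
const void *page_read(struct page *pg)
{
    if (pg->state == INVALID)
        fetch_from_owner(pg);           /* rare network round trip */
    return pg->data;
}

/* Writes are the expensive case - but also the rare case. */
void page_write(struct page *pg, size_t off, unsigned char byte)
{
    if (pg->state != EXCLUSIVE)
        invalidate_other_replicas(pg);  /* other nodes drop their copies */
    pg->state = EXCLUSIVE;
    pg->data[off] = byte;
}
[/code]

Almost every read hits the local copy; only a write, or the first read after an invalidation, touches the network.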
Brendan wrote: Of course you could split things apart (just like with the messaging example) to avoid this, but then there's no need for any shared memory or any transactional memory in the first place (except for creating a "performance crippled" emulation of messaging).
There's no need to split things apart. If data is shared, then communication is needed (and you get higher latencies) - regardless of whether messaging or distributed memory is used. If the data isn't used by anyone else (i.e. no sharing), there's no communication - again regardless of whether messaging or distributed memory is used.
Brendan wrote:
To make it worse, the first version of the code might be both easy to write and correct, but it'll also be unusable due to horrendous performance, and the programmer won't know what to do to fix it because the OS is hiding all the details (and eventually after several rewrites, the unfortunate programmer will come to realise there's nothing they can do to make the performance acceptable, and that adding more hardware just makes things worse).
Well, I think for a first write the performance would be "acceptable". And I admit that programmers need a good understanding of how the TDM works to improve performance. And yes, the room for improvement is somewhat limited because there is a high level of abstraction. So the OS has to provide as many optimizations as possible.
And if nothing else helps, you can always fall back to providing weaker consistency models (transactional consistency is a very strong one). Then the application can decide which consistency model fits and use that one. Or better: provide a simple (!) interface to plug in consistency models. That's not impossible - we have plans to do so.
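Such a plug-in interface might look roughly like this (purely hypothetical - as I said, we only have plans so far):

[code]
/* A consistency model is just a set of hooks the runtime invokes
 * around memory operations (all names hypothetical). */
struct consistency_model {
    const char *name;
    void (*on_read)(void *addr);    /* called before serving a read   */
    void (*on_write)(void *addr);   /* called before applying a write */
    int  (*on_commit)(void);        /* end of transaction; 0 = success */
};

/* Transactional consistency would validate on every conflicting access;
 * a weaker model could make on_read a no-op and only check at commit. */
[/code]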
Brendan wrote: Basically to me it looks like you've done everything possible to make programming easier, which is why I'm curious about the scalability/performance...
Hyperdrive wrote:
Around 80% of the IT costs in a company are due to buggy software. (That's just my estimate, I don't have figures at hand, but see for example http://www.nist.gov/public_affairs/releases/n02-10.htm?). Bugs come, among other things, from overly complex software. And distributed environments are even more complex. So the main idea is to reduce complexity, i.e. to make programming easier. By reducing complexity for the programmer, it's very likely we'd see a significant decrease in bug counts. Lowering bug counts means less money wasted.
I agree - using a large/expensive server instead of a distributed system would reduce complexity. Complexity is only needed for better performance (e.g. if the application needs more processing power than one expensive server can provide).
A large/expensive server is just the same thing, only very tightly coupled. The same arguments apply.
Brendan wrote:
NIST takes a different approach - to reduce the cost of buggy software, make it easier to find bugs. From your own link:
NIST wrote: The path to higher software quality is significantly improved software testing. Standardized testing tools, suites, scripts, reference data, reference implementations and metrics that have undergone a rigorous certification process would have a large impact on the inadequacies currently plaguing software markets. For example, the availability of standardized test data, metrics and automated test suites for performance testing would make benchmarking tests less costly to perform. Standardized automated testing scripts, along with standard metrics, also would provide a more consistent method for determining when to stop testing.
By providing the link to the NIST report I aimed to substantiate my argument that buggy software is incredibly costly (using the headline as an "eye-catcher"). As you said, they take a different approach. I don't think they are doing the right thing.
They don't solve the problem, they just propose better tools to see the problem more clearly. Read as: they don't prevent bugs from being made, they just put more effort into finding the bugs. Once you have to do extensive testing, you've already lost the war. NIST just closes the stable door more loudly than others after the horse has bolted. [Hm... As a non-native speaker I hope you'll understand this proverb variation... Do you?]
Brendan wrote:
Hyperdrive wrote: The other way around -- increasing performance gets harder the more you do it. It's relatively easy to get a 30% increase when your application is performing at 50% of the theoretical maximum. But it's incredibly hard to squeeze another 1% out of your application when it's performing at 97% of the theoretical maximum. Besides, an increase in performance saves you less money than eliminating bugs.
Except the performance problem is caused by the memory model used by the OS, which isn't something an application can fix. If an application programmer does everything they possibly can to improve performance but the application still only performs at 10% of theoretical maximum performance, then is it realistic to ask them to rewrite the OS and the application to squeeze another 85% out of it?
I don't think that's the case. I agree that message passing could outperform the TDM (if well done), but I don't expect such horrible performance from the TDM. Granted, I have no proof - but you don't have one either.
Brendan wrote:
Hyperdrive wrote: Bottom line: good software is better than good performance. So making programming easier is better than pushing the envelope in terms of performance.
For desktop systems, I agree completely. The type of distributed system you're designing isn't used for desktop systems - it's mainly used for HPC. On average, programmers that write software for desktop use are "average". Programmers that write software for large servers are typically "above average", and programmers that write software for HPC are "above above average" (in that they've been writing software that scales to > 1000 nodes for a decade or so already). I'm not convinced that these "above above average" programmers really need something that makes programming easier at the expense of performance.
Why not use it for desktop systems? We have a virtual 3D world as a use case on top of our TDM. That's a "desktop application", isn't it? I would expect there are more applications for which a TDM is an excellent approach.
For HPC you need the "above above average" programmers because they can handle the complexity the programming model imposes if you want to write well-scaling software. But they are human too, and so the software they write has bugs, too. I'd expect "above above average" programmers could also write well-scaling TDM software - but they wouldn't have such a hard time getting it right, and there would be fewer bugs.
And as I said, performance isn't everything. Does the best performance always justify a long development time + a lot of money for paying the "above above average" programmers (they don't work for free) + downtime due to bugs (that's the very costly part)?
Brendan wrote:
Hyperdrive wrote:
Brendan wrote:
In all cases, a sane programmer will split a large/complex application into small/simple pieces. For message passing, each small/simple piece has its own small/simple protocol - it only becomes complex if you fail to split it into manageable pieces. However, this depends a lot on what sort of messaging is used - asynchronous messaging is more complex than synchronous messaging (but has better performance/scalability).
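As an aside, the synchronous/asynchronous distinction can be made concrete like this (hypothetical primitives, not anyone's real API):

[code]
struct msg { int type; char payload[56]; };
typedef int worker_t;

/* Hypothetical messaging primitives. */
extern struct msg send_and_wait(worker_t w, const struct msg *req); /* blocks      */
extern unsigned   send_async(worker_t w, const struct msg *req);    /* returns now */
extern int        poll_reply(unsigned id, struct msg *out);         /* 1 = arrived */
extern void       handle_reply(const struct msg *reply);
extern void       do_other_work(void);

/* Synchronous: reads like a function call, easy to reason about,
 * but the sender sits idle while the request is in flight. */
void sync_style(worker_t w, const struct msg *req)
{
    struct msg reply = send_and_wait(w, req);
    handle_reply(&reply);
}

/* Asynchronous: many requests can be in flight at once, which scales
 * better, but the program must now track outstanding requests and
 * cope with replies arriving in any order. */
void async_style(worker_t w, const struct msg *req)
{
    unsigned id = send_async(w, req);
    struct msg reply;
    while (!poll_reply(id, &reply))
        do_other_work();            /* overlap communication with work */
    handle_reply(&reply);
}
[/code]

That bookkeeping is exactly the extra complexity asynchronous messaging buys its performance with.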
The pieces themselves are manageable. But by splitting, you often increase the need for communication in distributed environments. As your application grows, it gets harder to do it right: Which component needs data? Which data? When? How fast (timeouts)? How is it reachable? And even if you can answer all that, can you guarantee that the whole system has a consistent view?
Take WoW, for example, which uses message passing. Why did they choose a client/server approach? Every avatar movement is reported to the server, and the other clients rely on the view the server provides. They could have chosen to do it more like peer-to-peer, but they didn't. Why? It's just too hard. Maintaining consistency with message passing is incredibly hard, and implementing it is very error-prone.
Peer-to-peer only works if you can trust the peers, and for WoW there's probably plenty of people willing to hack the game's code to get an advantage. Client/server means the (trusted) server can double check everything that it receives from the (untrusted) clients to make sure nothing dodgy is going on.
Sorry, I don't think that's the point. It's an excuse - and not a bad one - but nothing more. Apart from the trust/security issues, I strongly doubt they could have gotten the whole thing right in a fully distributed fashion.
--TS