Hi,
kiznit wrote: Are you targeting games with this? And/or are games going to be running over this networked system? If that is the case, then my answer is that it is orders of magnitude more efficient to synchronize the game simulation than it is to send the rendering graphs over and over. That's how game engines work. That's because it is the most efficient way.
It's likely that my plans are far more convoluted than you currently imagine. I'm not sure how familiar you are with my project from other posts, or where to start, but...
Imagine a word processor. Typically (on traditional OSs) this would be implemented as a single process; for my OS it's not. For my OS you'd have a "back end" process that deals with things like file IO and managing the document's data; plus a second "front end" process that deals with the user; plus a third process that does spell check. Processes are tied to a specific computer; and by splitting applications up into multiple processes like this the load is spread across multiple computers. There are other advantages, including making it easier to have 20 different applications (written by different people) all using the same spell checker at the same time, making it easier to have 2 or more "front ends" connected to the same "back end" (for multi-user apps), making it so that one company can write a new front end even when the original word processor was a proprietary application written by a different company, etc. There are also disadvantages (different programming paradigm, network lag/traffic, a much stronger need for fault tolerance, etc).
For the sake of an example, let's assume we're designing a game like Gnomoria (for no reason other than it's the game I played most recently). Just like the word processor; this would be split into cooperating processes and distributed across a LAN. You might have one process for the user interface, one process doing physics (mostly just liquids), one doing flora (tree, plant and grass growth), one for enemy AI, one or more for path finding, one for tracking/scheduling jobs for your gnomes to perform, one for mechanisms, etc. In this way a single game may be spread across 10 computers. For most games you'll notice there are almost always one or more limits to keep processing load from growing beyond a single computer's capabilities. For Gnomoria there are two of them - map size is limited to (at most) 192*192 tiles, and there's a soft limit on population growth. By spreading load across multiple computers these limits can be pushed back - rather than limiting a kingdom to 192*192 tiles on one computer, maybe this limit could be increased to 512*512 when running on 10 computers; and rather than limiting population to ~100 gnomes, maybe you can allow 500 gnomes.
Some of these processes may be running on the same computer, and some of them might be running on different computers. The programmer only sends messages between processes without caring where the processes are. The design of applications and games (and how they're split into multiple processes) has to take this into account. Essentially you need to find pieces where the processing load is high enough to justify the communication costs.
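To make that concrete, the messaging might end up looking a little like the sketch below. Every name here (send_message(), process_id, the message layout) is a placeholder and not the real API; the point is only that the sender's code is identical whether the receiver is local or on another computer.

Code:
/* A minimal sketch (placeholder names, not the real API) of location
 * transparent messaging: the sender never knows or cares whether the
 * destination process is on this computer or another one on the LAN. */

#include <stdint.h>
#include <stdio.h>

typedef uint64_t process_id;        /* unique across the whole cluster */

typedef struct {
    uint32_t type;                  /* protocol-specific message type */
    uint32_t length;                /* bytes of payload actually used */
    uint8_t  payload[4096];
} message;

/* Stand-in for the kernel's messaging call; the kernel (not the sender)
 * decides whether this is a local delivery or goes over the network. */
static int send_message(process_id destination, const message *msg) {
    printf("message type 0x%x (%u bytes) queued for process %llu\n",
           (unsigned)msg->type, (unsigned)msg->length,
           (unsigned long long)destination);
    return 0;
}

int main(void) {
    /* The game's UI process asking a (hypothetical) physics process to
     * simulate the next chunk - same code whether "physics" happens to
     * be local or on a different computer. */
    message msg = { .type = 0x0100, .length = 0 };
    send_message(42 /* physics process */, &msg);
    return 0;
}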
You also need to design messaging protocols to reduce communication. For example, when the spell checker is started you wouldn't want to have to send a dictionary to it from another process; you'd want to send a file name or (more likely) a language code and let it get the dictionary from disk itself; and you'd want to tell it the language you want as far in advance as possible so that it (hopefully) has time to finish loading the dictionary before you start asking it to check any spelling.
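For example, the spell checker's protocol could boil down to something as small as this (an invented message layout, just to show how little actually needs to go over the wire):

Code:
/* A hypothetical wire format for the spell checker protocol: a tiny
 * language code is sent early so the checker can load its dictionary
 * from disk itself, and later messages carry only the text to check. */

#include <stdint.h>

enum spellcheck_msg_type {
    SPELLCHECK_SET_LANGUAGE = 1,    /* "start loading the en_AU dictionary now" */
    SPELLCHECK_CHECK_TEXT   = 2,    /* "are these words spelled correctly?" */
    SPELLCHECK_RESULT       = 3     /* reply: which ranges were misspelled */
};

typedef struct {
    uint32_t type;                  /* one of spellcheck_msg_type */
    char     language[16];          /* e.g. "en_AU" - tiny compared to a dictionary */
    uint32_t text_length;           /* bytes of UTF8 text that follow */
    char     text[];                /* only present for SPELLCHECK_CHECK_TEXT */
} spellcheck_msg;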
Now, a detour...
In theory; when a computer boots the OS's boot code sets a default video mode (using firmware) then (later) the OS does PCI bus enumeration, finds video cards and starts their native drivers. In practice there are no native video drivers, so the OS starts a "generic framebuffer" driver for each video card/monitor the firmware let it set up during boot, which is typically limited to one and only one monitor (because firmware isn't too fancy). Once the video driver is started it uses a file to figure out which "station" it's connected to and establishes a connection to that station's "station manager". Other human interface devices (keyboard, mouse, sound, etc) do the same. A station is basically where a user would sit. If one computer has 4 monitors and 2 keyboards, then maybe you configure one station for 3 monitors and one keyboard and another station for 1 monitor and 1 keyboard.
Of course it's a micro-kernel; all drivers are just processes, and processes communicate without caring which computer they're on. If you want multiple monitors but firmware (and lack of native drivers) limits the OS to one monitor per computer, then that's fine - just configure a station so that monitors from 2 or more computers connect to the same "station manager".
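If it helps, you can think of a station's configuration (once parsed) as something like the structure below. The field names are invented; the idea is just that a station is a named group of human interface devices, and those devices don't have to be plugged into the same computer.

Code:
/* A hypothetical in-memory form of one station's configuration. */

#include <stdint.h>

#define MAX_DEVICES_PER_STATION 8

typedef struct {
    uint64_t computer_id;           /* which computer the device is attached to */
    uint64_t device_id;             /* monitor, keyboard, mouse, sound, ... */
} station_device;

typedef struct {
    char           name[32];        /* e.g. "front desk" */
    uint32_t       device_count;
    station_device devices[MAX_DEVICES_PER_STATION];
} station_config;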
Once a station manager has enough human interface device drivers (not necessarily all of them) it does user login; and when a user logs in the station manager loads the user's preferences and starts whatever the user wants (e.g. a GUI for each virtual desktop). The station manager is the only thing that talks to human interface device drivers. Apps/games/GUI talk to the station manager. For example (for video); an app sends a description of its graphics (as a dependency graph) to the station manager, which keeps a copy and sends a copy to each video driver. If a video driver crashes (or is updated) then another video driver is started and the station manager sends it a copy of those descriptions, without the application or GUI knowing the video driver had a problem and/or was changed. Of course you can have an application on one computer, a station manager on another computer, and 2 video drivers on 2 more computers.
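In other words, the station manager is doing something like this behind the scenes (all names invented, error handling omitted - just a sketch of "keep the latest description from each app, and replay everything when a video driver is replaced"):

Code:
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint64_t app_id;
    size_t   length;
    uint8_t *description;           /* the app's dependency graph, as it was sent */
} cached_description;

typedef struct {
    cached_description *entries;
    size_t count;
} station_manager_state;

/* Called whenever an app sends (or updates) its graphics description. */
void store_description(station_manager_state *s, uint64_t app_id,
                       const uint8_t *data, size_t length) {
    for (size_t i = 0; i < s->count; i++) {
        if (s->entries[i].app_id == app_id) {
            free(s->entries[i].description);
            s->entries[i].description = malloc(length);
            memcpy(s->entries[i].description, data, length);
            s->entries[i].length = length;
            return;
        }
    }
    s->entries = realloc(s->entries, (s->count + 1) * sizeof(*s->entries));
    s->entries[s->count].app_id = app_id;
    s->entries[s->count].description = malloc(length);
    memcpy(s->entries[s->count].description, data, length);
    s->entries[s->count].length = length;
    s->count++;
}

/* Called when a replacement video driver starts: replay every cached
 * description, so apps never notice the driver was swapped out. */
void replay_to_new_driver(const station_manager_state *s,
                          void (*send_to_driver)(const uint8_t *, size_t)) {
    for (size_t i = 0; i < s->count; i++)
        send_to_driver(s->entries[i].description, s->entries[i].length);
}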
Also; there isn't too much difference between boring applications (text editor, calculator) and 3D games - they all send 3D graphics to the station manager/video driver (and if someone wants to look at your calculator from the side, I expect the buttons to protrude like "[" rather than being painted on a 2D surface and impossible to see from the side).
That should give you a reasonable idea of what the graphics API (or more correctly, the messaging protocol used for graphics) needs to be designed for.
So, latency...
Worst case latency (time between app sending anything and video driver receiving it) for a congested network can exceed 1/60th of a second. Due to the nature of the OS, you can expect the network to be fairly busy at best. Anything that involves "per frame" data going in either direction is completely unusable. The graphics is not 3D, it's 4D. An application doesn't say "chicken #3 is at (x, y, z)" it says "at time=123456 in the future, chicken #3 will be at (x, y, z)". The video driver figures out when the next frame will be displayed, calculates where objects will be at that time, then renders objects at the calculated locations. Applications/games (wherever possible) predict the future. A player presses a mouse button which starts a 20 ms "pulling the trigger" animation and the game knows a bullet will leave the gun 20 ms before it happens (and with 10 ms of lag, the video driver does the animation in 10 ms and the bullet leaves the gun at exactly the right time despite the lag).
"
An object at rest will remain at rest unless acted on by an unbalanced force. An object in motion continues in motion with the same speed and in the same direction unless acted upon by an unbalanced force". We only need to tell the video driver when an object will be created, destroyed, or when it will be acted upon by an unbalanced force. This includes the camera.
Then there's bandwidth and memory consumption...
The descriptions of what the application wants to draw need to be small, not just because they consume network bandwidth but also because the descriptions are stored in the "station manager" and in all video card drivers. Textures and meshes are big, file names are small, give the video driver file names and let it load textures/meshes from disk itself. Alpha channel bitmaps are big, UTF8 strings are relatively small, give the video driver a UTF8 string and let it get the alpha channel bitmaps from the font engine itself.
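So a draw command ends up closer to this than to a blob of pixel data (a made-up layout, just to show the size difference):

Code:
/* A hypothetical draw command: big data (textures, glyph bitmaps) is
 * referenced by name; the video driver fetches textures from disk and
 * glyphs from the font engine itself. */

#include <stdint.h>

typedef struct {
    uint32_t type;                  /* e.g. DRAW_TEXTURED_QUAD, DRAW_TEXT */
    /* instead of megabytes of pixels: */
    char     texture_file[64];      /* e.g. "textures/grass.tex" */
    /* instead of a rendered alpha channel bitmap: */
    char     utf8_text[128];        /* "Hello world" - font engine does the rest */
    char     font_name[32];
    float    x, y, z, scale;
} draw_command;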
And finally, performance...
Ask for everything to be prefetched as soon as possible. While the user is fumbling about with your game's main menu you could be loading the most recently used save game while the video driver is loading graphics data. If the user loads a different save game, you can cancel/discard the "accidentally prefetched" data. If the video driver has more important things to be doing, IO priorities are your friend. Don't just give the video driver the file name for a texture to load; also give it a default colour to use if/when the video driver hasn't been able to load the texture from disk before it needs to be displayed. For meshes do the same (but use a radius and a colour). I want to play a game now; I do not want to wait for all this graphics data to get loaded before I start blowing things up just so you can show me awesome high-poly monkey butts; and if I do want to wait, I'm not too stupid to pause the game until the video driver catches up.
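As a rough sketch (hypothetical fields), the references the app sends might look like this - note the fallback colour and the IO priority hint, so nothing ever has to block on the disk:

Code:
#include <stdint.h>

typedef struct {
    char     file_name[64];         /* texture to load from disk, eventually */
    uint32_t fallback_rgba;         /* flat colour to draw with until it's loaded */
    uint8_t  io_priority;           /* 0 = "whenever you get around to it" */
} texture_ref;

typedef struct {
    float    radius;                /* crude stand-in until the real mesh loads */
    uint32_t fallback_rgba;
    char     file_name[64];
} mesh_ref;

/* Sent while the user is still in the main menu: "you'll probably need
 * these soon, load them when you have nothing better to do". */
typedef struct {
    uint32_t    count;
    texture_ref textures[16];
} prefetch_request;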
The video driver should reduce quality to meet a frame's deadline if it has to; and graphics should be locked to the monitor's refresh rate. The app doesn't need to know when frames are due (it's using prediction) or what the screen resolution is. The video driver should also be smart enough to figure out if there's some other renderer somewhere that it can offload some work to (e.g. get them to do some distant impostors, up to just before rasterisation to avoid shoving pixels around); and don't forget about the prediction that's hiding latency (you can ask another renderer to start work 10 frames ahead if you want, and cancel later if the prediction changes). I don't want to be a guy sitting in a room surrounded by 10 computers where 9 of them are idle just because one is failing to delegate.
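The "reduce quality to meet the deadline" part might be as simple as this on the video driver's side (the cost estimates here are invented; in practice they'd come from measuring previous frames):

Code:
#include <stdio.h>

#define QUALITY_LEVELS 4

/* Estimated milliseconds to render the scene at each quality level
 * (0 = highest quality). These numbers are placeholders. */
static const double estimated_ms[QUALITY_LEVELS] = { 22.0, 14.0, 9.0, 4.0 };

static int pick_quality(double ms_until_vsync) {
    for (int q = 0; q < QUALITY_LEVELS; q++) {
        if (estimated_ms[q] <= ms_until_vsync)
            return q;               /* best quality that still meets the deadline */
    }
    return QUALITY_LEVELS - 1;      /* can't make it; render as cheaply as possible */
}

int main(void) {
    printf("12 ms left -> quality level %d\n", pick_quality(12.0));
    printf(" 3 ms left -> quality level %d\n", pick_quality(3.0));
    return 0;
}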
Now; with all of the above in mind; which do you think is better - immediate mode, or retained mode?
Cheers,
Brendan