OSDev.org

Posted: **Thu Nov 09, 2006 6:25 pm**

i have a bunch of computers all running the same daemon. The daemons are designed to work together. the network will get BIG BIG BIG and i need to make sure every single computer on the network gets a message. I need to make sure no computer gets it twice and that a computer does not need to keep in memory a HUGE list of ip addresses because that would eat up memory. Each computer only knows about the existence of 10 other computers due to a heartbeat.

i need to find an answer to this problem.

ANY help is appreciated.

Posted: **Fri Nov 10, 2006 12:41 am**

Hi,

iammisc wrote:i have a bunch of computers all running the same daemon. The daemons are designed to work together. the network will get BIG BIG BIG and i need to make sure every single computer on the network gets a message. I need to make sure no computer gets it twice and that a computer does not need to keep in memory a HUGE list of ip addresses because that would eat up memory. Each computer only knows about the existence of 10 other computers due to a heartbeat.

What sort of structure are these computers/daemons arranged in?

For "client-server", all clients need to know one IP address (for the server) and the server needs to know all IP addresses (for all clients). A client would tell the server to broadcast a message and the server would send it out to all clients.

For "hierarchical client-server", all computers need to know the IP address of the "master server", and the IP addresses of the computers connected "downstream". Each client would tell the master server to broadcast a message and the master server would send it to the computers directly under it, the computers directly under the master server would forward the message to computers under them, and so on until all computers have received the message. This might work well if you're using ethernet subnets and can make use of a broadcast packet (e.g. one computer on each subnet uses a broadcast packet to send the message to all other computers on the same subnet at the same time).

For "peer-to-peer" every computer needs to know the IP address of every other computer. In this case it costs you about 4 bytes per IP address per computer (e.g. for 10000 computers it costs about 40 KB of RAM on each computer, and for 1 million computers it'd be about 4 MB of RAM on each computer). Not sure how you'd define "BIG BIG BIG", but this seems practical to me (unless you're thinking of every computer on the internet). The bonus here is that there's no double handling (i.e. no need to send a message up to a server before anything is broadcast). The bigger problem is the bandwidth available to each client (e.g. if a client is using a 56 K dial-up connection then sending 1 million messages from it might take a while).

You could combine approaches though. For example, for 1 million computers you could nominate one thousand "group controllers" (where each group controller controls 1000 computers). In this case each client only needs to know 1000 IP addresses (one for each group controller). When a message is broadcast it's sent to each group controller, and each group controller sends it to each computer in it's group. In this case, if N equals the square root of the maximum number of computers supported, then you'd use N * 4 bytes of RAM on each client, N * 4 * 2 bytes of RAM on each "group controller", and each computer needs enough bandwidth to be able to send N messages in a suitable amount of time.

Of course this is only a "2 level" system. For a 3 level system, N equals the maximum number of computers to the power of one third. For a million computers, you'd have 100 "master controllers" that know about 100 "group controllers", that each know about 100 clients. By expanding on this idea you could have a "6 level system", where (for a million computers) there's 10 "level 1 controllers", that each know about 10 "level 2 controllers", that each know about 10 "level 3 controllers", etc.

By taking this idea to extremes you could end up with an "N level binary system", where there's 2 "master controllers", and each controller knows about 2 other computers below it. This would have bad latency though - the time it takes for a message to find it's way from a master controller down to the furtherest client would be the sum of the latencies of all intermediate controllers (but RAM usage and bandwidth requirements would be extremely small).

The problem with this combined approach is figuring out which computers are "controllers", how each client finds out who the top level controllers are, and how each controller finds out who it's responsible for controlling (especially for a dynamic network, and/or if you want to nominate a new controller if an existing controller dies, and/or if you want to optimise it to minimise latency/bandwidth).

For a "chaotic" configuration (no structure at all), the only alternative I can think of is for each message to have a unique identifier, and for each computer to send the message to any/all computers it knows about. If a message is received you'd compare the unique identifier with previously received messages, and ignore it if it's been received already.

Anyway, none of this probably makes much sense, but it (hopefully) will give you some ideas....

Cheers,

Brendan

Posted: **Fri Nov 10, 2006 1:03 am**

iammisc wrote:i have a bunch of computers all running the same daemon. The daemons are designed to work together. the network will get BIG BIG BIG and i need to make sure every single computer on the network gets a message. I need to make sure no computer gets it twice and that a computer does not need to keep in memory a HUGE list of ip addresses because that would eat up memory. Each computer only knows about the existence of 10 other computers due to a heartbeat.

i need to find an answer to this problem.

ANY help is appreciated.

Are you looking at something like the 'wall' command in Unix? or is this something else?

Posted: **Fri Nov 10, 2006 2:27 pm**

iammisc wrote:i have a bunch of computers all running the same daemon. The daemons are designed to work together. the network will get BIG BIG BIG and i need to make sure every single computer on the network gets a message. I need to make sure no computer gets it twice and that a computer does not need to keep in memory a HUGE list of ip addresses because that would eat up memory. Each computer only knows about the existence of 10 other computers due to a heartbeat.

Why do I get the feeling you're asking about a botnet or something similar?

Posted: **Fri Nov 10, 2006 11:24 pm**

no im not making a botnet.

Posted: **Sat Nov 11, 2006 7:02 am**

You could make an acyclic graph of TCP connections and use that for broadcasting like that. Best would be to have the links being geographically close, so you dont get the same message running over the same link many times.

Posted: **Sat Nov 11, 2006 9:35 am**

Combuster wrote:You could make an acyclic graph of TCP connections and use that for broadcasting like that. Best would be to have the links being geographically close, so you dont get the same message running over the same link many times.

A normal acyclic graph is equal to a tree. A directed acyclic graph is not what you want since some will receive duplicates.

How dense will the network be?

Posted: **Sat Nov 11, 2006 12:59 pm**

the computers are arranged in a chaotic configuration. these servers are staying on for a long time so there needs to be some way for these unique identifiers to expire.

Basically, each computer sends a udp broadcast every so often. If a computer receives this broadcast and it doesnt have enough ip addresses in its list, the computer adds the newest computer.

Posted: **Sun Nov 19, 2006 3:47 am**

I would use a Broadcast address or Multicast address

OSDev.org

making sure every computer on a network gets a message

making sure every computer on a network gets a message

Re: making sure every computer on a network gets a message

Re: making sure every computer on a network gets a message

Re: making sure every computer on a network gets a message