i have a bunch of computers all running the same daemon. The daemons are designed to work together. the network will get BIG BIG BIG and i need to make sure every single computer on the network gets a message. I need to make sure no computer gets it twice and that a computer does not need to keep in memory a HUGE list of ip addresses because that would eat up memory. Each computer only knows about the existence of 10 other computers due to a heartbeat.
i need to find an answer to this problem.
ANY help is appreciated.
making sure every computer on a network gets a message
Re: making sure every computer on a network gets a message
Hi,
For "client-server", all clients need to know one IP address (for the server) and the server needs to know all IP addresses (for all clients). A client would tell the server to broadcast a message and the server would send it out to all clients.
For "hierarchical client-server", all computers need to know the IP address of the "master server", and the IP addresses of the computers connected "downstream". Each client would tell the master server to broadcast a message and the master server would send it to the computers directly under it, the computers directly under the master server would forward the message to computers under them, and so on until all computers have received the message. This might work well if you're using ethernet subnets and can make use of a broadcast packet (e.g. one computer on each subnet uses a broadcast packet to send the message to all other computers on the same subnet at the same time).
For "peer-to-peer" every computer needs to know the IP address of every other computer. In this case it costs you about 4 bytes per IP address per computer (e.g. for 10000 computers it costs about 40 KB of RAM on each computer, and for 1 million computers it'd be about 4 MB of RAM on each computer). Not sure how you'd define "BIG BIG BIG", but this seems practical to me (unless you're thinking of every computer on the internet). The bonus here is that there's no double handling (i.e. no need to send a message up to a server before anything is broadcast). The bigger problem is the bandwidth available to each client (e.g. if a client is using a 56 K dial-up connection then sending 1 million messages from it might take a while).
You could combine approaches though. For example, for 1 million computers you could nominate one thousand "group controllers" (where each group controller controls 1000 computers). In this case each client only needs to know 1000 IP addresses (one for each group controller). When a message is broadcast it's sent to each group controller, and each group controller sends it to each computer in it's group. In this case, if N equals the square root of the maximum number of computers supported, then you'd use N * 4 bytes of RAM on each client, N * 4 * 2 bytes of RAM on each "group controller", and each computer needs enough bandwidth to be able to send N messages in a suitable amount of time.
Of course this is only a "2 level" system. For a 3 level system, N equals the maximum number of computers to the power of one third. For a million computers, you'd have 100 "master controllers" that know about 100 "group controllers", that each know about 100 clients. By expanding on this idea you could have a "6 level system", where (for a million computers) there's 10 "level 1 controllers", that each know about 10 "level 2 controllers", that each know about 10 "level 3 controllers", etc.
By taking this idea to extremes you could end up with an "N level binary system", where there's 2 "master controllers", and each controller knows about 2 other computers below it. This would have bad latency though - the time it takes for a message to find it's way from a master controller down to the furtherest client would be the sum of the latencies of all intermediate controllers (but RAM usage and bandwidth requirements would be extremely small).
The problem with this combined approach is figuring out which computers are "controllers", how each client finds out who the top level controllers are, and how each controller finds out who it's responsible for controlling (especially for a dynamic network, and/or if you want to nominate a new controller if an existing controller dies, and/or if you want to optimise it to minimise latency/bandwidth).
For a "chaotic" configuration (no structure at all), the only alternative I can think of is for each message to have a unique identifier, and for each computer to send the message to any/all computers it knows about. If a message is received you'd compare the unique identifier with previously received messages, and ignore it if it's been received already.
Anyway, none of this probably makes much sense, but it (hopefully) will give you some ideas....
Cheers,
Brendan
What sort of structure are these computers/daemons arranged in?iammisc wrote:i have a bunch of computers all running the same daemon. The daemons are designed to work together. the network will get BIG BIG BIG and i need to make sure every single computer on the network gets a message. I need to make sure no computer gets it twice and that a computer does not need to keep in memory a HUGE list of ip addresses because that would eat up memory. Each computer only knows about the existence of 10 other computers due to a heartbeat.
For "client-server", all clients need to know one IP address (for the server) and the server needs to know all IP addresses (for all clients). A client would tell the server to broadcast a message and the server would send it out to all clients.
For "hierarchical client-server", all computers need to know the IP address of the "master server", and the IP addresses of the computers connected "downstream". Each client would tell the master server to broadcast a message and the master server would send it to the computers directly under it, the computers directly under the master server would forward the message to computers under them, and so on until all computers have received the message. This might work well if you're using ethernet subnets and can make use of a broadcast packet (e.g. one computer on each subnet uses a broadcast packet to send the message to all other computers on the same subnet at the same time).
For "peer-to-peer" every computer needs to know the IP address of every other computer. In this case it costs you about 4 bytes per IP address per computer (e.g. for 10000 computers it costs about 40 KB of RAM on each computer, and for 1 million computers it'd be about 4 MB of RAM on each computer). Not sure how you'd define "BIG BIG BIG", but this seems practical to me (unless you're thinking of every computer on the internet). The bonus here is that there's no double handling (i.e. no need to send a message up to a server before anything is broadcast). The bigger problem is the bandwidth available to each client (e.g. if a client is using a 56 K dial-up connection then sending 1 million messages from it might take a while).
You could combine approaches though. For example, for 1 million computers you could nominate one thousand "group controllers" (where each group controller controls 1000 computers). In this case each client only needs to know 1000 IP addresses (one for each group controller). When a message is broadcast it's sent to each group controller, and each group controller sends it to each computer in it's group. In this case, if N equals the square root of the maximum number of computers supported, then you'd use N * 4 bytes of RAM on each client, N * 4 * 2 bytes of RAM on each "group controller", and each computer needs enough bandwidth to be able to send N messages in a suitable amount of time.
Of course this is only a "2 level" system. For a 3 level system, N equals the maximum number of computers to the power of one third. For a million computers, you'd have 100 "master controllers" that know about 100 "group controllers", that each know about 100 clients. By expanding on this idea you could have a "6 level system", where (for a million computers) there's 10 "level 1 controllers", that each know about 10 "level 2 controllers", that each know about 10 "level 3 controllers", etc.
By taking this idea to extremes you could end up with an "N level binary system", where there's 2 "master controllers", and each controller knows about 2 other computers below it. This would have bad latency though - the time it takes for a message to find it's way from a master controller down to the furtherest client would be the sum of the latencies of all intermediate controllers (but RAM usage and bandwidth requirements would be extremely small).
The problem with this combined approach is figuring out which computers are "controllers", how each client finds out who the top level controllers are, and how each controller finds out who it's responsible for controlling (especially for a dynamic network, and/or if you want to nominate a new controller if an existing controller dies, and/or if you want to optimise it to minimise latency/bandwidth).
For a "chaotic" configuration (no structure at all), the only alternative I can think of is for each message to have a unique identifier, and for each computer to send the message to any/all computers it knows about. If a message is received you'd compare the unique identifier with previously received messages, and ignore it if it's been received already.
Anyway, none of this probably makes much sense, but it (hopefully) will give you some ideas....
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: making sure every computer on a network gets a message
Are you looking at something like the 'wall' command in Unix? or is this something else?iammisc wrote:i have a bunch of computers all running the same daemon. The daemons are designed to work together. the network will get BIG BIG BIG and i need to make sure every single computer on the network gets a message. I need to make sure no computer gets it twice and that a computer does not need to keep in memory a HUGE list of ip addresses because that would eat up memory. Each computer only knows about the existence of 10 other computers due to a heartbeat.
i need to find an answer to this problem.
ANY help is appreciated.
Only Human
Re: making sure every computer on a network gets a message
Why do I get the feeling you're asking about a botnet or something similar?iammisc wrote:i have a bunch of computers all running the same daemon. The daemons are designed to work together. the network will get BIG BIG BIG and i need to make sure every single computer on the network gets a message. I need to make sure no computer gets it twice and that a computer does not need to keep in memory a HUGE list of ip addresses because that would eat up memory. Each computer only knows about the existence of 10 other computers due to a heartbeat.
A normal acyclic graph is equal to a tree. A directed acyclic graph is not what you want since some will receive duplicates.Combuster wrote:You could make an acyclic graph of TCP connections and use that for broadcasting like that. Best would be to have the links being geographically close, so you dont get the same message running over the same link many times.
How dense will the network be?
the computers are arranged in a chaotic configuration. these servers are staying on for a long time so there needs to be some way for these unique identifiers to expire.
Basically, each computer sends a udp broadcast every so often. If a computer receives this broadcast and it doesnt have enough ip addresses in its list, the computer adds the newest computer.
Basically, each computer sends a udp broadcast every so often. If a computer receives this broadcast and it doesnt have enough ip addresses in its list, the computer adds the newest computer.
I would use a Broadcast address or Multicast address
Microsoft: "let everyone run after us. We'll just INNOV~1"