Hi,
Jules wrote: The easiest way of dealing with deadlocks is to make sure they never happen. Avoid the possibility of a circular path like that. If you need to have modules A and B that are mutually dependent, split one of them into two threads that don't have those dependencies. In your specific case, making one of them use an asynchronous rather than synchronous call might work.
This is correct - either use asynchronous IPC, or use synchronous IPC but write the code so it doesn't call anything else. For example:
- A 'calls' B
- B stores something on a work queue
- B returns to A
- B gets CPU time
- B looks at its work queue
- B calls A
The problem with this approach is keeping track of what uses what, especially when the system becomes complex. For example, A uses B, which might use C, which sometimes uses D, which uses A (where each part is written by a different person who doesn't know much about the other parts). To prevent this problem it's possible to assign "levels". When code at a higher level uses code at a lower level then synchronous methods are fine, but when code at a lower level uses code at a higher level then asynchronous methods must be used. Then you only need some way for people to know what "level" their code is, rather than knowing the internal details of every (external) piece of code they rely on.
If possible, I'd also recommend adding code to detect deadlocks. For example, if A synchronously calls B and B is waiting for a synchronous reply, then you could check if B is waiting for a reply from A, or if B is waiting for C which is waiting for A, etc (i.e. follow the chain of who is waiting for who to see if anything in that chain is waiting for A). Once you've detected the deadlock you can display an error (including a list of all modules involved).
I just use asynchronous IPC for everything. This solves the deadlock problems easily, but makes other things a bit harder because you have no way of knowing what will occur when.
For example:
- A sends FOO to B
- A waits to receive a message
- B receives FOO from A and starts working on it
- B sends BAR to A
- B waits to receive a message
- A receives BAR from B and starts working on it
- A sends BAR_REPLY to B
- A waits to receive a message
- B receives BAR_REPLY from A and continues working on FOO
- B sends FOO_REPLY to A
- B waits to receive a message
- A receives FOO_REPLY from B
It looks so easy (no deadlocks, no deadlock detection, no documentation requirements for deciding which functions are synchronous and what level things are at) until you see the code - e.g. here's what you'd need for B:
void main(void) {
    message_t message;

    for(;;) {
        getMessage(&message);
        switch(message.type) {
            case FOO:
                handle_FOO();
                break;
            case BAR_REPLY:
                handle_BAR_REPLY();
                break;
        }
    }
}

void handle_FOO(void) {
    /* Do some processing for FOO */
    sendMessage(BAR);
    store_details_of_FOO_somewhere();
}

void handle_BAR_REPLY(void) {
    try_to_find_details_of_FOO();
    if(details_of_FOO_found) {
        /* Do some more processing for FOO */
        sendMessage(FOO_REPLY);
    }
}
As you can see, it's not as easy as synchronous IPC because the processing for FOO is split into 2 different parts, which can make following the code hard (especially for larger things). For synchronous IPC the IPC itself could be hidden within header files so that it looks like a normal function call, which makes it much easier to work with in comparison. For example:
void main(void) {
    message_t message;

    for(;;) {
        getMessage(&message);
        switch(message.type) {
            case FOO:
                handle_FOO();
                break;
        }
    }
}

void handle_FOO(void) {
    /* Do some processing for FOO */
    getResultFromBAR();
    /* Do some more processing for FOO */
    sendMessage(FOO_REPLY);
}
I tend to prefer asynchronous IPC because it can easily handle many requests without unnecessary delays. For example, if 4 different applications send "FOO" to the code above, the asynchronous version can work on the next "FOO" while it waits for "BAR_REPLY" (i.e. multiple requests can be in progress at the same time). For the synchronous version, all subsequent requests must wait for the current request to complete - i.e. if an application takes ages to send "BAR_REPLY" then the code above stalls, and all other applications that are waiting for it also stall.
To prevent these unnecessary delays in synchronous systems, programmers tend to use more threads (e.g. one thread per request, or a pool of worker threads with work queues), which complicates things again - you end up needing re-entrancy protection and possibly deadlock avoidance between threads belonging to the same process.
I guess it's a compromise - the complexities of inter-process deadlock avoidance (and intra-process deadlock avoidance where necessary); or the complexities of asynchronous IPC.
Cheers,
Brendan