Hi,
Congdm wrote: I use IPI (broadcast including self) to make all CPUs jump to the garbage collector function when the heap run out of memory. If IPI self-interrupt doesn't happen immediately, I will use broadcast excluding self with an additional INT instruction.
One of the things I normally have is a generic "doParallel_allCPUs(functionPointer, data1, data2)" function. It acquires a lock, sets a "completed CPUs" variable to zero, stores functionPointer, data1 and data2 in global variables, and then sends a "broadcast to all but self" IPI. The interrupt handler on each of the other CPUs reads functionPointer, data1 and data2 from the global variables, calls the function with that data, then atomically increments "completed CPUs" before returning to whatever it interrupted. The initiating CPU calls the function directly, increments "completed CPUs", then waits until "completed CPUs == total CPUs" before releasing the lock.
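As a rough illustration of those steps, here is a minimal sketch in C11. It is not real kernel code: TOTAL_CPUS, parallel_ipi_handler(), send_ipi_all_excluding_self() and add_data() are made-up names, and the "IPI" is simulated by calling the handler once per other CPU on the calling thread, so it only shows the bookkeeping (lock, globals, "completed CPUs" counter).

```c
#include <stdatomic.h>
#include <stdint.h>

#define TOTAL_CPUS 4u   /* assumption: a real kernel would detect this at boot */

typedef void (*parallel_fn)(uintptr_t data1, uintptr_t data2);

static atomic_flag parallel_lock = ATOMIC_FLAG_INIT;
static atomic_uint completed_cpus;
static parallel_fn pending_fn;
static uintptr_t pending_data1, pending_data2;

/* What each of the other CPUs would run when the IPI arrives.
 * A real handler would also send an EOI to its local APIC. */
static void parallel_ipi_handler(void)
{
    pending_fn(pending_data1, pending_data2);
    atomic_fetch_add(&completed_cpus, 1);
}

/* Hypothetical stand-in for the "broadcast to all but self" IPI.
 * Here it just runs the handler once for each other CPU. */
static void send_ipi_all_excluding_self(void)
{
    for (unsigned cpu = 1; cpu < TOTAL_CPUS; cpu++)
        parallel_ipi_handler();
}

void doParallel_allCPUs(parallel_fn fn, uintptr_t data1, uintptr_t data2)
{
    /* Acquire the lock. On real hardware a CPU spinning here must still be
     * able to take the IPI from the current lock holder, or it deadlocks. */
    while (atomic_flag_test_and_set(&parallel_lock)) { /* spin */ }

    atomic_store(&completed_cpus, 0);
    pending_fn = fn;
    pending_data1 = data1;
    pending_data2 = data2;

    send_ipi_all_excluding_self();

    /* The initiating CPU does its own share directly. */
    fn(data1, data2);
    atomic_fetch_add(&completed_cpus, 1);

    /* Wait until every CPU has finished before releasing the lock. */
    while (atomic_load(&completed_cpus) != TOTAL_CPUS) { /* spin */ }

    atomic_flag_clear(&parallel_lock);
}

/* Example work item (hypothetical): every CPU adds data1 + data2 to a total. */
static atomic_uintptr_t demo_total;

static void add_data(uintptr_t data1, uintptr_t data2)
{
    atomic_fetch_add(&demo_total, data1 + data2);
}
```

With this in place, something like "doParallel_allCPUs(add_data, 1, 2)" runs add_data() once per CPU, and a garbage collector entry point or a TLB invalidation routine could be passed in the same way.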
Once you've got something like that you can use it for many different things (including "multi-CPU TLB shootdown", your garbage collection, etc). You can also have different code for different scenarios - one for the single-CPU case, one for xAPIC, one for x2APIC, etc (e.g. so that in a single-CPU system "doParallel_allCPUs()" does almost nothing).
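One possible way to arrange the "different code for different scenarios" part is to pick an implementation once during boot and call it through a function pointer. The sketch below assumes that approach; doParallel_init(), the enum and the three backend names are hypothetical, and the xAPIC/x2APIC backends are placeholders that only run the function locally instead of touching the APIC.

```c
#include <stdint.h>

typedef void (*parallel_fn)(uintptr_t data1, uintptr_t data2);
typedef void (*do_parallel_impl_fn)(parallel_fn fn, uintptr_t data1, uintptr_t data2);

enum apic_mode { APIC_MODE_XAPIC, APIC_MODE_X2APIC };

/* Single-CPU case: nobody else to interrupt, so just call the function. */
static void doParallel_singleCPU(parallel_fn fn, uintptr_t data1, uintptr_t data2)
{
    fn(data1, data2);
}

/* Placeholder: a real version would send the IPI by writing the
 * memory-mapped interrupt command register, then wait for the other CPUs. */
static void doParallel_xAPIC(parallel_fn fn, uintptr_t data1, uintptr_t data2)
{
    fn(data1, data2);
}

/* Placeholder: a real version would send the IPI with a WRMSR to the
 * x2APIC interrupt command register, then wait for the other CPUs. */
static void doParallel_x2APIC(parallel_fn fn, uintptr_t data1, uintptr_t data2)
{
    fn(data1, data2);
}

/* Defaults to the single-CPU version until the boot code says otherwise. */
static do_parallel_impl_fn doParallel_impl = doParallel_singleCPU;

/* Called once during boot, after CPU and APIC detection. */
void doParallel_init(unsigned cpu_count, enum apic_mode mode)
{
    if (cpu_count <= 1)
        doParallel_impl = doParallel_singleCPU;
    else if (mode == APIC_MODE_X2APIC)
        doParallel_impl = doParallel_x2APIC;
    else
        doParallel_impl = doParallel_xAPIC;
}

/* Callers never need to know which backend was selected. */
void doParallel_allCPUs(parallel_fn fn, uintptr_t data1, uintptr_t data2)
{
    doParallel_impl(fn, data1, data2);
}
```

The point of the indirection is that callers such as a TLB shootdown routine stay identical on every machine, while the single-CPU path costs little more than one indirect call.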
Using the same basic idea you can implement different versions for different purposes too - lower priority and higher priority versions (using the IRQ priority scheme), versions where all CPUs wait until "completed CPUs == total CPUs" before returning, versions that collect and return information in some way (e.g. return the sum of values from all CPUs), etc. Also (possibly with a little "logical destination" trickery to make it very efficient), you can assign attributes to CPUs and have all CPUs with that attribute do something (e.g. you can have a "first CPU in core" attribute and implement a "doParallel_firstCPUinCore()" function, or whatever else you can think of).
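For example, a version that returns the sum of values from all CPUs might look roughly like the sketch below. The same caveats apply: doParallel_sumAllCPUs(), sum_ipi_handler(), TOTAL_CPUS and scaled_cpu_id() are made-up names, the CPU number is passed around explicitly, and the IPI is simulated by calling the handler for each other CPU on the calling thread.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TOTAL_CPUS 4u   /* assumption: fixed CPU count for the simulation */

typedef uint64_t (*parallel_value_fn)(unsigned cpu, uintptr_t data);

static atomic_flag sum_lock = ATOMIC_FLAG_INIT;
static atomic_uint completed_cpus;
static _Atomic uint64_t running_sum;
static parallel_value_fn pending_fn;
static uintptr_t pending_data;

/* What each of the other CPUs would run: add its value to the shared sum,
 * then report completion. */
static void sum_ipi_handler(unsigned cpu)
{
    atomic_fetch_add(&running_sum, pending_fn(cpu, pending_data));
    atomic_fetch_add(&completed_cpus, 1);
}

/* Hypothetical stand-in for the "broadcast to all but self" IPI. */
static void send_ipi_all_excluding_self(unsigned self)
{
    for (unsigned cpu = 0; cpu < TOTAL_CPUS; cpu++)
        if (cpu != self)
            sum_ipi_handler(cpu);
}

uint64_t doParallel_sumAllCPUs(unsigned self, parallel_value_fn fn, uintptr_t data)
{
    while (atomic_flag_test_and_set(&sum_lock)) { /* spin */ }

    atomic_store(&completed_cpus, 0);
    atomic_store(&running_sum, 0);
    pending_fn = fn;
    pending_data = data;

    send_ipi_all_excluding_self(self);

    /* The initiating CPU contributes its own value directly. */
    atomic_fetch_add(&running_sum, fn(self, data));
    atomic_fetch_add(&completed_cpus, 1);

    while (atomic_load(&completed_cpus) != TOTAL_CPUS) { /* spin */ }

    /* Read the result before releasing the lock, so the next caller
     * cannot reset the sum first. */
    uint64_t result = atomic_load(&running_sum);
    atomic_flag_clear(&sum_lock);
    return result;
}

/* Example work item (hypothetical): each CPU reports its number times data. */
static uint64_t scaled_cpu_id(unsigned cpu, uintptr_t data)
{
    return (uint64_t)cpu * data;
}
```

In a real kernel the per-CPU value would more likely be something like a count of free pages in a per-CPU cache; the result has to be read before the lock is released for the reason given in the comment.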
Cheers,
Brendan