Local APIC self interrupt

Posted: Wed Aug 14, 2013 9:44 am
by Congdm
Hi,

I have a question that I cannot find the answer to. When a CPU issues an IPI to itself, does the interrupt happen immediately, or can the CPU execute more instructions first? Does anyone know about this?

Thanks.

Re: Local APIC self interrupt

Posted: Wed Aug 14, 2013 10:50 am
by Owen
Well, of course it depends: are interrupts enabled, is the interrupt's priority lower than the CPU's current priority, is there a higher priority interrupt pending?

So, potentially the interrupt could be waiting indefinitely (like any other interrupt).
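The conditions above can be modelled roughly as follows. This is only a simplified sketch: the function name and parameters are made up for illustration, and it ignores NMI/SMI, interrupt shadows and similar corner cases. It assumes the usual local APIC rule that a fixed interrupt's priority class is vector bits 7:4 and that it is only dispatched when that class is higher than the processor priority class.

```c
#include <stdbool.h>
#include <stdint.h>

/* Priority class of a vector (or of the PPR value): bits 7:4. */
static inline unsigned priority_class(uint8_t v) { return v >> 4; }

/*
 * Hypothetical model: can a pending fixed interrupt be dispatched right now?
 *   if_flag         - EFLAGS.IF (maskable interrupts enabled)
 *   ppr             - processor priority register value
 *   vector          - the vector we are asking about
 *   highest_pending - highest vector currently pending in the IRR
 */
bool can_dispatch_now(bool if_flag, uint8_t ppr,
                      uint8_t vector, uint8_t highest_pending)
{
    if (!if_flag)
        return false;                 /* interrupts are disabled */
    if (priority_class(vector) <= priority_class(ppr))
        return false;                 /* masked by current priority */
    if (highest_pending > vector)
        return false;                 /* something more urgent goes first */
    return true;
}
```

Even when this returns true, it says nothing about how many instructions retire before the handler is actually entered.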

Let's suppose the interrupt isn't held up like that: the datasheets don't say anything about how quickly it is then delivered. What we do know is:
  • The first local APIC was an external chip and therefore ran asynchronously from the CPU
  • Modern local APICs remain somewhat asynchronous (i.e. they process interrupt requests independently from the CPU's instruction execution, and prod the CPU when an interrupt is pending)
  • Modern CPUs are superscalar and can have hundreds of instructions in flight at once. Even if the local APIC asserts the interrupt signal instantaneously, there could be a hundred more instructions already in flight which need to be drained from the pipeline first.
So, from that angle we can assume that the interrupt probably won't be triggered on the immediately following cycle, but we didn't need to go to that trouble.

I already mentioned that the Intel manuals say nothing. From that, we should do what one should always do with undefined behavior: be strict in what you produce and liberal in what you accept. That is: on your own side, follow the letter of the documentation as closely as possible (unless experiment determines that the hardware and documentation are not in agreement), and be as liberal as possible in what you accept from the hardware. Here, that means your code should work whether the self-interrupt arrives immediately or some instructions later.

But, most of all, consider why you're wanting to use self-interrupts at all. The only use of them I know in any existing OS is NT using them to "post" an event to be executed when the CPU's interrupt priority level drops.

Re: Local APIC self interrupt

Posted: Wed Aug 14, 2013 8:47 pm
by Congdm
Thanks for the info. I have read the Intel manual many times but cannot find anything.
Owen wrote: But, most of all, consider why you're wanting to use self-interrupts at all. The only use of them I know in any existing OS is NT using them to "post" an event to be executed when the CPU's interrupt priority level drops.
I use an IPI (broadcast including self) to make all CPUs jump to the garbage collector function when the heap runs out of memory. If the self-interrupt doesn't happen immediately, I will use broadcast excluding self with an additional INT instruction.
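For reference, the two broadcast variants differ only in the destination shorthand field of the interrupt command register. A minimal sketch of building the low 32 bits of the ICR for a fixed-delivery IPI might look like this; the names `icr_low_fixed` and `enum icr_shorthand` are invented for illustration, and actually writing the value to the local APIC (MMIO or MSR) is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Destination shorthand field, ICR bits 19:18. */
enum icr_shorthand {
    ICR_NO_SHORTHAND  = 0u << 18,
    ICR_SELF          = 1u << 18,
    ICR_ALL_INCL_SELF = 2u << 18,
    ICR_ALL_EXCL_SELF = 3u << 18
};

#define ICR_DELIVERY_FIXED (0u << 8)   /* delivery mode, bits 10:8 */
#define ICR_LEVEL_ASSERT   (1u << 14)  /* level bit, set for fixed IPIs */

/* Build the low 32 bits of the ICR for a fixed-delivery IPI. */
uint32_t icr_low_fixed(uint8_t vector, enum icr_shorthand sh)
{
    assert(vector >= 16);  /* vectors 0-15 are not valid for fixed IPIs */
    return (uint32_t)sh | ICR_LEVEL_ASSERT | ICR_DELIVERY_FIXED | vector;
}
```

With "all excluding self", the initiating CPU would then reach the handler through a software `INT` with the same vector. One thing to keep in mind with that approach: an entry via `INT` did not go through the local APIC, so the handler presumably should not send an EOI for it.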

Re: Local APIC self interrupt

Posted: Thu Aug 15, 2013 7:05 am
by Brendan
Hi,
Congdm wrote: I use an IPI (broadcast including self) to make all CPUs jump to the garbage collector function when the heap runs out of memory. If the self-interrupt doesn't happen immediately, I will use broadcast excluding self with an additional INT instruction.
One of the things I normally have is a generic "doParallel_allCPUs(functionPointer, data1, data2)" function. It acquires a lock, sets a "completed CPUs" variable to zero, stores the functionPointer, data1 and data2 in global variables and then sends a "broadcast to all but self". The interrupt handler used by other CPUs gets functionPointer, data1 and data2 from the global variables and calls it, then increments "completed CPUs" before returning to whatever they interrupted. The initiating CPU calls the function pointer directly, increments "completed CPUs", then waits until "completed CPUs == total CPUs" before releasing the lock.
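A rough sketch of that scheme in C11 could look like the following. It is not the actual code: `total_cpus`, `send_ipi_all_but_self` and `doParallel_ipi_handler` are hypothetical names, the lock is a plain spinlock, and the IPI is replaced by a simulation that calls the handler once per "other CPU" so the example can run in user space.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*parallel_fn)(void *data1, void *data2);

/* Hypothetical platform hooks; a real kernel would supply these. */
unsigned total_cpus = 1;
void (*send_ipi_all_but_self)(void) = NULL;  /* e.g. writes the ICR */

static atomic_flag parallel_lock = ATOMIC_FLAG_INIT;
static atomic_uint completed_cpus;
static parallel_fn shared_fn;
static void *shared_data1, *shared_data2;

/* Runs in the IPI handler on every CPU except the initiator. */
void doParallel_ipi_handler(void)
{
    shared_fn(shared_data1, shared_data2);
    atomic_fetch_add(&completed_cpus, 1);
    /* a real handler would also send EOI to the local APIC */
}

void doParallel_allCPUs(parallel_fn fn, void *data1, void *data2)
{
    assert(fn != NULL);

    /* Assumption: IPIs are still serviced while spinning here,
       otherwise two simultaneous initiators could deadlock. */
    while (atomic_flag_test_and_set(&parallel_lock))
        ;

    atomic_store(&completed_cpus, 0);
    shared_fn = fn;
    shared_data1 = data1;
    shared_data2 = data2;
    /* a real kernel must make these stores visible before the IPI */

    if (total_cpus > 1)
        send_ipi_all_but_self();

    fn(data1, data2);                      /* initiator calls it directly */
    atomic_fetch_add(&completed_cpus, 1);

    while (atomic_load(&completed_cpus) != total_cpus)
        ;                                  /* wait for everyone to finish */

    atomic_flag_clear(&parallel_lock);
}

/* Simulation only: pretend each other CPU took the IPI. */
void simulated_ipi_all_but_self(void)
{
    for (unsigned i = 1; i < total_cpus; i++)
        doParallel_ipi_handler();
}

/* Example payload: count how many "CPUs" ran the function. */
void count_call(void *data1, void *data2)
{
    (void)data2;
    ++*(int *)data1;
}
```

Setting `total_cpus = 4`, pointing `send_ipi_all_but_self` at `simulated_ipi_all_but_self` and calling `doParallel_allCPUs(count_call, &calls, NULL)` runs the payload once per simulated CPU.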

Once you've got something like that you can use it for many different things (including "multi-CPU TLB shootdown", your garbage collection, etc). You can also have different code for different scenarios - one for the single-CPU case, one for xAPIC, one for x2APIC, etc (e.g. so that in a single-CPU system "doParallel_allCPUs()" does almost nothing).

Using the same basic idea you can implement different versions for different purposes too - lower priority and higher priority versions (using the IRQ priority scheme), versions where all CPUs wait until "completed CPUs == total CPUs" before returning, versions that collect and return information in some way (e.g. return the sum of values from all CPUs), etc. Also (possibly with a little "logical destination" trickery to make it very efficient), you can assign attributes to CPUs and have all CPUs with that attribute do something (e.g. you can have a "first CPU in core" attribute and implement a "doParallel_firstCPUinCore()" function, or whatever else you can think of).
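One way the "logical destination" idea could work, assuming xAPIC in flat logical mode, is to treat the 8-bit logical APIC ID as a set of attribute bits. The sketch below only models the matching rule; the attribute names and bit assignments are purely illustrative. In x2APIC mode the logical ID is fixed by hardware, so this would not carry over directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical attribute bits; the assignment is purely illustrative. */
#define ATTR_ANY_CPU           0x01u
#define ATTR_FIRST_CPU_IN_CORE 0x02u

/* Value for the Logical Destination Register: logical ID in bits 31:24. */
uint32_t ldr_value(uint8_t attrs)
{
    return (uint32_t)attrs << 24;
}

/* Flat-model match: a CPU accepts the IPI if its logical ID shares
   at least one set bit with the message's destination field. */
bool logical_dest_matches(uint32_t ldr, uint8_t dest)
{
    return ((ldr >> 24) & dest) != 0;
}
```

A single IPI sent in logical destination mode with destination `ATTR_FIRST_CPU_IN_CORE` would then be accepted only by CPUs whose LDR was programmed with that bit.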


Cheers,

Brendan