There's a fairly good writeup of some of the issues NMIs can cause here. There's also a writeup of how Linux decided to handle nested NMIs. Some other potential issues seem to have come up around handling NMIs safely on the x86-64 architecture:
- SYSCALL/SYSRET can leave the kernel running on an unverified stack. If an NMI (possibly an MCE too?) arrives before the syscall handler can swap in a valid stack, it causes problems, which requires the use of an IST for NMIs and MCEs.
- SMIs can trigger during NMIs and possibly issue an IRET, disabling the processor's internal NMI masking logic.
- MCEs might also be able to trigger in a fashion similar to the SMI and disable the NMI masking logic.
- With the processor's internal NMI masking disabled, you could get an NMI-SMI-NMI sequence early enough in the handler that the return values for the initial NMI are overwritten before they can be reliably saved or mirrored elsewhere.
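To make the IST point in the first bullet concrete, here is a minimal sketch of how the NMI and MCE vectors could be given their own IST slots when building IDT gate descriptors. The descriptor layout and the vector numbers (2 for NMI, 18 for #MC) follow the architecture manuals; `make_gate`, `KERNEL_CS`, `IST_NMI` and `IST_MCE` are hypothetical names and values chosen for illustration, not taken from any particular kernel.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 IDT gate descriptor: 16 bytes per entry. */
struct idt_gate {
    uint16_t offset_low;   /* handler address bits 0..15 */
    uint16_t selector;     /* code segment selector for the handler */
    uint8_t  ist;          /* bits 0..2: IST index, 0 = do not use the IST */
    uint8_t  type_attr;    /* present bit, DPL, gate type */
    uint16_t offset_mid;   /* handler address bits 16..31 */
    uint32_t offset_high;  /* handler address bits 32..63 */
    uint32_t reserved;     /* must be zero */
};

static_assert(sizeof(struct idt_gate) == 16, "IDT gate must be 16 bytes");

#define GATE_INTERRUPT_DPL0 0x8E  /* present, DPL 0, 64-bit interrupt gate */
#define NMI_VECTOR 2              /* architectural NMI vector */
#define MCE_VECTOR 18             /* architectural machine-check vector */

/* Hypothetical values: a real kernel picks its own selector and IST slots. */
#define KERNEL_CS 0x08
#define IST_NMI   1
#define IST_MCE   2

/* Build a gate whose handler always starts on the stack named by ist_index,
 * regardless of what RSP held when the event arrived. */
static struct idt_gate make_gate(uint64_t handler, uint16_t selector,
                                 uint8_t ist_index)
{
    struct idt_gate g;
    g.offset_low  = (uint16_t)(handler & 0xFFFF);
    g.selector    = selector;
    g.ist         = (uint8_t)(ist_index & 0x7);  /* only 3 bits are defined */
    g.type_attr   = GATE_INTERRUPT_DPL0;
    g.offset_mid  = (uint16_t)((handler >> 16) & 0xFFFF);
    g.offset_high = (uint32_t)(handler >> 32);
    g.reserved    = 0;
    return g;
}
```

A kernel would then do something like `idt[NMI_VECTOR] = make_gate(nmi_entry, KERNEL_CS, IST_NMI);` and point the matching IST slot in the TSS at a dedicated stack. Note that this only addresses the bad-stack problem: every delivery through the same IST slot starts at the same stack top, so a nested NMI lands on top of the first NMI's frame, which is exactly the overwrite described in the last bullet.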
Overall, I get the feeling this raises the question: does it come down to a design decision? Likewise, is there some other chipset logic that guarantees a grace period in which NMI nesting can be handled in a sane fashion?
The FreeBSD handler is here.
I know this issue has come up a few times lately, but it seems like a big gray area, where absolute fault tolerance might indicate one pattern of NMI handling while performance considerations would dictate another.