Hi,
rdos wrote: Seems like BIOS in this system doesn't put the core to sleep in the proper manner. Perhaps they forget to disable interrupts?
IOW, it seems like it is necessary to boot all AP cores, and then put them to sleep in the proper manner until they are needed.
There isn't enough information here to reach a firm conclusion. It's like saying "I'm wet, therefore it must be raining" - the assumption might be correct, but then again you could be wet for a completely different reason.
Normally (as I understand it), CLI has no effect on whether or not the local APIC accepts an IRQ. CLI only affects if/when the local APIC delivers a received interrupt to the CPU. For example, with interrupts disabled the local APIC can receive 20 different IRQs (and set the corresponding flags in its "Interrupt Request Register") and then wait until the CPU enables interrupts again. When the CPU does enable interrupts again the local APIC delivers the previously received IRQs to the CPU in order (highest priority IRQ first).
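To make the accept-versus-deliver split concrete, here's a toy model in C. It is only a sketch: the names (toy_lapic, toy_accept, toy_deliver) are made up for illustration, and it ignores the ISR, the TPR and everything else a real local APIC tracks. It assumes that a higher vector number means higher priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy model only (hypothetical names): one pending bit per vector,
 * loosely like the IRR, plus a flag standing in for EFLAGS.IF. */
typedef struct {
    uint32_t irr[8];   /* 256 pending bits, one per vector */
    bool     cpu_if;   /* false after CLI, true after STI  */
} toy_lapic;

static void toy_init(toy_lapic *a)
{
    assert(a != NULL);
    memset(a, 0, sizeof *a);
}

/* The APIC accepts the interrupt no matter what the CPU's IF flag is. */
static void toy_accept(toy_lapic *a, uint8_t vector)
{
    assert(a != NULL);
    a->irr[vector / 32] |= UINT32_C(1) << (vector % 32);
}

/* Hands the highest pending vector to the CPU and clears its bit.
 * Returns -1 if interrupts are disabled or nothing is pending. */
static int toy_deliver(toy_lapic *a)
{
    assert(a != NULL);
    if (!a->cpu_if)
        return -1;
    for (int v = 255; v >= 0; v--) {
        uint32_t bit = UINT32_C(1) << (v % 32);
        if (a->irr[v / 32] & bit) {
            a->irr[v / 32] &= ~bit;
            return v;
        }
    }
    return -1;
}
```

Accepting vectors 0x31, 0x80 and 0x42 while cpu_if is false leaves all three pending; once cpu_if is set, toy_deliver hands them over as 0x80, 0x42, 0x31.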
From this I'd assume that the BIOS probably does disable IRQs; and the problem is that the AP CPU's local APIC receives "lowest priority delivery" IRQs anyway. This is entirely possible if the AP CPU's "Logical Destination Register" is non-zero. I also have a suspicion that if the interrupt was sent with "destination = 0xFF" then it's treated as "all CPUs" regardless of whether it's "physical mode" or "logical mode", and regardless of the local APIC's "Logical Destination Register".
Note: This definitely is the case for x2APIC, but I'm not sure about xAPIC (including CPUs that support x2APIC but are running in "xAPIC mode").
I don't know if you program the IO APIC with "destination = 0xFF", but if you do, don't.
You also don't say if this behaviour occurs on a cold boot, or if it only occurs after a soft reset. For cold boot, the default value of the local APIC's logical destination register is zero, and in that case it shouldn't accept "lowest priority delivery" IRQs at all (regardless of whether the BIOS does "CLI" or not). Perhaps your OS boots and sets the logical destination register, then you reset the computer and the BIOS doesn't clear it, which leaves it non-zero and able to accept "lowest priority delivery" interrupts (again, regardless of whether the BIOS does "CLI" or not).
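As a rough sketch of why a zero logical destination register matters, here's how I understand logical destination matching in xAPIC mode with the flat model: the logical APIC ID is bits 31:24 of the LDR, and a message is accepted if that ID shares at least one set bit with the message's destination. The function names are hypothetical, and this deliberately doesn't cover the cluster model, physical destinations or x2APIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* xAPIC mode: the logical APIC ID lives in bits 31:24 of the LDR. */
static uint8_t ldr_logical_id(uint32_t ldr)
{
    return (uint8_t)(ldr >> 24);
}

/* Flat model only (hypothetical helper). The message is accepted if the
 * destination byte and the logical APIC ID have any set bit in common. */
static bool flat_logical_match(uint32_t ldr, uint32_t dfr, uint8_t dest)
{
    /* Precondition for this sketch: DFR bits 31:28 are all ones (flat model). */
    assert((dfr >> 28) == 0xF);
    return (ldr_logical_id(ldr) & dest) != 0;
}
```

With LDR = 0 this returns false for every destination, including 0xFF. With a stale LDR such as 0x02000000 left over from a previous boot, destinations 0xFF or 0x03 match, which is the soft-reset scenario described above.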
At a minimum, I'd be tempted to start the AP CPU and display the contents of all of its local APIC registers to see what the BIOS left in each of them. I guess you should also display the contents of the AP CPU's EFLAGS at startup too (but that shouldn't matter much).
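Something like the following could do the register dump. It's a minimal sketch in hosted C so it can be tried against a fake register block; lapic_dump and lapic_read are made-up names, and in a real kernel you'd pass your own mapping of the local APIC (often at physical address 0xFEE00000, but check the APIC base MSR) and swap snprintf for whatever print routine you have. The offsets are the usual xAPIC memory-mapped ones, and the list is not complete (no ISR/TMR/IRR, for example).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct lapic_reg { uint32_t offset; const char *name; };

/* A subset of the xAPIC register map; extend as needed. */
static const struct lapic_reg lapic_regs[] = {
    { 0x020, "ID" },        { 0x030, "Version" },   { 0x080, "TPR" },
    { 0x0D0, "LDR" },       { 0x0E0, "DFR" },       { 0x0F0, "SVR" },
    { 0x280, "ESR" },       { 0x320, "LVT Timer" }, { 0x350, "LVT LINT0" },
    { 0x360, "LVT LINT1" }, { 0x370, "LVT Error" },
};

/* Reads one 32-bit register; base must be at least 4-byte aligned. */
static uint32_t lapic_read(const volatile uint8_t *base, uint32_t offset)
{
    return *(const volatile uint32_t *)(base + offset);
}

/* Formats one line per register into out and returns the number of
 * characters written; stops early if the buffer runs out of room. */
static size_t lapic_dump(const volatile uint8_t *base, char *out, size_t cap)
{
    assert(base != NULL && out != NULL && cap > 0);
    size_t used = 0;
    out[0] = '\0';
    for (size_t i = 0; i < sizeof lapic_regs / sizeof lapic_regs[0]; i++) {
        int n = snprintf(out + used, cap - used, "%-10s (0x%03X) = 0x%08X\n",
                         lapic_regs[i].name,
                         (unsigned)lapic_regs[i].offset,
                         (unsigned)lapic_read(base, lapic_regs[i].offset));
        if (n < 0 || (size_t)n >= cap - used)
            break;
        used += (size_t)n;
    }
    return used;
}
```

Run this on the AP as early as possible after it starts, before your own code has touched the local APIC, so that what you see really is what the BIOS left behind.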
Cheers,
Brendan