Bochs spurious interrupt

8infy
Member
Posts: 185
Joined: Sun Apr 05, 2020 1:01 pm

Bochs spurious interrupt

Post by 8infy »

Hi guys, I currently have a small problem in Bochs:

I get one spurious IRQ in Bochs each time my OS starts and I have more than one thread in the scheduler queue.
The spurious IRQ happens on the first task switch and then never occurs again. I tried removing every line of code that sends an EOI, but that doesn't help.
How is that even possible? I thought spurious IRQs only happened when you acknowledged a non-existent interrupt.
(It only happens in Bochs; QEMU works fine.)

Here's a screenshot of the logs in Bochs (an interesting thing to note is that the main thread did a few cycles before switching, maybe that's related):
[Image: Bochs log]

And here's QEMU: no spurious IRQs here, but the main thread also didn't do a single cycle and got interrupted right after doing sti:
[Image: QEMU log]

Any ideas why this happens?
Thanks :)

Update:
I just tried adding some logging to the end_of_interrupt function to see if it gets called, and the spurious interrupt disappeared. WTF?
(It only disappears when I never send an EOI anywhere in my code; if I do, I still get a spurious IRQ.)

Code:

void PIC::end_of_interrupt(u8 request_number, bool spurious)
{
    if (request_number >= 8 && !spurious)
        IO::out8<slave_command>(end_of_interrupt_code);
    // Adding any code on this line removes the spurious IRQ, even though
    // nothing is ever logged and this function is never actually called.
    log() << "here";
    IO::out8<master_command>(end_of_interrupt_code);
}
nexos
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Bochs spurious interrupt

Post by nexos »

It is very strange for a spurious interrupt to occur in a VM, because they generally indicate hardware faults, and those can't happen in a VM. Have you disabled the interrupt or changed its priority before EOI was sent? You will probably have to debug this and find out which line is causing the interrupt. The reason it doesn't happen when you don't send EOI is that the PIC won't deliver another interrupt of the same or lower priority until the current one has been acknowledged with an EOI.

Re: Bochs spurious interrupt

Post by 8infy »

Yes, I remapped the PIC, but I've been doing that for a while and never encountered spurious interrupts.

Here's the full PIC/PIT log from Bochs; I don't see anything useful in it...
[Image: Bochs PIC/PIT log]

Re: Bochs spurious interrupt

Post by 8infy »

OK, after some more digging I figured out how to reproduce/fix this:

In my kernel entry point, if too much time is spent with interrupts disabled, I get a spurious interrupt right after I do sti. This can be trivially reproduced by changing one constant:

Code:

void run(MemoryMap memory_map)
{
    runtime::ensure_loaded_correctly();

    HeapAllocator::initialize();

    MemoryManager::inititalize(memory_map);

    runtime::init_global_objects();

    PageDirectory::inititalize();

    InterruptDisabler::increment();

    GDT::the().create_basic_descriptors();
    GDT::the().install();
    new PIT;
    ISR::install();
    IRQManager::the().install();
    SyscallDispatcher::initialize();
    IDT::the().install();

    Scheduler::inititalize();

    // TESTING AREA
    // ----------------------------------------- //
    Process::create_supervisor(dummy_kernel_process);

    auto page = MemoryManager::the().allocate_page();
    PageDirectory::of_kernel().map_page(0x0000F000, page->address(), false);

    // Changing 1028 to a bigger number here leads to a spurious IRQ when interrupts are enabled again.             <---------------------------------------
    // Can also be reproduced by adding a for loop somewhere here, which does something for a while.
    copy_memory(reinterpret_cast<void*>(userland_process), reinterpret_cast<void*>(0x0000F000), 1028);

    Process::create(0x0000F000);
    // ----------------------------------------- //

    InterruptDisabler::decrement();

    static auto           cycles = 0;
    static constexpr u8   color  = 0x4;
    static constexpr auto row    = 3;

    for (;;) {
        auto offset = vga_log("Main process: working... [", row, 0, color);

        static constexpr size_t number_length = 21;
        char                    number[number_length];

        if (to_string(++cycles, number, number_length))
            offset = vga_log(number, row, offset, color);

        vga_log("]", row, offset, color);
    }
}
Any idea why this happens, and whether it is expected?
nullplan
Member
Posts: 1791
Joined: Wed Aug 30, 2017 8:24 am

Re: Bochs spurious interrupt

Post by nullplan »

8infy wrote:In my kernel entry point, if too much time is spent with interrupts disabled, I get a spurious interrupt right after I do sti. This can be trivially reproduced by changing one constant:
Well, then it is obvious.

A spurious interrupt occurs when an interrupt source signals an interrupt to the PIC but withdraws that signal before the CPU acknowledges it and asks for the interrupt number. Therefore, if you disable interrupts for too long, something will trigger (probably a timer interrupt), and it might have moved on by the time you STI. I seem to remember that the PIT can be made to keep accurate time without waiting for the CPU to acknowledge the interrupt. Or maybe it is some other hardware. Anyway, by the time you STI, the PIC no longer knows the interrupt source, but it has to give an interrupt number, and therefore gives IRQ 7.

I'm just impressed the Bochs guys programmed this behavior into their VM.

You know the interrupt source in question is going to be one of the first eight lines (IRQ 0 to 7, on the master PIC), so I'm guessing it's the timer.
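
For reference, the usual way to tell such an interrupt apart from a genuine IRQ 7 is to look at the PIC's in-service register (ISR): a real IRQ 7 has bit 7 set there, a spurious one does not. Here is a minimal sketch of that check. `FakePic` and `is_spurious` are made-up names for illustration, and the struct only stands in for the hardware so the snippet runs outside a kernel; on real hardware the read would be done by writing OCW3 (0x0B) to the PIC command port and reading the same port back.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two 8259 PICs so the sketch runs anywhere.
// In a kernel, read_isr() would write 0x0B (OCW3, "read ISR") to the command
// port (0x20 for the master, 0xA0 for the slave) and read that port back.
struct FakePic {
    std::uint8_t master_isr = 0; // in-service bits for IRQ 0-7
    std::uint8_t slave_isr  = 0; // in-service bits for IRQ 8-15

    std::uint8_t read_isr(bool slave) const { return slave ? slave_isr : master_isr; }
};

// Only the lowest-priority line of each PIC (IRQ 7 and IRQ 15) can be
// reported spuriously. A genuine one has its in-service bit (bit 7) set.
inline bool is_spurious(const FakePic& pic, std::uint8_t irq)
{
    assert(irq < 16);
    if (irq != 7 && irq != 15)
        return false;
    const bool slave = (irq == 15);
    return (pic.read_isr(slave) & 0x80) == 0;
}
```

An IRQ 7 handler would call something like `is_spurious(pic, 7)` first and return early when it yields true.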

Re: Bochs spurious interrupt

Post by 8infy »

That's kinda crazy if true. Can anyone confirm that that's a thing in Bochs?

But yeah, looks like you're right. Here I have literally one thread in the scheduler queue, so no switching happens.
[Image]

Literally the same executable, but I changed the for loop so it logs 10 times instead of 10000:
[Image]
Octocontrabass
Member
Posts: 5574
Joined: Mon Mar 25, 2013 7:01 pm

Re: Bochs spurious interrupt

Post by Octocontrabass »

8infy wrote:Can anyone confirm that that's a thing in Bochs?
It is indeed a thing in Bochs.

Note the "IRQ line 0 now low" message that occurs shortly before the spurious IRQ. If you've left interrupts disabled long enough for the IRQ 0 line to go high and then low again, you'll receive a spurious IRQ.
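
The sequence described here can be played through with a toy model. This is only a sketch of the behaviour discussed in this thread, not Bochs's actual implementation and not a complete 8259 (no mask register, no in-service register, no cascade); `ToyPic` and its members are invented for illustration. The key assumption is that lowering a line clears its request bit but does not withdraw the interrupt already signalled to the CPU, so the later acknowledge finds nothing to report and falls back to IRQ 7.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of one PIC: request bits plus the INT output towards the CPU.
struct ToyPic {
    std::uint8_t irr = 0;      // lines currently requesting service
    bool int_asserted = false; // interrupt signalled to the CPU, not yet acknowledged

    void raise(unsigned irq)
    {
        assert(irq < 8);
        irr = static_cast<std::uint8_t>(irr | (1u << irq));
        int_asserted = true;
    }

    // The source withdraws its request. Assumption of this model: the signal
    // already sent to the CPU is NOT withdrawn.
    void lower(unsigned irq)
    {
        assert(irq < 8);
        irr = static_cast<std::uint8_t>(irr & ~(1u << irq));
    }

    // The CPU acknowledges (interrupts are enabled again).
    // Returns the IRQ number delivered, or -1 if nothing was signalled.
    int acknowledge()
    {
        if (!int_asserted)
            return -1;
        int_asserted = false;
        for (unsigned irq = 0; irq < 8; ++irq) {
            if (irr & (1u << irq)) {
                irr = static_cast<std::uint8_t>(irr & ~(1u << irq));
                int_asserted = (irr != 0);
                return static_cast<int>(irq);
            }
        }
        return 7; // request vanished before the acknowledge: spurious IRQ 7
    }
};
```

Calling `raise(0)` followed directly by `acknowledge()` delivers IRQ 0, while `raise(0)`, `lower(0)`, `acknowledge()` (the timer line going high and low again while interrupts are disabled) delivers 7.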

Re: Bochs spurious interrupt

Post by 8infy »

Cool, thanks. Does real hardware do this as well? What should I do to avoid this if anything?

Re: Bochs spurious interrupt

Post by Octocontrabass »

8infy wrote:Does real hardware do this as well?
Yes, at least some real hardware does. It's not useful behavior, so you may find some hardware that doesn't produce spurious IRQs at all.
8infy wrote:What should I do to avoid this if anything?
You don't have to do anything if you don't mind handling the occasional spurious IRQ.
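
As a rough sketch of what "handling" means for the EOI side: the common convention is to send no EOI at all for a spurious IRQ 7, and for a spurious IRQ 15 to send an EOI to the master only, because the master did see a real request on its cascade line. `EoiPlan` and `plan_eoi` are hypothetical names; the function only decides which PICs should get an EOI, and the port writes themselves are left out.

```cpp
#include <cassert>
#include <cstdint>

// Which PICs should receive an end-of-interrupt command.
struct EoiPlan {
    bool to_slave;
    bool to_master;
};

inline EoiPlan plan_eoi(std::uint8_t irq, bool spurious)
{
    assert(irq < 16);
    const bool from_slave = (irq >= 8);

    if (!spurious)
        return { from_slave, true }; // normal case: slave (if involved), then master

    // Spurious IRQ 7:  the master has nothing in service, so no EOI at all.
    // Spurious IRQ 15: the slave has nothing in service, but the master has
    //                  its cascade line in service and still needs an EOI.
    return { false, from_slave };
}
```

Under this plan `plan_eoi(7, true)` sends nothing, while `plan_eoi(15, true)` sends an EOI to the master only.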

The only way to completely avoid it is to ensure the timer can never bring its output low before the CPU has accepted the interrupt, either by carefully masking the IRQ in the PIC at the correct times, or by using the PIT in mode 0 instead of mode 3. (You may still receive spurious IRQs from other sources!)

You can significantly reduce the odds of the PIT generating a spurious IRQ by using mode 2 instead of mode 3.
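
For illustration, switching modes only changes the command byte written to the PIT's mode/command port (0x43) before the divisor is loaded into channel 0 (port 0x40). The sketch below just computes those values; `PitMode`, `pit_command` and `pit_divisor` are made-up names, and the port writes are omitted so it runs anywhere. As far as I understand it, mode 2 helps because the output is low for only one input clock per period, whereas in mode 3 it is low for half of each period.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t pit_input_hz = 1193182; // nominal PIT input clock in Hz

enum class PitMode : std::uint8_t {
    InterruptOnTerminalCount = 0, // mode 0
    RateGenerator            = 2, // mode 2
    SquareWave               = 3, // mode 3
};

// Command byte layout: bits 7-6 channel, bits 5-4 access mode
// (3 = low byte then high byte), bits 3-1 operating mode, bit 0 = 0 (binary).
constexpr std::uint8_t pit_command(std::uint8_t channel, PitMode mode)
{
    return static_cast<std::uint8_t>((channel << 6) | (3u << 4)
                                     | (static_cast<unsigned>(mode) << 1));
}

// 16-bit reload value for a requested interrupt frequency.
inline std::uint16_t pit_divisor(std::uint32_t hz)
{
    assert(hz >= 19 && hz <= pit_input_hz); // the quotient must fit in 16 bits
    return static_cast<std::uint16_t>(pit_input_hz / hz);
}

static_assert(pit_command(0, PitMode::SquareWave) == 0x36, "mode 3 on channel 0");
static_assert(pit_command(0, PitMode::RateGenerator) == 0x34, "mode 2 on channel 0");
```

A kernel would write `pit_command(0, PitMode::RateGenerator)` to port 0x43 and then the low and high bytes of `pit_divisor(hz)` to port 0x40.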

Re: Bochs spurious interrupt

Post by 8infy »

Thanks! :D