spurious interrupts on Bochs

giszo
Member
Posts: 124
Joined: Tue Nov 06, 2007 2:37 pm
Location: Hungary

spurious interrupts on Bochs

Post by giszo »

As I had a little free time, I tried to continue one of my old projects by compiling the sources and trying the image on the latest version of Bochs (2.6.2), and I found a strange problem. Right after my kernel starts executing a few (5) userspace threads that simply loop and print a number to the screen, I start getting periodic spurious interrupts on IRQ line 7.

After going through my sources a dozen times I still have no idea what could cause this problem or where I should start debugging it. I would appreciate a little help tracking it down.
xenos
Member
Posts: 1121
Joined: Thu Aug 11, 2005 11:00 pm
Libera.chat IRC: xenos1984
Location: Tartu, Estonia

Re: spurious interrupts on Bochs

Post by xenos »

Have you checked the Bochs log? If you enable debug logging for interrupts you should see every single interrupt there.
Programmers' Hardware Database // GitHub user: xenos1984; OS project: NOS

Re: spurious interrupts on Bochs

Post by giszo »

I will try it out and get back with the results. Thank you for the suggestion.

Re: spurious interrupts on Bochs

Post by giszo »

After enabling PIC logging in Bochs I get a lot of messages, but none of them seems helpful to me.

With some further debugging it seems that the spurious IRQ is generated right after unmasking IRQ #1 (keyboard).

Re: spurious interrupts on Bochs

Post by xenos »

Maybe you could post the messages you get; perhaps there is at least some hint of what might be going wrong. It would also be good to see the messages you get during PIC setup.
jnc100
Member
Posts: 775
Joined: Mon Apr 09, 2007 12:10 pm
Location: London, UK

Re: spurious interrupts on Bochs

Post by jnc100 »

One possible cause is acknowledging an interrupt too soon, for example sending EOI before reading the byte from the keyboard. There is a reasonable summary here.

Regards,
John.

Re: spurious interrupts on Bochs

Post by giszo »

@XenOS - I have attached the full Bochs log.

PIC initialization starts at the following point in the log:

Code:

00022520249d[PIC  ] IO write to 00a1 = ff
00022520249d[PIC  ] setting slave pic IMR to ff
00022520250d[PIC  ] IO write to 0021 = ff
00022520250d[PIC  ] setting master pic IMR to ff
@jnc100 - My interrupt handling code sends the EOI to the PIC before doing any device-specific servicing of the fired IRQ. I used a similar IRQ handling method in one of my previous projects without any problems.

As I am experimenting with a microkernel now, the basic sequence of my IRQ handling mechanism is the following:

Code:

- IRQ fires
- mask the fired IRQ on PIC
- send EOI
- deliver the event to the registered server process
- return from the interrupt
...
- the event handler server gets executed
- handling of the IRQ is done here
- the IRQ line on the PIC is unmasked
I do not have much experience with microkernel design, so there is probably some kind of error in my IRQ handling mechanism; feel free to point it out. :D
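
For readers following along, the sequence listed above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the poster's code: `irq_entry`, `record_notify` and the simulated `inb`/`outb` port layer are hypothetical names, and the port space is simulated so the sketch can run outside a kernel. The port numbers and the EOI command are the standard 8259A values on the PC.

```c
#include <stdint.h>

/* Standard 8259A I/O ports on the PC. */
#define PIC1_CMD  0x20
#define PIC1_DATA 0x21
#define PIC2_CMD  0xA0
#define PIC2_DATA 0xA1
#define PIC_EOI   0x20   /* OCW2: non-specific EOI */

/* Assumption: simulated port space so the sketch runs hosted.
 * A kernel would use real in/out instructions here. */
uint8_t sim_ports[0x100];
void    outb(uint16_t port, uint8_t val) { sim_ports[port & 0xFF] = val; }
uint8_t inb(uint16_t port)               { return sim_ports[port & 0xFF]; }

/* Set the IMR bit for one IRQ line (0-15). */
void pic_mask(unsigned irq)
{
    uint16_t port = (irq < 8) ? PIC1_DATA : PIC2_DATA;
    outb(port, (uint8_t)(inb(port) | (1u << (irq & 7))));
}

/* Clear the IMR bit for one IRQ line (0-15). */
void pic_unmask(unsigned irq)
{
    uint16_t port = (irq < 8) ? PIC1_DATA : PIC2_DATA;
    outb(port, (uint8_t)(inb(port) & ~(1u << (irq & 7))));
}

/* IRQs 8-15 need an EOI on the slave and on the master (cascade). */
void pic_eoi(unsigned irq)
{
    if (irq >= 8)
        outb(PIC2_CMD, PIC_EOI);
    outb(PIC1_CMD, PIC_EOI);
}

/* Hypothetical stand-in for "deliver the event to the server process". */
unsigned last_notified = 0xFF;
void record_notify(unsigned irq) { last_notified = irq; }

/* Kernel-side half of the sequence: mask, EOI, then notify the server.
 * The server is expected to call pic_unmask() once it has handled the IRQ. */
void irq_entry(unsigned irq, void (*notify_server)(unsigned))
{
    pic_mask(irq);
    pic_eoi(irq);
    notify_server(irq);
}
```

In this arrangement the line stays masked between the EOI and the server's later `pic_unmask(irq)` call, which is the window the rest of the thread discusses.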
Attachments: bochs_log.txt (bochs log, 59.69 KiB)

Re: spurious interrupts on Bochs

Post by jnc100 »

You need to properly handle the interrupt (i.e. deal with the condition that is causing the IRQ line to be high) before sending EOI. This causes problems for microkernels, where the code to do this is often executed asynchronously in a separate process.

To get around this, some microkernels (e.g. QNX) execute the handler immediately on receiving an interrupt, using a special handler routine that runs in the process's address space; note that this essentially operates as a separate thread within the process with its own stack. The kernel then sends EOI only after all handlers attached to a specific IRQ have completed. Another way is to attach small stub IRQ handlers that run in kernel space, extract the required data from the device, queue up the processing by sending a message to a user process, and then send EOI; the actual processing (e.g. turning a scan code into a key press/release message) is performed in a separate process at a later time. Windows does something similar with something called a Deferred Procedure Call (although the actual handling can be done in the kernel as well as in a separate process, depending on the device). Both methods require the actual handler to be relatively quick and/or interruptible.
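
The second approach (a small in-kernel stub that drains the device, queues the data, then sends EOI) might look roughly like the sketch below. It is an illustration only: `keyboard_irq_stub`, the ring buffer and the `sim_*` hardware model are hypothetical, and the ports are simulated so the code runs hosted. Port 0x60 is the standard PC keyboard data port.

```c
#include <stdint.h>
#include <stddef.h>

#define KBD_DATA   0x60   /* keyboard controller data port */
#define PIC1_CMD   0x20
#define PIC_EOI    0x20
#define QUEUE_SIZE 16

/* Assumption: simulated hardware so the sketch runs hosted. */
uint8_t  sim_scancode;         /* byte waiting in the keyboard controller */
int      sim_data_read;        /* set once the data port has been read */
unsigned sim_eoi_count;        /* number of EOIs sent to the master PIC */
int      sim_eoi_before_read;  /* set if an EOI arrived before the read */

uint8_t inb(uint16_t port)
{
    if (port == KBD_DATA) {
        sim_data_read = 1;
        return sim_scancode;
    }
    return 0;
}

void outb(uint16_t port, uint8_t val)
{
    if (port == PIC1_CMD && val == PIC_EOI) {
        sim_eoi_count++;
        if (!sim_data_read)
            sim_eoi_before_read = 1;
    }
}

/* Fixed-size ring buffer holding scancodes for the user-space server. */
uint8_t queue[QUEUE_SIZE];
size_t  q_head, q_tail;

int queue_push(uint8_t b)
{
    size_t next = (q_head + 1) % QUEUE_SIZE;
    if (next == q_tail)
        return 0;              /* full: caller decides what to do */
    queue[q_head] = b;
    q_head = next;
    return 1;
}

int queue_pop(uint8_t *out)
{
    if (q_tail == q_head)
        return 0;              /* empty */
    *out = queue[q_tail];
    q_tail = (q_tail + 1) % QUEUE_SIZE;
    return 1;
}

/* In-kernel stub: read the device first, queue the byte, EOI last. */
void keyboard_irq_stub(void)
{
    uint8_t sc = inb(KBD_DATA);   /* removes the byte from the controller */
    (void)queue_push(sc);         /* this sketch drops the byte when full */
    outb(PIC1_CMD, PIC_EOI);
}
```

The point of the ordering is that the device condition is cleared before the PIC is told the interrupt is finished; the scancode translation can then happen whenever the server gets scheduled.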

As far as I remember, Minix chooses the slower method of dispatching an IRQ message to the appropriate process and waiting for an EOI message back before sending EOI, which requires a number of task switches during interrupt handling.

Regards,
John.

Re: spurious interrupts on Bochs

Post by giszo »

Thanks for the good explanation of this area.

In the meantime I changed my sources to deal with interrupts the way you described for Minix: EOI is sent after serving the fired IRQ. For interrupts handled inside the kernel (e.g. the timer for the scheduler), the EOI is sent to the PIC after the interrupt is served. In the other case, when a process is dedicated to serving the IRQ, EOI is sent with the help of a syscall once the process is done handling the IRQ. For now this (slow) way is more than fine for me. :)
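
A deferred EOI of this kind could be sketched as below. This is an assumption-laden illustration rather than the actual kernel code: `irq_entry_userspace`, `sys_irq_done` and the `eoi_pending` table are hypothetical names, and `outb` is simulated so the sketch runs hosted.

```c
#include <stdint.h>
#include <stdbool.h>

#define PIC1_CMD 0x20
#define PIC2_CMD 0xA0
#define PIC_EOI  0x20
#define NUM_IRQS 16

/* Assumption: simulated port writes that only count EOIs. */
unsigned sim_eoi_master, sim_eoi_slave;

void outb(uint16_t port, uint8_t val)
{
    if (val != PIC_EOI)
        return;
    if (port == PIC1_CMD)
        sim_eoi_master++;
    if (port == PIC2_CMD)
        sim_eoi_slave++;
}

/* One flag per IRQ line: an EOI is owed to the PIC for this line. */
bool eoi_pending[NUM_IRQS];

void pic_eoi(unsigned irq)
{
    if (irq >= 8)
        outb(PIC2_CMD, PIC_EOI);
    outb(PIC1_CMD, PIC_EOI);
}

/* Kernel IRQ entry for a line owned by a user-space server:
 * remember that an EOI is owed; the message to the server would be
 * sent here. No EOI is issued yet. */
void irq_entry_userspace(unsigned irq)
{
    if (irq >= NUM_IRQS)
        return;
    eoi_pending[irq] = true;
}

/* Hypothetical syscall the server invokes when it has finished.
 * Returns 0 on success, -1 if no EOI is outstanding for that line. */
int sys_irq_done(unsigned irq)
{
    if (irq >= NUM_IRQS || !eoi_pending[irq])
        return -1;
    eoi_pending[irq] = false;
    pic_eoi(irq);
    return 0;
}
```

Tracking the pending flag keeps a misbehaving server from sending an EOI that does not correspond to any interrupt in service.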

However, something is still not right, as I do not even have a chance to receive my first IRQ #1: right after I unmask it in the PIC I get a spurious interrupt. I doubt that is normal.

Re: spurious interrupts on Bochs

Post by jnc100 »

A single spurious IRQ is not usually a problem. It may have been caused by your bootloader. For a slightly simplified example, if the keyboard buffer is filled up, then the IRQ 1 line will go high, which will fill in the relevant bit in the IRR. If the bootloader has not unmasked this in the IMR and relies upon polling instead, then the bit will stay high in the IRR, despite the relevant data being removed from the keyboard buffer and IRQ 1 going low again. Once you unmask the IRQ 1 bit in the IMR and enable interrupts, the PIC will try to deliver this, but realise that the IRQ 1 line is no longer high and deliver a spurious interrupt instead.

Thus it's only really indicative of a problem if you have an ever-growing count of spurious interrupts. If you do, then you should examine the ISR and IRR in the spurious interrupt handler and see if there's a common pattern. NB if the ISR is empty then do not send an EOI from the spurious IRQ handler.
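
As a rough sketch of that check: the 8259A returns the IRR or ISR on a read of its command port after an OCW3 write (0x0A selects the IRR, 0x0B the ISR). Everything named `sim_*`, plus `pic_read_reg` and `check_spurious`, is hypothetical, and the PIC pair is simulated so the code runs hosted. The IRQ 15 branch follows the commonly documented rule that a spurious interrupt from the slave still needs an EOI on the master, because the master saw a real cascade interrupt.

```c
#include <stdint.h>
#include <stdbool.h>

#define PIC1_CMD     0x20
#define PIC2_CMD     0xA0
#define PIC_EOI      0x20
#define PIC_READ_IRR 0x0A   /* OCW3: next read returns the IRR */
#define PIC_READ_ISR 0x0B   /* OCW3: next read returns the ISR */

/* Assumption: a tiny simulated PIC pair so the sketch runs hosted. */
struct sim_pic {
    uint8_t  irr, isr;
    bool     read_isr;     /* which register a command-port read returns */
    unsigned eoi_count;
};
struct sim_pic sim_master, sim_slave;

static struct sim_pic *sim_for(uint16_t port)
{
    if (port == PIC1_CMD) return &sim_master;
    if (port == PIC2_CMD) return &sim_slave;
    return 0;
}

void outb(uint16_t port, uint8_t val)
{
    struct sim_pic *p = sim_for(port);
    if (!p) return;
    if (val == PIC_READ_IRR)      p->read_isr = false;
    else if (val == PIC_READ_ISR) p->read_isr = true;
    else if (val == PIC_EOI)      p->eoi_count++;
}

uint8_t inb(uint16_t port)
{
    struct sim_pic *p = sim_for(port);
    if (!p) return 0;
    return p->read_isr ? p->isr : p->irr;
}

/* Select a register with OCW3, then read it back from the command port. */
uint8_t pic_read_reg(uint16_t cmd_port, uint8_t ocw3)
{
    outb(cmd_port, ocw3);
    return inb(cmd_port);
}

/* Returns true if the interrupt on line 7 or 15 was spurious.
 * For a real interrupt the caller handles it and sends EOI as usual. */
bool check_spurious(unsigned irq)
{
    if (irq == 7)
        return !(pic_read_reg(PIC1_CMD, PIC_READ_ISR) & 0x80);
    if (irq == 15) {
        if (pic_read_reg(PIC2_CMD, PIC_READ_ISR) & 0x80)
            return false;
        outb(PIC1_CMD, PIC_EOI);   /* master still needs its EOI */
        return true;
    }
    return false;
}
```

Logging the ISR and IRR values each time `check_spurious` returns true is one way to look for the common pattern mentioned above.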

Regards,
John.

Re: spurious interrupts on Bochs

Post by giszo »

Actually I have a lot of spurious interrupts, which is why I started to worry and to think that something is wrong with my method of handling interrupts. It seems that I get about 2-3 spurious interrupts after each key press.

I know it is hard to give any help without knowing my sources deeply, but do you have any idea what could be still wrong?

Re: spurious interrupts on Bochs

Post by jnc100 »

What are the values of the PIC's IRR and ISR on receipt of a spurious IRQ? You can obtain these by querying the PIC(s).

Regards,
John.

Re: spurious interrupts on Bochs

Post by giszo »

First of all, I detect spurious interrupts by checking the relevant bit in the ISR. If it is not set, I treat the interrupt as spurious. I hope this method is correct.

Dumping ISR and IRR values at spurious interrupts gives the following two cases:

Code:

ISR=2 IRR=1
ISR=2 IRR=0

Re: spurious interrupts on Bochs

Post by jnc100 »

giszo wrote: First of all, I detect spurious interrupts by checking the relevant bit in the ISR. If it is not set, I treat the interrupt as spurious. I hope this method is correct.
Correct.
Dumping ISR and IRR values at spurious interrupts gives the following two cases:

Code:

ISR=2 IRR=1
ISR=2 IRR=0
This doesn't make much sense to me, as it implies that the spurious IRQ occurs whilst the PIC believes the keyboard interrupt is being serviced.

Could you post a new copy of the Bochs log (with the changes you've made to the codebase)? From looking at the Bochs sources, each spurious IRQ should have a line where the specific IRQ causing it is raised, i.e. for each spurious IRQ there should be at least one line of the form 'IRQ line x now high', and the line should also go low again prior to the spurious IRQ. If you enable the port e9 hack for Bochs you could also print a message in the log whenever a spurious IRQ is serviced.
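
A minimal sketch of printing such a message through the port e9 hack: with the hack enabled in the Bochs configuration, each byte written to I/O port 0xE9 is emitted as a character by the emulator. The helper names (`e9_puts`, `e9_put_hex8`, `log_spurious`) are hypothetical, and `outb` is simulated here with a capture buffer so the sketch runs hosted.

```c
#include <stdint.h>
#include <stddef.h>

#define BOCHS_E9_PORT 0xE9

/* Assumption: simulated port that captures output for inspection.
 * In a kernel, outb would be a real out instruction. */
char   sim_e9_out[128];
size_t sim_e9_len;

void outb(uint16_t port, uint8_t val)
{
    if (port == BOCHS_E9_PORT && sim_e9_len < sizeof sim_e9_out - 1)
        sim_e9_out[sim_e9_len++] = (char)val;
}

/* Write a NUL-terminated string one byte at a time. */
void e9_puts(const char *s)
{
    while (*s)
        outb(BOCHS_E9_PORT, (uint8_t)*s++);
}

/* Write one byte as two lowercase hex digits. */
void e9_put_hex8(uint8_t v)
{
    static const char digits[] = "0123456789abcdef";
    outb(BOCHS_E9_PORT, (uint8_t)digits[v >> 4]);
    outb(BOCHS_E9_PORT, (uint8_t)digits[v & 0x0F]);
}

/* One line per spurious IRQ, with the register values in hex. */
void log_spurious(uint8_t isr, uint8_t irr)
{
    e9_puts("spurious IRQ: ISR=");
    e9_put_hex8(isr);
    e9_puts(" IRR=");
    e9_put_hex8(irr);
    e9_puts("\n");
}
```

Keeping each message short and on a single line makes it easier to pick out among the emulator's own debug output.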

The only other things I could suggest are going over your code and being absolutely sure you do not read from the keyboard port after sending EOI but before re-enabling interrupts. Failing that, try replacing the current handler with a really simple one (in kernel, with no task switches involved) that simply reads the scancode.

Regards,
John.

Re: spurious interrupts on Bochs

Post by giszo »

jnc100 wrote: Could you post a new copy of the Bochs log (with the changes you've made to the codebase)? From looking at the Bochs sources, each spurious IRQ should have a line where the specific IRQ causing it is raised, i.e. for each spurious IRQ there should be at least one line of the form 'IRQ line x now high', and the line should also go low again prior to the spurious IRQ. If you enable the port e9 hack for Bochs you could also print a message in the log whenever a spurious IRQ is serviced.
I tried to forward my console output to the Bochs log with the e9 hack; however, I can't write into the log file atomically, so the flood of PIC debug messages is injected between the characters of my messages, making them totally unusable. :(
jnc100 wrote: The only other things I could suggest are going over your code and being absolutely sure you do not read from the keyboard port after sending EOI but before re-enabling interrupts. Failing that, try replacing the current handler with a really simple one (in kernel, with no task switches involved) that simply reads the scancode.
In the meantime I relocated my IRQ #1 handler into the kernel to eliminate userspace IRQ handling entirely, and the interesting thing is that I still get spurious interrupts. :o

The ISR and IRR values are now zero in every case.