General Protection Fault
Posted: Sun Mar 15, 2009 9:35 am
by mangaluve
I still have some problems with my IRQs. When I use the PIT for IRQ0, I sometimes get a General Protection Fault. It doesn't happen on every interrupt, only occasionally, and only when I use a fairly high frequency (about 2000 Hz). Any ideas? I could post some of my int/IRQ code if anyone has time to look through it. Everything seems weird.
Re: General Protection Fault
Posted: Sun Mar 15, 2009 11:25 am
by Combuster
It probably means you have a race condition or stack overflow somewhere.
Now for the copy-pasted questions:
What is the faulting instruction? What is the GPF error code? What do the registers contain? What does the stack contain?
Re: General Protection Fault
Posted: Sun Mar 15, 2009 5:38 pm
by mangaluve
Exactly what is a race condition?
I'm not very good at debugging this kind of stuff... The error code seems to be 0x13B. It's really weird: I do a printf in my IRQ ISR, and if I print the error codes, I get the exception mentioned before. However, if I just print a string, I don't get any exception. I'm fairly confident that my printf function is okay, since it has always worked fine before. I still get some strange behaviour, though. What does the error code mean? When I don't print any digits, just a string, I still get some weird spaces everywhere, almost as if Bochs "erases" some characters on the screen sometimes. I wish I could give a better description, but everything is just weird. The program continues to run, though; I just get weird printouts.
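On the question of what the error code means: for a General Protection Fault, the error code uses the selector error code layout from the Intel manuals (bit 0 = EXT, bit 1 = IDT, bit 2 = TI, bits 3-15 = table index). Below is a minimal sketch of a decoder. The names selector_error_t and decode_selector_error are made up for illustration and are not from the poster's kernel.

```c
#include <stdint.h>

/* Fields of a selector-style exception error code (as pushed for #GP).
 * The struct and function names here are hypothetical. */
typedef struct {
    unsigned ext;    /* bit 0: fault happened while delivering an external event */
    unsigned idt;    /* bit 1: the index refers to a gate in the IDT */
    unsigned ti;     /* bit 2: only meaningful when idt == 0 (0 = GDT, 1 = LDT) */
    unsigned index;  /* bits 3-15: index into the table selected above */
} selector_error_t;

/* Split a raw error code into its fields. */
selector_error_t decode_selector_error(uint32_t err_code)
{
    selector_error_t e;
    e.ext   = err_code & 1u;
    e.idt   = (err_code >> 1) & 1u;
    e.ti    = (err_code >> 2) & 1u;
    e.index = (err_code >> 3) & 0x1FFFu;
    return e;
}
```

Decoding 0x13B this way gives EXT = 1, IDT = 1 and index = 39, i.e. the fault would relate to IDT entry 39 while an external interrupt was being delivered. If the PICs are remapped to vectors 32-47 (an assumption, nothing in this post confirms it), vector 39 would be IRQ7, so it may be worth checking that the IDT has a valid gate for that vector.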
Re: General Protection Fault
Posted: Sun Mar 15, 2009 6:33 pm
by gzaloprgm
A race condition is an undesirable situation that occurs when a device or system attempts to perform two or more operations at the same time, but because of the nature of the device or system, the operations must be done in the proper sequence in order to be done correctly.
However, I think you are probably causing a stack overflow because you are forgetting to pop a register in your ISR.
Try checking that ESP remains the same (or does not increment) in each PIT interrupt.
Cheers,
Gzaloprgm
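To make the race-condition definition above concrete, here is a small hosted-C simulation, not kernel code. It assumes a putchar that reads a shared cursor, stores the character, and then advances the cursor, and it models the timer interrupt as a plain function call that lands between those steps. All names (racy_putchar, fake_timer_isr, and so on) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define SCREEN_CELLS 16

static char   screen[SCREEN_CELLS + 1]; /* stand-in for text-mode video memory */
static size_t cursor;                   /* shared between "main" and the "ISR" */

typedef void (*irq_hook_t)(void);

/* Non-atomic putchar. `hook`, if not NULL, simulates an interrupt that
 * arrives after the cursor has been read but before it is updated. */
static void racy_putchar(char c, irq_hook_t hook)
{
    size_t pos = cursor;        /* step 1: read the shared cursor */
    if (hook != NULL)
        hook();                 /* simulated IRQ fires here */
    if (pos < SCREEN_CELLS)
        screen[pos] = c;        /* step 2: write at the (now stale) position */
    cursor = pos + 1;           /* step 3: store a stale cursor value */
}

/* Simulated timer ISR that prints through the same routine. */
static void fake_timer_isr(void)
{
    racy_putchar('T', NULL);
}

static void reset_screen(void)
{
    memset(screen, 0, sizeof screen);
    cursor = 0;
}

/* The ISR runs between two complete putchar calls. */
const char *print_sequential(void)
{
    reset_screen();
    racy_putchar('a', NULL);
    fake_timer_isr();
    racy_putchar('b', NULL);
    return screen;
}

/* The ISR runs in the middle of a putchar call. */
const char *print_interrupted(void)
{
    reset_screen();
    racy_putchar('a', NULL);
    racy_putchar('b', fake_timer_isr);
    return screen;
}
```

In this model print_sequential() leaves "aTb" on the simulated screen, while print_interrupted() leaves "ab": the character written by the ISR is silently overwritten. Masking interrupts around the cursor update is one way to close that window.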
Re: General Protection Fault
Posted: Mon Mar 16, 2009 3:07 am
by mangaluve
Thanks, I'll try that as soon as I get home from work!
Re: General Protection Fault
Posted: Mon Mar 16, 2009 4:40 am
by xenos
Do you accidentally re-enable interrupts in your timer IRQ handler? I once had this problem because of some useless "STI", causing the timer IRQ handler to be called before it was finished at very high tick rates, which results in a stack overflow.
Re: General Protection Fault
Posted: Mon Mar 16, 2009 7:58 am
by mangaluve
I don't have any STI in my code. I've modified the code a bit (I don't really know what I changed) and I don't get any exception now, but it still behaves weirdly. I print ESP in my ISR and it seems to be okay, no overflow. But when I print something in "main" and then something in my ISR, everything gets weird. Could it be a synchronization problem with my printf? What if it gets interrupted?
Edit: I put cli/sti around my printf statement in my "main" and it worked better. But now I get the General Protection Fault again. I print "a" in my main loop and I print something like printf("Tick: %d\n", tick++) in my IRQ handler. Then I get something like:
a
Tick: 0
Tick: 1
aException 13
Tick: 2
Tick: 3
aaTick:4
Exception 13
and so on. But it's kinda weird. My IRQ handler looks like this:
Code:
if (interrupt_handlers[t.int_no] != 0) {
    isr_t handler = interrupt_handlers[t.int_no];
    handler(t);
}
else {
    printf("IRQ: %x %x unhandled\n", t.int_no, t.err_code);
    asm("hlt");
}
It seems to work. I only get the timer IRQ, so the else branch is never executed (because the CPU does not halt). However, if I remove the else branch, I get the exception after a few ticks. How can that be?! Something must be really wrong... I'm playing around, and adding a 'putchar' somewhere makes the exception occur all the time.
Re: General Protection Fault
Posted: Mon Mar 16, 2009 9:39 am
by Schol-R-LEA
Keep in mind that the HLT instruction only halts until the next external interrupt, so it isn't as if it would simply stop the processor; indeed, many (if not most) OSes have a HLT in their idle loop, rather than simply busy-waiting.
Still, if the else clause were running, you would also expect it to print the warning that the interrupt handler was missing. Hmmmn.... is your kernel printf() function buffered, or does it write directly to the text video page? I assume the latter, and kernel messages should take priority over anything else in any case, but if it is buffered then it may not be getting flushed for some reason, which would make the halt invisible - the next interrupt would cause the program to continue without either handling the interrupt or warning that it went unhandled.
Re: General Protection Fault
Posted: Mon Mar 16, 2009 9:48 am
by mangaluve
My printf writes directly to the video memory. I also have
at the end of my "main" function. It's all just really weird. Right now I've removed all printf calls except the one in my ISR, and I still get the exception.
Re: General Protection Fault
Posted: Mon Mar 16, 2009 9:58 am
by Schol-R-LEA
Hold on a moment, something just clicked in my head after I posted that first reply. If I am reading this right, you have all of the dispatch entries in the IDT routing to this single interrupt handler, which then does its own table dispatch off of interrupt_handlers[], correct? But you still have the interrupt service routines returning from the interrupt directly using an IRET, correct? If so, there's your stack overrun: you're creating an extra stack frame for the interrupt handler function which is then never cleared, and the IRET ends up jumping off into never-never land. It is similar to (if somewhat more elaborate than) the problem described here.
If I am right, then the solution is to change your isr_t functions to return to the dispatcher, which would then end with the IRET. Either that, or have the interrupt system patch the isr_t functions into the IDT directly, which would also avoid having two jump tables (one of which has the same value in every entry) and the extra overhead of a dispatch that is normally done in hardware.
Of course, I may be misunderstanding your design entirely.
Re: General Protection Fault
Posted: Mon Mar 16, 2009 10:09 am
by mangaluve
I don't really understand what you mean. I return from my interrupt function (the one in the interrupt_handlers[] vector). It's something like this (where my IDT points to asm_isr):
Code:
asm_isr:
    ; push some registers and stuff
    call isr_handler
    ; pop the registers and stuff
    iret

void isr_handler(registers_t t) {
    if (interrupt_handlers[t.int_no] != 0) {
        isr_t handler = interrupt_handlers[t.int_no];
        handler(t);
    }
    else {
        printf("IRQ: %x %x unhandled\n", t.int_no, t.err_code);
    }
}
So the only IRET I do is from my asm ISR. Isn't this correct?
Edit: I changed to the following code (without the table):
Code:
void irq_handler(registers_t t) {
    putchar('A');
    printf("IRQ: %x %x unhandled\n", t.int_no, t.err_code);
    PIC_sendEOI(0);
    putchar('B');
}
It does not work: I get an exception after a couple of IRQs. However, it seems to work if I remove the putchars.
Re: General Protection Fault
Posted: Mon Mar 16, 2009 10:17 am
by Schol-R-LEA
On a similar note... when you say that you don't re-set the interrupts, do you mean at all, anywhere in the interrupt handler? Does this mean that you aren't clearing them, either? That's a problem: you need to have the entire interrupt handler (not just the isr_t) blocking, with CLI as the first thing when it begins and STI as the last thing (or one of the last things) before the IRET. Otherwise, an interrupt could come in the middle of the ISR, which would be a Bad Thing for obvious reasons.
Re: General Protection Fault
Posted: Mon Mar 16, 2009 10:19 am
by mangaluve
Sorry, my code above is of course simplified (actually I took most of my interrupt code from James Molloy's tutorial). I do a CLI as the first thing in my ISR (the one that is set in the IDT) and an STI just before IRET. I also (of course) reset the PIC to acknowledge the IRQ.
As a note, I use 1 000 000 as the value of IPS in my settings file for Bochs. I guess this means one million instructions per second? I also try to set my PIT to a frequency of 20 000 Hz, so I guess new IRQs arrive while the ISR for the last one is still running. However, this shouldn't be a problem due to CLI/STI.
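The numbers in the post above can be checked with a little arithmetic. The sketch below assumes the standard PIT input clock of 1193182 Hz and takes the Bochs IPS value at face value as emulated instructions per second. The function names pit_divisor and instructions_per_tick are hypothetical.

```c
#include <stdint.h>

#define PIT_INPUT_HZ 1193182u  /* PIT input clock, roughly 1.193182 MHz */

/* Divisor to program into PIT channel 0 for a requested tick rate.
 * The hardware counter is 16 bits wide; 65536 is programmed as 0. */
uint32_t pit_divisor(uint32_t target_hz)
{
    if (target_hz == 0)
        return 65536u;                    /* slowest possible rate */
    uint32_t divisor = PIT_INPUT_HZ / target_hz;
    if (divisor < 1u)
        divisor = 1u;
    if (divisor > 65536u)
        divisor = 65536u;
    return divisor;
}

/* Rough instruction budget between two timer ticks, assuming the
 * emulator really executes `ips` instructions per second. */
uint32_t instructions_per_tick(uint32_t ips, uint32_t tick_hz)
{
    return tick_hz ? ips / tick_hz : 0u;
}
```

Under these assumptions, 20 000 Hz needs a divisor of 59 and leaves about 50 instructions between ticks at 1 000 000 IPS, while 2000 Hz needs a divisor of 596 and leaves about 500. A printf that writes to video memory probably needs more than that, so the handler may not be able to keep up with the timer at these rates.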
Re: General Protection Fault
Posted: Mon Mar 16, 2009 10:23 am
by Schol-R-LEA
Ah, OK - you answered my first question while I was posting the second. Yes, the code you've got there should be right in regards to the stack handling - I had been under the impression that you had the IRET in the individual isr_t functions, but I see now that the dispatch function is itself called from the actual ISR, and all of the functions do in fact return to it properly AFAICT. That should be correct.
Re: General Protection Fault
Posted: Mon Mar 16, 2009 10:25 am
by mangaluve
Well, I've always thought that my code works (it works for keyboard and software interrupts), but I've never used the timer at such a high frequency. The problem is that I'm not really good at debugging with Bochs, so I don't really know what's happening.