I've been wondering about this for a while. Is there any need to 'acknowledge' CPU interrupts handled by ISRs? When I receive a CPU interrupt (and only a CPU interrupt, i.e. vectors 0 - 31), the interrupt keeps repeating itself, so the handler is called over and over again and hogs the entire operating system (IRQs can still be received, though). This makes me think there is something wrong with my ISR handler, but I've checked the documentation and read a few websites (JamesM's, Bran's, the wiki) without finding what I did wrong. The IRQ handlers work fine (probably because master and slave are acknowledged properly).
Any ideas what could cause an infinitely looping CPU interrupt handler? My apologies if this is a stupid question.
Thanks in advance,
Creature
When the chance of succeeding is 99%, there is still a 50% chance of that success happening.
If the return IP points at the offending instruction, then when you IRET the handler gets called again, since you return straight to the bad code.
I'm having a problem with my IRQs, hehe, and it's pissing me off. The timer interrupt won't call!
Solar wrote:It keeps stunning me how friendly we - as a community - are towards people who start programming "their first OS" who don't even have a solid understanding of pointers, their compiler, or how a OS is structured.
It depends - are the ISRs exceptions?
If so, depending on the type (trap or fault), the return EIP on the stack is either the faulting EIP or the EIP of the next instruction. If it's the faulting EIP, then IRET re-runs the faulting instruction and you get an exception loop.
pcmattman wrote:It depends - are the ISRs exceptions?
If so, depending on the type (trap or fault) the return EIP on the stack is the faulting EIP or the EIP of the next instruction. If it's the faulting EIP then you'll get an exception loop.
Yes, like division-by-zero exceptions, page faults (which is more of a fault), GPFs, ... they all loop infinitely. The IRQs work fine, however.
So basically, if a process throws an exception, the best thing to do is let the user of the OS know the process 'crashed' somewhere and then shut that process down? In my case there is only one process, and that is the kernel itself, so if it page-faults, my iret will keep bringing me back to the moment before it crashed, where it will just crash again?
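Things like a division-by-zero exception are faults (see Volume 3A of the Intel Manuals - section 5). The saved EIP points at the faulting instruction, so unless the handler removes the cause, returning just faults again. For a divide error there is normally nothing sensible to fix, so for all intents and purposes the error is unrecoverable.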
A page fault is an exception to the rule - you can map in the faulting page - this is how things like copy-on-write work.
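To make the page fault case a bit more concrete, here is a rough sketch of how a handler might decide what to do. The low bits of the error code are as the CPU pushes them; the enum, the function name and the two boolean inputs (whether the address belongs to the process, whether the page is marked copy-on-write) are made up for illustration and would come from your own memory manager. The faulting address itself is read from CR2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the error code the CPU pushes for a page fault (#PF). */
#define PF_PRESENT 0x1u /* 0 = page not present, 1 = protection violation */
#define PF_WRITE   0x2u /* 1 = the faulting access was a write */
#define PF_USER    0x4u /* 1 = the access came from user mode */

/* Hypothetical outcomes; a real kernel would act on them. */
enum pf_action { PF_MAP_PAGE, PF_COPY_ON_WRITE, PF_KILL };

/* addr_is_valid and page_is_cow are assumptions: lookups your own
   memory manager would perform for the address found in CR2. */
static enum pf_action page_fault_action(uint32_t err, bool addr_is_valid,
                                        bool page_is_cow)
{
    if (!addr_is_valid)
        return PF_KILL;              /* address not owned by the process */
    if (!(err & PF_PRESENT))
        return PF_MAP_PAGE;          /* not present: map it in, then IRET */
    if ((err & PF_WRITE) && page_is_cow)
        return PF_COPY_ON_WRITE;     /* write to a shared read-only page */
    return PF_KILL;                  /* genuine protection violation */
}
```

In the two recoverable cases the handler fixes the mapping and returns normally, and the faulting instruction is simply retried.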
So basically, if a process throws an exception, the best thing to do is let the user of the OS know the process 'crashed' somewhere and then shut that process down?
A fault is a killed process - except for the special page fault case.
A trap may return (for instance, interrupt 3 - Breakpoint exception, which returns to the next instruction after the instruction triggering the breakpoint).
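As a quick reference, the classification could be written down like this. It is only a sketch covering a handful of vectors (the ones mentioned in this thread plus a few neighbours); the enum and function names are invented, and the manual remains the authority for the full list.

```c
#include <stdint.h>

/* How the CPU reports an exception. EXC_OTHER only means
   "not listed in this sketch". */
enum exc_class { EXC_FAULT, EXC_TRAP, EXC_ABORT, EXC_OTHER };

static enum exc_class classify_exception(uint8_t vector)
{
    switch (vector) {
    case 0:  return EXC_FAULT; /* #DE divide error */
    case 3:  return EXC_TRAP;  /* #BP breakpoint (INT3) */
    case 4:  return EXC_TRAP;  /* #OF overflow (INTO) */
    case 6:  return EXC_FAULT; /* #UD invalid opcode */
    case 8:  return EXC_ABORT; /* #DF double fault */
    case 13: return EXC_FAULT; /* #GP general protection */
    case 14: return EXC_FAULT; /* #PF page fault */
    default: return EXC_OTHER;
    }
}

/* Nonzero if a plain IRET would re-run the instruction that raised
   the exception, i.e. the saved EIP is the faulting EIP. */
static int iret_reruns_instruction(uint8_t vector)
{
    return classify_exception(vector) == EXC_FAULT;
}
```

For anything where iret_reruns_instruction() is nonzero, the handler has to either fix the cause or not return to that code at all.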
So basically I shouldn't try to recover from a division-by-zero exception? What would be the best thing to do when it occurs? Let the user know about it, shut down the kernel, disable interrupts and halt the processor (something like a panic)?
Well, you ought to (depending on what operation you are doing, e.g. if it is a calculation the user requested).
However, I had always thought: why not detect which register holds the zero divisor, set the destination register to 0, and continue execution (if it is the OS handling the divide-by-zero error)?
KMT dk
well, what to say, too much to do in too little space.
when it goes up hill, increase work, when it goes straight, test yourself but when going down, slow down.
However, I had always thought: why not detect which register holds the zero divisor, set the destination register to 0, and continue execution (if it is the OS handling the divide-by-zero error)?
Tell me one case where dividing by zero is behavior you expect. There's a reason you get the fault, and that's because there's a bug in the code somewhere. Also, continuing execution means you need to jump to the next instruction, and since x86 instructions vary in length, you'd have to work out the size of the faulting instruction first. Not to mention you'll hinder your own debugging if your exception handlers try to fix the problem for you.
I know it sounds really nice to have exception handlers that try to fix the problem (I have thought about it myself), but the error is too context-sensitive to be able to restore an appropriate state.
It's easy enough to just dump the registers, kill the faulting thread (or process) and continue. For extra debugging help you could even do a backtrace.
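A backtrace can be as simple as following the chain of saved frame pointers. The sketch below assumes the code was compiled with frame pointers (saved caller EBP at [EBP], return address in the next slot) and that the outermost frame has a NULL saved pointer, which your boot code would have to arrange. The struct and function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Stack frame layout when frame pointers are in use: the saved
   caller frame pointer, followed by the return address. */
struct stack_frame {
    const struct stack_frame *prev; /* saved caller frame pointer */
    uintptr_t return_addr;          /* where this call returns to */
};

/* Collect up to max return addresses, starting from the frame pointer
   saved at the time of the exception. Returns how many were stored.
   Assumes the chain ends in NULL (an assumption, see above). */
static size_t walk_frames(const struct stack_frame *frame,
                          uintptr_t *out, size_t max)
{
    size_t n = 0;
    while (frame != NULL && n < max) {
        out[n++] = frame->return_addr;
        frame = frame->prev;
    }
    return n;
}
```

The max limit matters in practice: a corrupted stack can easily turn the chain into a loop, so the walk should always be bounded.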
You may also want to report the error to the program using signals or some similar construct. That way if the program can retry whatever that thread was doing before it caused the exception, it can.
JohnnyTheDon wrote:You may also want to report the error to the program using signals or some similar construct. That way if the program can retry whatever that thread was doing before it caused the exception, it can.
You could just kill the process that caused it. (Unless it was in the kernel of course.)
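One way to tell the two cases apart is to look at the CS selector the CPU saved on the handler's stack: its low two bits give the privilege level of the interrupted code. This is just a sketch; the enum and function names are invented, and what "kill" and "panic" actually do is up to your kernel.

```c
#include <stdint.h>

/* Hypothetical policy for an exception nobody could fix. */
enum exc_policy { POLICY_KILL_PROCESS, POLICY_PANIC };

/* saved_cs is the code segment selector the CPU pushed when the
   exception occurred. Low two bits: 3 = user mode, 0 = kernel. */
static enum exc_policy unhandled_exception_policy(uint16_t saved_cs)
{
    if ((saved_cs & 0x3) == 3)
        return POLICY_KILL_PROCESS; /* user code faulted: kill just that process */
    return POLICY_PANIC;            /* the kernel itself faulted */
}
```

With a GDT laid out the way many tutorials do it (an assumption about your setup), kernel code runs with CS = 0x08 and user code with CS = 0x1B, so the first would panic and the second would only lose the process.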
I know, but you may want to allow the program to try to recover if it can. If it can't recover, or if it chooses not to attempt to recover, then the process will be killed.
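For what it's worth, this is roughly what that looks like from the program's side on a POSIX-style system. That is an assumption for the sake of the example; nothing says Creature's OS will offer signals in this form. The fault is simulated with raise(SIGFPE) so the sketch behaves the same everywhere, and run_with_recovery() and the two operation functions are made-up names.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf retry_point;

/* Signal handler: abandon the failed operation and jump back. */
static void on_sigfpe(int signo)
{
    (void)signo;
    siglongjmp(retry_point, 1);
}

/* Runs operation(). Returns 0 if it completed, 1 if it was abandoned
   after SIGFPE, -1 if the handler could not be installed. */
static int run_with_recovery(void (*operation)(void))
{
    struct sigaction sa = {0};
    sa.sa_handler = on_sigfpe;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGFPE, &sa, NULL) != 0)
        return -1;

    /* Second argument 1: also save the signal mask, so SIGFPE is
       unblocked again after jumping out of the handler. */
    if (sigsetjmp(retry_point, 1) != 0)
        return 1;

    operation();
    return 0;
}

/* Stand-in for code that hits an arithmetic exception. */
static void faulting_operation(void) { raise(SIGFPE); }

/* Stand-in for code that runs cleanly. */
static void ok_operation(void) { }
```

Here run_with_recovery(faulting_operation) comes back with 1 instead of the process dying, while run_with_recovery(ok_operation) returns 0. If the program installs no handler, the default action still kills it, which matches the behaviour discussed above.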
I know, but you may want to allow the program to try to recover if it can
I personally feel that's a step you take once you're ready to release to the public, rather than during development when you are trying to debug stuff.
But, @Creature: it's up to you how you choose to go about this, we can only give our own opinions (or quote the manuals)
Yes, you are right. But it does sound kind of lame to have to panic the entire kernel just because a divide-by-zero exception occurred. Then again, they shouldn't occur in the kernel process, as I'm the only one programming it. If someone else (accidentally) causes a division-by-zero exception, I should probably just kill the process that caused it, possibly giving a debug dump with the registers and such.
Thanks for the information.