Detecting exception error code

Ryu · Post by **Ryu** » Wed Jul 12, 2006 12:27 pm

I was wondering if there is a way to know, during an exception, whether an error code has been pushed onto the stack? An example would be hardware-triggered GPFs, which have an error code on the stack, while a software-generated GPF does not. Since it's completely legal to raise a GPF from software, I need to handle those circumstances. This is mainly to make a generic exception routine to dump debugging information.

JAAman · Post by **JAAman** » Wed Jul 12, 2006 2:59 pm

by software GPF i assume you mean a soft-int? -- you should not allow exceptions to be generated this way, as there is no reason to do this -- it's quite easy to enforce -- just set the DPL in the IDT to kernel only, and user code cannot soft-int that entry (exceptions, however, will always be processed -- regardless of ring)

other than that, the only thing you can do (afaik) is look at the stack, and find which values make sense

Ryu · Post by **Ryu** » Wed Jul 12, 2006 11:09 pm

you should not allow exceptions to be generated this way, as there is no reason to do this

The point is that if I can detect that there's an error code, I can make one generic routine for every exception when it is "unhandled". If that's possible, I then have a reason to use a software interrupt when I need to acquire debugging information at a certain point at runtime. Of course you can always make another routine to do this, but that is more avoiding my question.

other than that, the only thing you can do (afaik) is look at the stack, and find which values make sense

Well, as I said, I'd like to make this a generic debugging routine for all exceptions. What makes sense for one exception's error code won't necessarily be the same for another. The only solution, if I were to do this, is to either detect whether there's an error code, or have a parameter in my routine to adjust the stack pointer so that it correctly dumps stack information as it was before the exception occurred.

I'm also designing an "operating mode" in my OS that runs software in CPL0. The goal there is to make this as stable as possible, so being able to detect a software interrupt on the reserved interrupts can be useful, but that would be a long explanation that I don't really feel like getting into.

All in all, I do know which route to take if I can't detect this, though it isn't trivial. I'm really just trying to open some doors here..

Pype.Clicker · Post by **Pype.Clicker** » Thu Jul 13, 2006 7:51 am

Ryu wrote:
you should not allow exceptions to be generated this way, as there is no reason to do this
The point is that if I can detect that there's an error code, I can make one generic routine for every exception when it is "unhandled".

it is not possible, afaik. Nothing could differentiate the error code from another item on the stack. If you want a "generic" exception handler (e.g. taking care of with-error-code exceptions as well as without-error-code ones), just push a fake error code on the top of the stack in the "pre-handler" of the exception, e.g. what your IDT points to is some code like


exc_00_stub:
    push 0xdeadbeef     ; fake error code (the CPU pushes none for vector 0)
    push 0x00           ; exception number
    jmp generic_stub

...

exc_0d_stub:
    ; the CPU has already pushed a real error code for vector 0x0d (GPF)
    push 0x0d           ; exception number
    jmp generic_stub

...
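With stubs like these, every exception reaches generic_stub with the same stack layout, so the common handler can read it through one fixed structure. A minimal C sketch of that layout follows. The struct and field names are made up for illustration, and it assumes a 32-bit frame with no ring change, where the stub's two pushes sit below the EIP, CS and EFLAGS that the CPU pushed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stack as generic_stub would see it at [esp], lowest address first.
 * All names here are illustrative, not taken from the post above. */
struct exc_frame {
    uint32_t vector;      /* pushed last by the per-vector stub */
    uint32_t error_code;  /* real (from the CPU) or the fake 0xdeadbeef */
    uint32_t eip;         /* pushed by the CPU */
    uint32_t cs;
    uint32_t eflags;
};

/* Every vector now has the same layout, so fixed offsets work. */
static_assert(offsetof(struct exc_frame, error_code) == 4, "error code at [esp+4h]");
static_assert(offsetof(struct exc_frame, cs) == 12, "CS at [esp+0Ch]");
static_assert(sizeof(struct exc_frame) == 20, "five dwords in total");

/* Returns 1 when the error-code slot holds the stub's placeholder value. */
static int has_fake_error_code(const struct exc_frame *f)
{
    return f->error_code == 0xdeadbeefu;
}
```

The price of this approach is that the decision is made per vector at build time, so a soft-int to a vector that normally carries an error code still produces a frame that is one dword short.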

JAAman · Post by **JAAman** » Thu Jul 13, 2006 8:20 am

Well, as I said, I'd like to make this a generic debugging routine for all exceptions. What makes sense for one exception's error code won't necessarily be the same for another.

i think you misunderstood me -- when i said look for values that make sense, i was not talking about the error code -- look for values that make sense for CS (as only a few values should be correct)

like this:

get the value where CS would be if there is an error code (should be esp+8) -- if you have an error code, this will be a valid CS selector; if there is no error code, this will be the flags register (which will always have bit 1 set, and bits 3, 5, and 15-31 clear)
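The reserved-bit test described above could be sketched in C as follows. The function and macro names are hypothetical. As far as I know, bits 16-21 of EFLAGS hold defined flags (RF, VM, AC, VIF, VIP, ID) rather than reserved zeros, so this sketch only insists on bit 1 being set and on bits 3, 5, 15 and 22-31 being clear, which is slightly looser than the description above.

```c
#include <assert.h>
#include <stdint.h>

#define EFLAGS_ALWAYS_ONE   0x00000002u  /* bit 1 always reads as 1 */
#define EFLAGS_ALWAYS_ZERO  0xFFC08028u  /* bits 3, 5, 15 and 22-31 */

static_assert((EFLAGS_ALWAYS_ONE & EFLAGS_ALWAYS_ZERO) == 0,
              "the two masks must not overlap");

/* Returns 1 if v has the reserved-bit pattern of an EFLAGS image.
 * A 0 result rules EFLAGS out; a 1 result does not prove anything,
 * because some selector values show the same pattern. */
static int could_be_eflags(uint32_t v)
{
    return (v & EFLAGS_ALWAYS_ONE) != 0 && (v & EFLAGS_ALWAYS_ZERO) == 0;
}
```

For example, 0x08 and 0x1B are rejected, but 0x12 is accepted even though it also decodes as a selector, so the test alone cannot settle every case.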

Combuster · Post by **Combuster** » Thu Jul 13, 2006 3:32 pm

Alternatively, you might want to compare the stack pointer value to that of the top of the stack, see how many values have been pushed, and determine that way whether there is an error code or not.
Obviously it would not work if it is the kernel that faults, since no stack change occurs on a ring0-ring0 interrupt call, but you can try the above-mentioned method when there seem to be 'too many' error codes on the stack
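The stack-depth comparison could look roughly like this in C. It assumes the handler knows the ring 0 stack top that the CPU loaded from the TSS (ESP0) and has captured ESP at entry before pushing anything itself; both parameter names and the enum are hypothetical. On a ring change the CPU pushes SS, ESP, EFLAGS, CS and EIP, plus an error code for the exceptions that have one.

```c
#include <assert.h>
#include <stdint.h>

#define DWORD 4u

static_assert(DWORD == sizeof(uint32_t), "frame slots are 32 bits wide");

enum frame_kind { FRAME_NO_ERROR_CODE, FRAME_ERROR_CODE, FRAME_UNKNOWN };

/* stack_top:    ESP0 from the TSS (assumed to be known to the handler)
 * esp_at_entry: ESP on entry to the handler, before any pushes of its own */
static enum frame_kind classify_by_depth(uint32_t stack_top, uint32_t esp_at_entry)
{
    uint32_t pushed = stack_top - esp_at_entry;

    if (pushed == 5u * DWORD)   /* SS, ESP, EFLAGS, CS, EIP */
        return FRAME_NO_ERROR_CODE;
    if (pushed == 6u * DWORD)   /* the same five plus an error code */
        return FRAME_ERROR_CODE;
    return FRAME_UNKNOWN;       /* same-ring entry, nested entry, V86 frame, ... */
}
```

Anything other than five or six dwords is reported as unknown, which is exactly the ring0-to-ring0 case where this method gives no answer.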

Ryu · Post by **Ryu** » Thu Jul 13, 2006 4:30 pm

JAAman wrote:
i think you misunderstood me -- when i said look for values that make sense, i was not talking about the error code -- look for values that make sense for CS (as only a few values should be correct)

like this:

get the value where CS would be if there is an error code (should be esp+8) -- if you have an error code, this will be a valid CS selector; if there is no error code, this will be the flags register (which will always have bit 1 set, and bits 3, 5, and 15-31 clear)

Okay.. well, I thought about this a while ago; it's possible for the CS value on the stack to also "make sense" as an EFLAGS value. Take the following small table:

5-4 3-2 1-0  Hex
00  00  10   02h  <-- Not possible
00  00  11   03h  <-- Not possible
00  01  10   06h  <-- Possible (Selector 0h LDT RPL 2)
00  01  11   07h  <-- Possible (Selector 0h LDT RPL 3)
01  00  10   12h  <-- Possible (Selector 10h GDT RPL 2)
01  00  11   13h  <-- Possible (Selector 10h GDT RPL 3)
.......
The rows marked as possible are valid selectors that are also valid EFLAGS values.
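For reference, the selector fields used in the table can be pulled apart with a few shifts and masks. This is only a sketch with made-up names: the index is bits 3-15, TI is bit 2, RPL is bits 0-1, and the null selector is index 0 in the GDT (TI=0).

```c
#include <assert.h>
#include <stdint.h>

struct selector_fields {
    unsigned index;  /* descriptor table entry number (bits 3-15) */
    unsigned ti;     /* table indicator: 0 = GDT, 1 = LDT (bit 2) */
    unsigned rpl;    /* requested privilege level (bits 0-1) */
};

/* One table row checked at compile time: 12h is GDT entry 2
 * (selector 10h) with RPL 2. */
static_assert((0x12u >> 3) == 2u && (0x12u & 4u) == 0u && (0x12u & 3u) == 2u,
              "12h decodes as selector 10h, GDT, RPL 2");

static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = (unsigned)sel >> 3;
    f.ti    = ((unsigned)sel >> 2) & 1u;
    f.rpl   = (unsigned)sel & 3u;
    return f;
}

/* The null selector is index 0 with TI = 0, whatever the RPL (values 0-3). */
static int is_null_selector(uint16_t sel)
{
    return (sel & 0xFFFCu) == 0;
}
```

Decoding 07h this way gives index 0, TI=1, RPL 3, which is why the 06h and 07h rows refer to the LDT and are not null selectors.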

Combuster wrote: Alternatively, you might want to compare the stack pointer value to that of the top of the stack, see how many values have been pushed, and determine that way whether there is an error code or not.
Obviously it would not work if it is the kernel that faults, since no stack change occurs on a ring0-ring0 interrupt call, but you can try the above-mentioned method when there seem to be 'too many' error codes on the stack

Yeah *sigh*, but it is the ring0->ring0 software interrupt case that is the issue. The other method is using a TSS for exceptions, which will force a stack switch, but that seems to be overkill.

JAAman · Post by **JAAman** » Fri Jul 14, 2006 7:52 am

The rows marked as possible are valid selectors that are also valid EFLAGS values.

uh...
that's not true:

5-4 3-2 1-0  Hex
00  00  10   02h  <-- Not possible
00  00  11   03h  <-- Not possible
00  01  10   06h  <-- Possible (Selector 0h LDT RPL 2)
00  01  11   07h  <-- Possible (Selector 0h LDT RPL 3)
01  00  10   12h  <-- Possible (Selector 10h GDT RPL 2)
01  00  11   13h  <-- Possible (Selector 10h GDT RPL 3)

but selector 0 is not allowed (by definition -- null selector)
did you ever think of making selector 10h a data selector? -- nothing requires you to use that particular selector (of course if you allow many selectors, the pattern will repeat -- but most don't allow user generated selectors)

Ryu · Post by **Ryu** » Fri Jul 14, 2006 3:46 pm

JAAman wrote: 00 01 10 6 <-- Possible (Selector 0h LDT RPL 2)
00 01 11 7 <-- Possible (Selector 0h LDT RPL 3)
...
but selector 0 is not allowed (by definition -- null selector)

Those two are LDT (TI=1) segment selector 0, which is allowed. The null selector you are probably thinking of is the GDT one.

JAAman wrote:did you ever think of making selector 10h a data selector? -- nothing requires you to use that particular selector (of course if you allow many selectors, the pattern will repeat -- but most don't allow user generated selectors)

Okay, going back and rethinking: if we try to make sense of what's in [esp+8h]

-hardware generated exceptions will read the return segment
-a software interrupt on the reserved interrupts from ring0 will read as EFLAGS
-a software interrupt on the reserved interrupts from anything other than ring0 is not allowed (it probably generates a GPF, which will have an error code pushed onto the stack).

The problem exists only within ring0, which allows the software interrupt. In those cases EFLAGS is read at [esp+8h], and you can't determine whether it's a valid EFLAGS, given that a genuine exception can come from ring 2 or 3, whose return segment has bit 1 set just like the reserved bit 1 in EFLAGS. If it happened to read the return segment instead, you could easily determine it's not EFLAGS, because your code segment must be ring0 for the exception to be software generated, but that's not the case. :-\

The way I have decided to do this for now is to set AC=1 (alignment check), which resides in the upper 16 bits of EFLAGS. It should always be on in ring0 (it shouldn't have any effect in ring0 anyway); this way I can tell whether it's EFLAGS or not simply by checking if anything is in the upper 16 bits. Rings 1, 2 and 3 are nothing to worry about, as attempting a software interrupt there would just generate a GPF anyway.

Any opinions welcome..

edit: Oh, I should correct this point: -hardware generated exceptions will read the return segment only on exceptions that have error codes. The ones that don't can be easily handled, whether hardware or software generated, the way Pype.Clicker mentioned: I'll just always push a fake error code.
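The AC idea from this post could be sketched in C as below. It rests on two assumptions that are worth verifying before relying on it: ring 0 code always runs with AC=1, and the upper 16 bits of the saved CS slot read as zero (I am not sure the latter is guaranteed on every processor). The function name is hypothetical, and the sketch tests the AC bit itself rather than the whole upper half.

```c
#include <assert.h>
#include <stdint.h>

#define EFLAGS_AC 0x00040000u  /* alignment check flag, bit 18 */

static_assert((EFLAGS_AC & 0xFFFFu) == 0,
              "AC must lie above the 16 bits a selector occupies");

/* slot is the dword read from [esp+8h] on entry to the handler.
 * Returns 1 if it is taken to be EFLAGS (so no error code was pushed),
 * 0 if it is taken to be the saved CS (so an error code is present). */
static int slot_is_eflags(uint32_t slot)
{
    return (slot & EFLAGS_AC) != 0;
}
```

With AC set, an EFLAGS image such as 00040012h is told apart from the selector 12h even though their low bits are identical.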

blip · Post by **blip** » Fri Jul 14, 2006 7:06 pm

I don't like the idea of depending on the state of something like this to differentiate what is what, it all seems very hacky. It's never a good idea for software to depend on reserved bit values. You never know when or if they'll change.

JAAman · Post by **JAAman** » Sun Jul 16, 2006 10:31 am

-a software interrupt on the reserved interrupts from anything other than ring0 is not allowed (it probably generates a GPF, which will have an error code pushed onto the stack).

by reserved interrupts i assume you mean exceptions? soft-ints on exception vectors are perfectly valid -- a GPF will trigger on a soft-int only if the IDT descriptor is set to only allow it from a more privileged ring

Those two are LDT (TI=1) segment selector 0, which is allowed. The null selector you are probably thinking of is the GDT one.

yes you're right -- i didn't think of the LDT since i use flat mode (there is no LDT) however, you could easily enforce a null segment in the LDT as well...

this does make it easier for me to do this (though i don't) -- there are only 2 possible CS values (kernel code, and user code)

-a software interrupt on the reserved interrupts from ring0 will read as EFLAGS

no -- it will read as CS -- as far as i can tell, the only time it will read as eflags, is if it has an errorcode -- which will always be an exception

Ryu · Post by **Ryu** » Sun Jul 16, 2006 1:21 pm

blip wrote: I don't like the idea of depending on the state of something like this to differentiate what is what, it all seems very hacky. It's never a good idea for software to depend on reserved bit values. You never know when or if they'll change.

Yes, I have to agree that depending on reserved bits is just opening a door to trouble in the future. There's only one reserved bit that, by my hunch, will remain there for a very long time, or my guess forever, which is bit 1 in EFLAGS. Setting AC=1 isn't relying on any reserved bits at all, and there really is no harm whether you turn it on or off. Technically I'm not relying on this bit staying on (it should, however), because this is merely a debugging case: if AC happens to be set and a software-interrupt-generated case is identified, then the handler can recover or output information correctly.

JAAman wrote:by reserved interrupts i assume you mean exceptions? soft-ints on exception vectors are perfectly valid -- a GPF will trigger on a soft-int only if the IDT descriptor is set to only allow it from a more privileged ring

Well, what I really meant was something along the lines of "software interrupting the Intel-reserved vectors", if that makes it more clear.

JAAman wrote:-a software interrupt on the reserved interrupts from ring0 will read as EFLAGS
...
no -- it will read as CS -- as far as i can tell, the only time it will read as eflags, is if it has an errorcode -- which will always be an exception

That's wrong. If you do the math, specifically for when there's no error code (there is never an error code on a software interrupt, even when it targets an Intel-reserved vector), [esp+8h] points to EFLAGS on the stack. To make this clear, you can see it like:

When there's an error code..

[esp+Ch] - flags state before interrupt or exception (although not exactly on exceptions, RF may be set in this state)
[esp+8h] - return segment
[esp+4h] - return offset
[esp+0h] - error code

When there's no error code..

[esp+8h] - flags state before interrupt or exception
[esp+4h] - return segment
[esp+0h] - return offset
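The two layouts can be written down as C structs to make the offsets checkable. This is a sketch with hypothetical names, and it assumes a 32-bit entry with no ring change (so no SS and ESP above EFLAGS).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lowest address first, i.e. the first field is at [esp+0h]. */
struct frame_with_error    { uint32_t error_code, eip, cs, eflags; };
struct frame_without_error { uint32_t eip, cs, eflags; };

/* The same offset holds different things in the two layouts. */
static_assert(offsetof(struct frame_with_error, cs) == 8,
              "with an error code, [esp+8h] is the return segment");
static_assert(offsetof(struct frame_without_error, eflags) == 8,
              "without an error code, [esp+8h] is EFLAGS");

/* Reads the ambiguous dword without knowing which layout applies. */
static uint32_t dword_at_esp_plus_8(const uint32_t *esp)
{
    return esp[2];
}
```

Whatever detection scheme is used has to decide between the two structs from the value of that one dword.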

JAAman wrote:this does make it easier for me to do this (though i don't) -- there are only 2 possible CS values (kernel code, and user code)

You can scrap this now.. and yes, it is easy when [esp+8h] is the return segment: you can identify a software interrupt (because it must be a ring0 segment) when you're expecting [esp+8h] to be a segment. However, in this case we've got EFLAGS, and we need to determine whether it's not a valid EFLAGS, which is a completely different story.

I've tried, anyway, to make the point that only within ring0 will the exception handler find a software interrupted case (and of course EFLAGS would be the value taken at [esp+8h], not your return segment). In normal cases, such as with ring3 code, the handler is not able to tell that it's not an EFLAGS, because the return segment would be RPL 3, matching the reserved bit 1 in EFLAGS. This makes things much less trivial than expecting a return segment in [esp+8h] under those conditions.. it's really hard to explain this, which has several factors involved. But it's probably because my grammar is just so awful.

JAAman · Post by **JAAman** » Tue Jul 18, 2006 10:28 am

When there's an error code..

[esp+Ch] - flags state before interrupt or exception (although not exactly on exceptions, RF may be set in this state)
[esp+8h] - return segment
[esp+4h] - return offset
[esp+0h] - error code

When there's no error code..

[esp+8h] - flags state before interrupt or exception
[esp+4h] - return segment
[esp+0h] - return offset

which is exactly what i said

if there is an error code, it is the flag state before the interrupt, if there is no error code, it is the source CS

You can scrap this now.. and yes, it is easy when [esp+8h] is the return segment: you can identify a software interrupt (because it must be a ring0 segment) when you're expecting [esp+8h] to be a segment. However, in this case we've got EFLAGS, and we need to determine whether it's not a valid EFLAGS, which is a completely different story.

i see no problem -- all you have to do is ensure that the only valid CS is not a valid eflags! -- quite simple to do...

Ryu · Post by **Ryu** » Tue Jul 18, 2006 3:39 pm

JAAman wrote:which is exactly what i said

if there is an error code, it is the flag state before the interrupt, if there is no error code, it is the source CS

What I said is the reverse of what you have implied twice now, which is wrong; please read it a bit more carefully.

JAAman wrote:i see no problem -- all you have to do is ensure that the only valid CS is not a valid eflags! -- quite simple to do...

It would be simple, like I said two posts back, under that circumstance (which is not the case here), and it is based on these points, which I think are pretty much your assumptions.

1) If there's NO error code, [esp+8h] is the return segment (CS).
2) It can easily be identified as the ring0 segment simply by checking that bits 0 and 1 in [esp+8h] are both off, which effectively indicates whether there's an error code or not, and also determines whether the exception handler was entered by a software interrupt or an actual exception.

Is it simple? Of course.. but that's not the case; these are the correct points:

1) If there's NO error code, [esp+8h] is EFLAGS on the stack.
2) It's hard to identify because it's EFLAGS, and because the logic is that you need to detect point (1) cases to differentiate them (keeping in mind that return segments can be from ring 2 or 3).

If you still think this is simple, try coding it; if it is so simple it shouldn't be much effort to do.

JAAman · Post by **JAAman** » Thu Jul 20, 2006 10:39 am

you're right about that.. i got it backwards -- but it doesn't make any difference

2) It can easily be identified as the ring0 segment simply by checking that bits 0 and 1 in [esp+8h] are both off, which effectively indicates whether there's an error code or not, and also determines whether the exception handler was entered by a software interrupt or an actual exception.

don't tell me i'm assuming something that i didn't even think about -- you obviously didn't get what i meant

what i meant is in my OS, there are only 2 possible CS values (say 0x10 (kernel CS) and 0x1B (user CS -- entry 0x18 + ring3)) -- it would be easy to change this -- or add more CS values for rings 1&2
then you try:

cs_slot = *(uint16_t *)(esp + 8);   // low 16 bits of the dword at [esp+8]
if (cs_slot == 0x10 || cs_slot == 0x1B)
    withErrorCode();                 // it matches a real CS, so an error code sits below it
else
    withoutErrorCode();              // if you get here, then it cannot be CS
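A compilable version of this check might look like the sketch below. The macro and function names are hypothetical and the two selector values are the example ones from the post. The compile-time checks encode the condition stated earlier in the thread, that no valid CS may also look like a valid EFLAGS; they use only the reserved bits of the low word (bit 1 set, bits 3, 5 and 15 clear).

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_CS 0x10u  /* example value from the post */
#define USER_CS   0x1Bu  /* entry 0x18 with RPL 3 */

/* Low-word reserved-bit pattern of EFLAGS: bit 1 set, bits 3, 5, 15 clear. */
#define LOW_WORD_COULD_BE_EFLAGS(v) \
    ((((v) & 0x0002u) != 0) && (((v) & 0x8028u) == 0))

/* Refuse to build if a configured CS could be mistaken for EFLAGS. */
static_assert(!LOW_WORD_COULD_BE_EFLAGS(KERNEL_CS), "kernel CS looks like EFLAGS");
static_assert(!LOW_WORD_COULD_BE_EFLAGS(USER_CS), "user CS looks like EFLAGS");

/* esp points at the frame as the CPU left it. Returns 1 if [esp+8h]
 * holds one of the known CS values, meaning an error code is at [esp]. */
static int has_error_code(const uint32_t *esp)
{
    uint16_t slot = (uint16_t)esp[2];  /* only the low 16 bits are the selector */
    return slot == KERNEL_CS || slot == USER_CS;
}
```

Selectors such as 12h or 13h would trip the static_assert, which is the situation the table earlier in the thread points out.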

OSDev.org
