
Re: GPF on jump to less privileged code

Posted: Wed May 24, 2006 4:45 am
by Pype.Clicker
ed_tait wrote:

Code:

ltr [30h] ; tss selector
If so, I get a GPF:

fetch_raw_descriptor: GDT: index (ff57)1fea > limit (37)
You may want to check your assembler's manual again. ltr [30h] sounds terribly like "fetch the word at address DS:0x30, and load the task register with that 16-bit selector".

What you want is more like:

Code:

mov eax, 0x30   ; TSS selector in the GDT
ltr ax          ; load the task register from the AX register itself

Re: GPF on jump to less privileged code

Posted: Wed May 24, 2006 9:17 pm
by Brendan
Hi,
mystran wrote:As for hardware switching, somebody that uses it can tell; I've forgotten all about it (on purpose, really).
For hardware task switching it's mostly the same. In both cases, during boot you build initial scheduling structures (including the TSS) around the running code and do LTR once. After that the scheduler can ignore the TR register (because the TR is a constant for software task switching, or because the CPU automatically updates it as part of a "JMP_to_TSS" hardware task switch).

ed_tait wrote:if you use stack-based task switching do you need a TSS, or can you store esp0 in another way?
To be 100% accurate, you must have a correct value for SS0 and ESP0 in a valid TSS every time you switch from CPL=3 to CPL=0. The same applies when switching to any "more privileged" level (e.g. you need SS1 and ESP1 to switch from CPL=3 or CPL=2 to CPL=1).

This applies regardless of which task switching method you use.

The only way to avoid the need for SS0 and ESP0 (and SS1/ESP1 and SS2/ESP2) is to avoid switching from a less privileged level to a more privileged level. For example, an OS like Singularity wouldn't need SS0/ESP0, SS1/ESP1, SS2/ESP2 or a TSS because everything runs at CPL=0.
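To make the layout concrete, here is a sketch (in C, my own illustration rather than anything from the posts) of the 32-bit TSS with the ss0/esp0 fields Brendan describes. Field names follow the Intel manuals; the segment slots are stored as 32-bit fields with the selector in the low 16 bits, and set_kernel_stack is a hypothetical helper a scheduler might call on each thread switch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit TSS layout. For stack-based task switching only ss0/esp0
   (and ss1/esp1, ss2/esp2 if you use CPL 1 or 2) are ever read. */
struct tss32 {
    uint32_t prev_task_link;
    uint32_t esp0;          /* stack pointer loaded on a CPL 3 -> CPL 0 switch */
    uint32_t ss0;           /* stack segment loaded on the same switch */
    uint32_t esp1, ss1;     /* only needed if anything runs at CPL 1 */
    uint32_t esp2, ss2;     /* only needed if anything runs at CPL 2 */
    uint32_t cr3;
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt_selector;
    uint16_t trap;
    uint16_t iomap_base;
};

/* The scheduler never touches TR again; it only refreshes esp0
   when switching threads so the next CPL 3 -> CPL 0 transition
   lands on the new thread's kernel stack. */
static void set_kernel_stack(struct tss32 *tss, uint32_t stack_top)
{
    tss->esp0 = stack_top;
}
```

The offsets matter because the CPU reads esp0/ss0 directly from offsets 4 and 8 of the TSS; the struct above reproduces that layout exactly (104 bytes total).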

paulbarker wrote:The correct places to use TSSs in stack-based switching are: one per processor holding esp0/ss0, one for a double fault handler, and one for an NMI handler.
For double faults, that information probably comes from Intel's "Interrupts and Exceptions" chapter. Most OS developers prefer to write their exception handlers so that the double fault handler doesn't need to use an "interrupt task gate" (i.e. make sure that other exception handlers can't crash).

For NMI there isn't really any need to use an "interrupt task gate" that I can see...
paulbarker wrote:I know this is going off topic but if I have 1 double fault handler for the entire system, what would happen if a processor tried to double fault at the same time as another (so the busy flag for the TSS is set)? Would this cause a triple fault (reset) or would the 2nd processor wait for the first to finish?
The second processor will try to generate a general protection fault (which would cause a triple fault). The same applies to NMIs if you're using "interrupt task gates", but here the general protection fault can be handled within the general protection fault handler (it won't immediately cause a double fault or triple fault).
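For reference, the busy flag Brendan mentions lives in the type field of the TSS descriptor in the GDT: type 0x9 means an available 32-bit TSS, 0xB means busy, and switching to a TSS that is already marked busy is what raises the general protection fault. A small sketch (mine, not from the thread) of checking that bit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 40..43 of an 8-byte GDT descriptor hold the type field.
   For a 32-bit TSS: 0x9 = available, 0xB = busy. A task switch to
   a TSS whose descriptor is already busy raises #GP. */
static bool tss_descriptor_is_busy(uint64_t gdt_entry)
{
    return ((gdt_entry >> 40) & 0xF) == 0xB;
}
```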


Cheers,

Brendan

Re: GPF on jump to less privileged code

Posted: Thu May 25, 2006 4:25 am
by paulbarker
For double faults, that information probably comes from Intel's "Interrupts and Exceptions" chapter. Most OS developers prefer to write their exception handlers so that the double fault handler doesn't need to use an "interrupt task gate" (i.e. make sure that other exception handlers can't crash).
Say we overflow a kernel stack, causing a page fault (an unmapped guard page is placed immediately below the stack to catch this). The page fault handler tries to write its data to the stack, causing a double fault (since the stack is invalid). Only by using a separate stack can this double fault be handled properly, and that requires a TSS.

The page fault handler writing to the stack cannot be avoided; the pushes (EFLAGS, CS:EIP and the error code) are done by hardware rather than by software.
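Running the double fault handler on its own known-good stack means pointing the IDT entry at a separate TSS via a task gate. As an illustration (my own sketch; the field positions follow the IA-32 descriptor format), a task gate only uses the TSS selector, the type (0x5) and the present bit; the offset fields are ignored:

```c
#include <assert.h>
#include <stdint.h>

/* Build an 8-byte IDT task gate descriptor. The CPU ignores the
   offset fields for task gates and switches to the TSS named by
   tss_selector, which carries its own SS:ESP. */
static uint64_t make_task_gate(uint16_t tss_selector, uint8_t dpl)
{
    uint64_t e = 0;
    e |= (uint64_t)tss_selector << 16;   /* bits 16..31: handler's TSS selector */
    e |= (uint64_t)0x5 << 40;            /* bits 40..43: type = task gate */
    e |= (uint64_t)(dpl & 3) << 45;      /* bits 45..46: descriptor privilege level */
    e |= (uint64_t)1 << 47;              /* bit 47: present */
    return e;
}
```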
For NMI there isn't really any need to use an "interrupt task gate" that I can see...
Personally I don't really know much about NMI at the moment. That comment was based on things I have read in a couple of places, and the fact that an NMI could happen while the kernel data structures (including page tables and stack) are in an inconsistent state.

Re: GPF on jump to less privileged code

Posted: Thu May 25, 2006 7:00 am
by Brendan
Hi,
paulbarker wrote:
For double faults, that information probably comes from Intel's "Interrupts and Exceptions" chapter. Most OS developers prefer to write their exception handlers so that the double fault handler doesn't need to use an "interrupt task gate" (i.e. make sure that other exception handlers can't crash).
Say we overflow a kernel stack, causing a page fault (an unmapped guard page is placed immediately below the stack to catch this). The page fault handler tries to write its data to the stack, causing a double fault (since the stack is invalid). Only by using a separate stack can this double fault be handled properly, and that requires a TSS.

The page fault handler writing to the stack cannot be avoided; the pushes (EFLAGS, CS:EIP and the error code) are done by hardware rather than by software.
Of course, but if you're writing a micro-kernel and your kernel stack needs to be more than 4 KB then you're doing something wrong. If you're writing a monolithic kernel there are other ways, like using separate stacks for IRQ handlers (as Linux sometimes does), or pre-checking kernel stack requirements before doing anything that could exceed the remaining stack space.

I guess what I'm saying is that mixing software task switching with hardware task switching complicates things, and the complications can be avoided.
paulbarker wrote:
For NMI there isn't really any need to use an "interrupt task gate" that I can see...
Personally I don't really know much about NMI at the moment. That comment was based on things I have read in a couple of places, and the fact that an NMI could happen while the kernel data structures (including page tables and stack) are in an inconsistent state.
To be honest, I also think I should know more about NMIs. Good information on the causes of NMIs is hard to find, and I doubt there are many people who feel like they know enough.

The following is based on my own research into chipsets (and some unfortunate guess-work to fill in the gaps), and shouldn't be considered "100% correct"...

AFAIK, for older computers NMI was used for RAM errors and unrecoverable hardware problems. For newer computers these things may be handled using machine check exceptions and/or SMI. For the newest chipsets (at least for Intel) there's also a pile of TCO stuff ("total cost of ownership") that is tied into it all (with a special "TCO IRQ" and connections to SMI/SMM, etc). Somehow all of the TCO stuff is/can be connected to an onboard ethernet controller, and (at least part of it) is intended for remote monitoring of the system.

Unfortunately the chipset documentation I've been reading can't tell me how BIOSs normally configure the chipset, and the chipsets themselves support several different options in each case. For example, a RAM error could be handled by the chipset itself, it could generate an SMI (where the BIOS/SMM handler does "RAM scrubbing" in software), it could generate a "TCO interrupt", etc. If you add it all up it's a huge complex mess (TCO + SMI + SMBus + northbridge + PCI bus/controller/s + PCI-to-LPC-bridge + god-knows-what) that can be completely different between motherboards (even motherboards with the same chipset).

The short version of this story is that there are really only two reasons for an NMI. The first is a hardware failure. The second is a "watchdog timer", which can be used to detect when the kernel itself locks up (and is sometimes also used for more accurate profiling, as it allows EIP to be sampled even when IRQs are disabled).

If a hardware failure caused an NMI then there's no way to figure out which piece of hardware caused the NMI. In this case I'd try to do the least possible in an attempt to tell the user that a hardware failure occurred, but at the end of the day you can't expect any OS to work sanely on faulty hardware, and there's nothing software can do to work around the hardware failure anyway.

For the watchdog timer, it must be set up by the OS first. This can actually be done even when the chipset itself doesn't have a special watchdog timer for it (e.g. setting the PIT, RTC/CMOS IRQ or a HPET IRQ to "NMI, send to all CPUs" in the I/O APIC). In this case you want the watchdog timer to be fast (i.e. no slow hardware task switching and cache flushing) and you'd also want all CPUs to share the same timer, which means all CPUs would receive the same IRQ at the same time (which brings me back to the busy flag in your TSS).
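As an illustration of the "NMI, send to all CPUs" setting: the delivery mode field of an I/O APIC redirection table entry occupies bits 8..10, with 100b meaning NMI (the vector field is ignored in that mode). This sketch is my own, with the register layout taken from Intel's I/O APIC datasheet; it builds only the low half of the 64-bit entry (the destination bitmap lives in the high half):

```c
#include <assert.h>
#include <stdint.h>

#define IOAPIC_DELIVERY_NMI  (4u << 8)   /* bits 8..10 = 100b: deliver as NMI */
#define IOAPIC_DEST_LOGICAL  (1u << 11)  /* bit 11: logical destination mode */

/* Low 32 bits of an I/O APIC redirection entry that delivers the
   input as an NMI. The destination ("all CPUs" as a logical bitmap)
   would go in the high 32 bits of the entry. */
static uint32_t ioapic_nmi_redirection_low(void)
{
    return IOAPIC_DELIVERY_NMI | IOAPIC_DEST_LOGICAL;
}
```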

As an alternative, you could also use the local APIC's timer or the performance monitoring counter overflow for a "per CPU" watchdog timer. Unfortunately these things are usually used for other purposes.


Cheers,

Brendan

Re: GPF on jump to less privileged code

Posted: Thu May 25, 2006 7:30 am
by ed_tait
Hooray!!!!!!

I've got it to work :) and can now switch to all privilege levels!!

Thank you for all the help.