Hi,
JAAman wrote:oh, so you mean to run all your ISRs in ring3?
...or disable them altogether
You stated that "you cannot use ring transitions without this information!". I merely showed that not only was a ring transition possible without this information, but that it's "standard practice" for the SYSCALL instruction.
JAAman wrote:unless there is a way i'm not familier with, you cannot use syscall for ISRs
Of course - I didn't say all ring transitions can currently be done without the TSS.
JAAman wrote:i cannot think of any 'cleaner' way it could possibly be implemented -- its just a list of addresses, what about that isn't 'clean' -- it isn't even close to any relation of the PMode TSS (and please don't make my kernel use the user stack!!)
Have you considered what happens when an IRQ or exception occurs while the CPU is running at CPL=3? For a 64 bit OS, it goes something like this:
- potential cache miss caused by TLB lookup for the page containing the IDT entry
- potential cache miss for reading the IDT entry
- get the IDT entry
- potential cache miss caused by TLB lookup for the page containing the TSS
- potential cache miss for reading RSP0 or an IST entry in the TSS
- get the address of the kernel stack
- potential cache miss caused by TLB lookup for the page of the kernel stack
- potential cache miss for the first "push" to the kernel stack
- push return values on the kernel stack
- potential cache miss caused by TLB lookup for the page containing the interrupt handler
- potential cache miss for the interrupt handler's first instruction
- start executing the interrupt handler
For a 32 bit OS it's worse (you can add potential cache misses for GDT and/or LDT lookup to the list).
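To make the TSS step in the list above concrete, here's a rough sketch of where those stack pointers live in the 64 bit TSS. The struct and field names are my own (they're not from any particular header), and `__attribute__((packed))` assumes GCC or Clang; the offsets follow the long mode TSS layout in the AMD and Intel manuals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the 64 bit (long mode) TSS. Names are illustrative only. */
struct tss64 {
    uint32_t reserved0;
    uint64_t rsp[3];       /* RSP0..RSP2: stacks for transitions to CPL 0..2 */
    uint64_t reserved1;
    uint64_t ist[7];       /* IST1..IST7: optional per-vector stacks */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iopb_offset;  /* offset of the I/O permission bitmap */
} __attribute__((packed));

/* The hardware-defined part of the TSS is 104 bytes. */
_Static_assert(sizeof(struct tss64) == 104, "unexpected TSS size");
```

The point is that the CPU has to fetch one of these 8-byte fields from memory on every CPL=3 to CPL=0 interrupt, which is where the TSS-related TLB lookup and cache miss come from.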
Now consider SYSCALL (where the stack isn't changed, and everything else comes from MSRs):
- save the return RIP in RCX and RFLAGS in R11 (nothing is pushed, so no memory access)
- potential cache miss caused by TLB lookup for the page containing the SYSCALL handler
- potential cache miss for the SYSCALL handler's first instruction
- start executing the SYSCALL handler
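For anyone who hasn't set SYSCALL up before, this is roughly what "comes from MSRs" means. It's only a sketch: the MSR numbers are the architectural ones, but `make_star()` and the selector values in the usage note are my own illustration, and actually writing the MSRs needs WRMSR at CPL=0 (not shown).

```c
#include <assert.h>
#include <stdint.h>

/* Architectural MSR numbers used by SYSCALL/SYSRET in long mode. */
#define MSR_EFER   0xC0000080u  /* bit 0 (SCE) enables SYSCALL */
#define MSR_STAR   0xC0000081u  /* segment selector bases */
#define MSR_LSTAR  0xC0000082u  /* 64 bit SYSCALL entry point (RIP) */
#define MSR_FMASK  0xC0000084u  /* RFLAGS bits to clear on entry */

/*
 * Build the STAR value (hypothetical helper).
 * Bits 47:32 = kernel CS for SYSCALL (kernel SS is that selector + 8).
 * Bits 63:48 = base selector for SYSRET (user SS = base + 8,
 *              64 bit user CS = base + 16).
 * The low 32 bits are left as zero here.
 */
uint64_t make_star(uint16_t kernel_cs, uint16_t sysret_base)
{
    return ((uint64_t)sysret_base << 48) | ((uint64_t)kernel_cs << 32);
}
```

For example, with a kernel code selector of 0x08 and a SYSRET base of 0x10 (assumed values, your GDT may differ), `make_star(0x08, 0x10)` gives 0x0010000800000000. None of this involves a memory lookup when SYSCALL executes, which is the whole point.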
Due to the huge difference between RAM speed and CPU speed, these potential cache misses are expensive, which is (IMHO) why SYSCALL has been implemented like it has and why it's so much faster.
Shifting ESP0, ESP1, ESP2 and the IST out of the TSS and into MSRs would prevent 2 potential cache misses for every interrupt. Shifting the IDT into the CPU would prevent another 2. The "worst case" overhead of an interrupt (8 potential cache misses in the list above) could be halved.
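If you want to check the arithmetic, here's a toy model of the interrupt path. It's purely a counting exercise under my own assumptions: each in-memory structure the CPU touches costs one potential TLB-related miss plus one potential miss for the access itself, and the two flags stand for the hypothetical changes (stack pointers in MSRs, IDT held inside the CPU), neither of which exists on real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Structures touched when an interrupt arrives at CPL=3 (64 bit OS). */
enum source { SRC_IDT, SRC_TSS, SRC_STACK, SRC_HANDLER };

static const enum source touched[] = {
    SRC_IDT, SRC_TSS, SRC_STACK, SRC_HANDLER
};

/* Count worst-case potential cache misses for one interrupt. */
int potential_misses(bool tss_in_msrs, bool idt_in_cpu)
{
    int misses = 0;
    for (size_t i = 0; i < sizeof touched / sizeof touched[0]; i++) {
        if (touched[i] == SRC_TSS && tss_in_msrs)
            continue;   /* stack pointer would come from an MSR */
        if (touched[i] == SRC_IDT && idt_in_cpu)
            continue;   /* IDT entry would come from on-chip storage */
        misses += 2;    /* TLB-related miss + miss for the access itself */
    }
    return misses;
}
```

`potential_misses(false, false)` is the current situation and `potential_misses(true, true)` is what I'm suggesting; the second is half the first.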
Basically (IMHO), when the 80386 was designed (i.e. when 32 bit protected mode was designed) there wasn't a large difference between RAM speed and CPU speed (AFAIK they both ran at the same speed, no caches were needed and there weren't any cache miss or RAM access penalties). Things have changed - RAM speed didn't keep up with CPU speed, and the design of 32 bit protected mode (including the TSS, which has been recycled for long mode) isn't really suitable for modern CPUs because of this.
Cheers,
Brendan