General protection fault in 64 bit long mode

untwisted · Post by **untwisted** » Wed Feb 13, 2008 2:29 pm

Howdy all,

I'm working on an OS with some buddies of mine in our spare time. We are making some decent progress, though lately we've been plagued by a GPE that we just can't figure out the cause of.

In our tracing we found that the GPE occurs during an IRETQ from our page fault handler. We first found the page fault to happen during a SYSCALL (in our attempt to get syscall/sysret working), but after padding our function out with some NOPs we found that the page fault is just occurring at that particular memory address. We're guessing that the TLB gets flushed during our SYSRET, and then on the subsequent SYSCALL we're just faulting because the TLB has been flushed.

The page fault occurs in supervisor mode, on a read of a page that is not present.

The GPE gives us error code 24 which the AMD doc tells us should be the gate descriptor being accessed during the GPE. We checked our GDT and everything appears correct on that end too.
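For reference, a selector-format error code like the one reported here can be split into its fields by hand. The sketch below is only an illustration (the struct and function names are made up for this example, not taken from the project); it assumes the standard x86 layout of EXT in bit 0, IDT in bit 1, TI in bit 2 and the descriptor index in bits 3 and up.

```c
#include <stdint.h>

/* Fields of a selector-format exception error code (#GP, #TS, #NP, #SS).
 * Hypothetical helper type for illustration only. */
struct selector_error {
    unsigned ext;    /* bit 0: event was external to the program */
    unsigned idt;    /* bit 1: index refers to an IDT gate */
    unsigned ti;     /* bit 2: 0 = GDT, 1 = LDT (meaningful when idt == 0) */
    unsigned index;  /* bits 3..15: descriptor index within that table */
};

/* Split a raw error code into its fields. */
static struct selector_error decode_selector_error(uint32_t code)
{
    struct selector_error e;
    e.ext   = code & 1u;
    e.idt   = (code >> 1) & 1u;
    e.ti    = (code >> 2) & 1u;
    e.index = (code >> 3) & 0x1FFFu;
    return e;
}
```

Under that layout, 24 (0x18) decodes to index 3 with the IDT and TI bits clear, which would point at the fourth GDT entry (selector 0x18) rather than at an IDT gate.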

Any suggestions?

Thanks!

01000101 · Post by **01000101** » Wed Feb 13, 2008 2:33 pm

Do you test with Bochs? If so, you might want to look at the debugging output.

untwisted · Post by **untwisted** » Wed Feb 13, 2008 2:46 pm

Yah, we use Bochs, but the debug output is fairly useless right now. We really need to get our kgdb stub finished so we can start tracing better.

Combuster · Post by **Combuster** » Wed Feb 13, 2008 4:29 pm

Are you aware of the fact that when a page fault is raised, an error code is pushed onto the stack? If you have left it there, the iret will read a corrupt stack and cause another exception (i.e. the GPF).
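Which vectors push an error code can be written down as a small lookup. This is a sketch with a hypothetical function name, covering only the classic set of exceptions; newer CPUs define a few additional error-code vectors that are not listed here.

```c
#include <stdbool.h>

/* Returns true if the CPU pushes an error code for this exception vector.
 * Covers the classic set only; hypothetical helper for illustration. */
static bool vector_pushes_error_code(unsigned vector)
{
    switch (vector) {
    case 8:   /* #DF double fault */
    case 10:  /* #TS invalid TSS */
    case 11:  /* #NP segment not present */
    case 12:  /* #SS stack fault */
    case 13:  /* #GP general protection */
    case 14:  /* #PF page fault */
    case 17:  /* #AC alignment check */
        return true;
    default:
        return false;
    }
}
```

A common design is for the stubs of all other vectors to push a dummy 0, so that a shared handler always sees the same stack layout.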

untwisted · Post by **untwisted** » Wed Feb 13, 2008 8:13 pm

I wasn't aware, I will make some code changes immediately and let you know what happens. Thanks so much!

untwisted · Post by **untwisted** » Thu Feb 14, 2008 9:52 am

I'm sorry, I amend my previous statement. I misread and thought there was a second error code being pushed on the stack. We already account for the error code, and handle it (properly I assume). We have a page fault earlier during execution where we recover just fine and continue execution.

AJ · Post by **AJ** » Thu Feb 14, 2008 10:41 am

Hi,

Back to Bochs: You don't need the debugger to get a register dump when you close the Bochs main window. Often on a GPF, you get quite a useful message just above the register dump (something like 'Invalid code segment descriptor' or whatever), which will at least help narrow down why Bochs throws the exception.

I must admit, it still sounds very much like a stack alignment problem at the moment.

Cheers,
Adam

untwisted · Post by **untwisted** » Thu Feb 14, 2008 10:58 am

Aha! You're quite right, thanks for that handy tip. I now know: AR byte not writable code segment. I guess you're right, though shouldn't our first page fault have GPF'd as well?

How do I go about making sure the stack is correctly aligned? I tried to add the size of the error code struct to the stack pointer, but that didn't seem to work... Sorry if I'm being dense here

AJ · Post by **AJ** » Thu Feb 14, 2008 11:10 am

Hi,

As for why the first PFE was handled - I'm not sure, but some things to check:

You have an error code and you possibly push the int # as well. You are in long mode as you are using IRETQ - are you really adding 8 to RSP for each item you push in your ISR preamble (possibly adding 16 to RSP in total)?

Also, look through your ISR ASM stub line by line - do you really have a POP for every PUSH?

Is there any conditional in your ISR which happens sometimes but not all the time? If so, does the conditional branch perform extra stack manipulation for any reason?

I'm not trying to be patronising - I'm going through this because the same things have caused me problems before

Cheers,
Adam

untwisted · Post by **untwisted** » Thu Feb 14, 2008 1:26 pm

I don't find you patronizing at all, I appreciate the help

Here is our ISR code:

extern(C) void isr_common()
{
	asm
	{
		naked;
		"pushq %%rax";
		"pushq %%rbx";
		"pushq %%rcx";
		"pushq %%rdx";
		"pushq %%rsi";
		"pushq %%rdi";
		"pushq %%rbp";
		"pushq %%r8";
		"pushq %%r9";
		"pushq %%r10";
		"pushq %%r11";
		"pushq %%r12";
		"pushq %%r13";
		"pushq %%r14";
		"pushq %%r15";

		// we don't have to push %rsp, %rip and flags; they are pushed
		// automatically on an interrupt

		"mov %%rsp, %%rdi";
		"call fault_handler";

		"popq %%r15";
		"popq %%r14";
		"popq %%r13";
		"popq %%r12";
		"popq %%r11";
		"popq %%r10";
		"popq %%r9";
		"popq %%r8";
		"popq %%rbp";
		"popq %%rdi";
		"popq %%rsi";
		"popq %%rdx";
		"popq %%rcx";
		"popq %%rbx";
		"popq %%rax";
		
		// A haiku
		// We need to print a stack trace
		// I hate a-s-m
		// This is a job for Jarrett

		// Cleans up the pushed error code and pushed ISR num
		"add $16, %%rsp";

		"iretq";         /* pops 5 things in order: rIP, CS, rFLAGS, rSP, and SS */
	}
}

Maybe I'm missing something in there, but I believe it's all correct. I re-read the AMD docs to try to confirm this, but I'll be entirely honest and tell you that I often find the docs to be lacking or confusing.
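One way to double-check the push/pop bookkeeping is to describe the stack that `isr_common` hands to `fault_handler` as a C struct and let the compiler verify the offsets. This is a sketch, not the project's actual definition: the struct name is hypothetical, and it assumes a per-vector stub pushes the ISR number after the error code (with a dummy error code for vectors that have none).

```c
#include <stddef.h>
#include <stdint.h>

/* Stack contents seen through %rdi in fault_handler, lowest address first.
 * Hypothetical layout derived from the push order in isr_common. */
struct isr_frame {
    /* Saved by isr_common; the last register pushed sits at the lowest address. */
    uint64_t r15, r14, r13, r12, r11, r10, r9, r8;
    uint64_t rbp, rdi, rsi, rdx, rcx, rbx, rax;
    /* Assumed: pushed by a per-vector stub before it jumps to isr_common. */
    uint64_t int_no;
    /* Pushed by the CPU for some vectors; assumed to be a dummy 0 otherwise. */
    uint64_t err_code;
    /* Pushed by the CPU; this is exactly what iretq pops. */
    uint64_t rip, cs, rflags, rsp, ss;
};

/* After 15 pops, "add $16, %rsp" must leave the stack pointer on rip. */
_Static_assert(offsetof(struct isr_frame, int_no) == 15 * 8, "15 saved registers");
_Static_assert(offsetof(struct isr_frame, rip) == 15 * 8 + 16, "add $16 lands on rip");
_Static_assert(sizeof(struct isr_frame) == 22 * 8, "22 quadwords in total");
```

If any stub pushes one item more or fewer than this layout assumes, `iretq` ends up reading the wrong quadword as CS, which is one plausible route to a segment-related GPF.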

My only other thought is that we possibly set up our TSS incorrectly, which from what I can glean from the doc could be causing this?

untwisted · Post by **untwisted** » Thu Feb 14, 2008 5:00 pm

Well, after some thought and some quick work I've found out a bit more about the page fault. I originally thought it was occurring during a SYSCALL, but I was wrong, it's at a SYSRET instead. Our code tries to go into an infinite syscall/sysret loop just for testing (we have nothing beyond that). I noticed after some moving of code and padding things out with nops that the first sysret is called and works. The following syscall is called and works. We are then dying on the 3rd sysret.

Does the TLB get flushed during the jump from user to supervisor mode or vice versa? If this is the case, then it would make sense that the page fault occurs, but why would we be able to exit one page fault (and continue executing code) but then not the next?

I'm going through the Intel docs now since the AMD are fairly worthless in this area. *crosses fingers*

Thanks for the help everyone, so far this site has proven to be a good resource

Combuster · Post by **Combuster** » Thu Feb 14, 2008 6:32 pm

untwisted wrote: Does the TLB get flushed during the jump from user to supervisor mode or vice versa? If this is the case, then it would make sense that the page fault occurs, but why would we be able to exit one page fault (and continue executing code) but then not the next?

The TLB caches the page tables. If a page is accessed and it's not in the TLB, then the processor walks through the page tables and fills it. When the TLB is full, an entry is removed from the TLB to make room for another. In real life you won't be bothered much by the TLB as it tries to match whatever is in your pagetables.

Things, however, go awry when you change the page tables while a copy of that entry is in the TLB. The TLB will hold the original value while the OS expects the new value to be used. That is the only point when the OS must intervene by manually flushing some TLB entries, or if necessary, the entire TLB.

To optimize for speed, the processor only flushes TLB entries when it needs to - that is, when (the processor thinks) it needs to make room for a new entry, or upon a context switch where an entire new set of paging structures is to be used. An inter-privilege jump does not mean that TLB entries become invalid, so if one entry is flushed at that point it's because of the first reason.
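The stale-entry behaviour can be illustrated with a toy software model. This is only a sketch with made-up names and a direct-mapped "TLB"; real hardware is more involved, and `invalidate` merely stands in for what INVLPG does for a single page.

```c
#include <stdbool.h>
#include <stdint.h>

#define NPAGES 8

static uint64_t page_table[NPAGES];  /* virtual page -> physical frame */
static uint64_t tlb_frame[NPAGES];   /* cached copies of page_table entries */
static bool     tlb_valid[NPAGES];   /* which cached copies are usable */

/* Translate through the toy TLB, walking the page table only on a miss. */
static uint64_t translate(unsigned vpage)
{
    if (!tlb_valid[vpage]) {
        tlb_frame[vpage] = page_table[vpage];
        tlb_valid[vpage] = true;
    }
    return tlb_frame[vpage];
}

/* Toy stand-in for invalidating a single entry. */
static void invalidate(unsigned vpage)
{
    tlb_valid[vpage] = false;
}
```

In this model, changing `page_table[1]` after a first `translate(1)` keeps returning the old frame until `invalidate(1)` is called, which mirrors the situation where the OS edits a page table entry without telling the processor.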

So if the TLB is causing problems, you must have been altering paging structures without letting the processor know, which is a Bad Thing. Otherwise, the error can be determined directly from the page tables where it has too few privileges, or is not mapped at all. In fact, the TLB will only cause a lack of page faults when you'd expect one. I suggest you check the address and the error code of the page fault. Unless you already swap out pages to disk, you should not be getting pagefaults anyway.
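Checking the page fault error code amounts to testing a few bits. The sketch below uses hypothetical names and decodes the architecturally defined low bits; the faulting address itself comes from CR2, which is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded page fault (#PF) error code. Hypothetical helper type. */
struct pf_info {
    bool protection;  /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;       /* bit 1: 1 = write access, 0 = read access */
    bool user;        /* bit 2: 1 = fault taken in user mode, 0 = supervisor */
    bool reserved;    /* bit 3: reserved bit set in a paging structure */
    bool fetch;       /* bit 4: fault caused by an instruction fetch */
};

/* Split a raw #PF error code into its flags. */
static struct pf_info decode_pf_error(uint64_t code)
{
    struct pf_info info;
    info.protection = (code & 0x01) != 0;
    info.write      = (code & 0x02) != 0;
    info.user       = (code & 0x04) != 0;
    info.reserved   = (code & 0x08) != 0;
    info.fetch      = (code & 0x10) != 0;
    return info;
}
```

With this layout, a fault on a read of a non-present page in supervisor mode, as reported earlier in the thread, corresponds to an error code of 0, and the same fault on a write corresponds to 2.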

untwisted · Post by **untwisted** » Thu Feb 14, 2008 6:54 pm

Well then I have no idea what could be causing this (deadly) page fault. I checked the error code. It's due to a page not present on a page read in supervisor mode.

The benign page fault occurs on a page write due to a page not present in supervisor mode, but we assumed it to be due to us writing to the video buffer.

So I guess the next question is where do we go from here?

After even more testing I'm now 90% sure that it is the SYSCALL as I had originally said. We must be setting something up incorrectly in regards to syscall/sysret I guess.

Bah!

Combuster · Post by **Combuster** » Thu Feb 14, 2008 7:01 pm

untwisted wrote: It seems as though this chunk of code is the offending one:

C'mon. You can get the location of the offending instruction. It's not THAT hard to be sure.

untwisted · Post by **untwisted** » Thu Feb 14, 2008 7:06 pm

Well, the first line of that code chunk is the offending line, but it doesn't matter WHAT that line is (we originally thought it was a SYSCALL, then someone moved code around and it appeared to be a SYSRET. I padded the function with some NOPs and noticed it's just that particular address).

After more testing it HAS to be a problem with the way we're trying to get syscall / sysret working. I *assumed* that we were making it back to supervisor mode because that's where the page fault is happening, but I'm not so sure anymore. Basically we're in limbo between our SYSCALL / syscall handler.