Spurious Page Fault

Posted: Thu May 05, 2011 10:32 pm
by ghartshaw
I'm writing a shared_ptr class for my C++ protected mode kernel, but the kernel randomly page faults (cr2=0x98c30008 error_code=0) when I add a variable to the class and I have no clue where to even start looking for the underlying problem. Does anyone have any ideas?

Code:

EAX=0000001e EBX=0000000e ECX=000b8000 EDX=00000004
ESI=00000002 EDI=001298ad EBP=00000000 ESP=00107d94
EIP=00100b3e EFL=00000016 [----AP-] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     00108080 00000028
IDT=     00108500 00000800
CR0=80000011 CR2=98c30008 CR3=00128000 CR4=00000000
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
DR6=ffff0ff0 DR7=00000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000

Re: Spurious Page Fault

Posted: Fri May 06, 2011 1:02 am
by Brendan
Hi,

Do you mean "spurious", or "unexpected"?

If it's not spurious (e.g. you can find the reason by examining the corresponding page table entry, page directory entry, etc) then there's a bug in your code. The error code says the page fault was caused by a "not present" page, so either you didn't map a page when you should have, or you tried to access something that you shouldn't have.
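To make that check concrete, here is a minimal sketch that decodes a page-fault error code and splits a faulting address into the indices you would use to find its page directory entry and page table entry. It assumes plain 32-bit paging with 4 KiB pages and no PAE (which matches CR4=00000000 in the dump above). The names `decode_error_code` and `split_linear_address` are made up for illustration and are not from the original kernel.

```cpp
#include <cstdint>

// Decoded view of the low bits of an x86 page-fault error code.
struct PageFaultInfo {
    bool protection_violation; // bit 0: 0 = page not present, 1 = protection violation
    bool write;                // bit 1: 0 = read access, 1 = write access
    bool user_mode;            // bit 2: 0 = supervisor mode, 1 = user mode
};

constexpr PageFaultInfo decode_error_code(std::uint32_t error_code) {
    return PageFaultInfo{
        (error_code & 0x1u) != 0,
        (error_code & 0x2u) != 0,
        (error_code & 0x4u) != 0,
    };
}

// Where a linear address lands in the two-level page tables
// (assumption: 32-bit paging, 4 KiB pages, no PAE).
struct PageIndices {
    std::uint32_t directory_index; // bits 31..22: entry in the page directory
    std::uint32_t table_index;     // bits 21..12: entry in the page table
    std::uint32_t offset;          // bits 11..0:  byte offset within the page
};

constexpr PageIndices split_linear_address(std::uint32_t address) {
    return PageIndices{
        address >> 22,
        (address >> 12) & 0x3FFu,
        address & 0xFFFu,
    };
}
```

For the values in this thread, error_code=0 decodes to a supervisor-mode read of a not-present page, and cr2=0x98c30008 splits into page directory index 0x263, page table index 0x30 and offset 0x8. Those are the two entries to inspect when deciding whether the fault has a real cause.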

If it is spurious (e.g. occurs for no apparent reason) then your kernel failed to invalidate TLBs properly.
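For reference, a rough sketch of what TLB invalidation usually looks like on x86 with GCC-style inline assembly. This is not the original poster's code: the helper names are hypothetical, 4 KiB pages are assumed, and both privileged functions only work at CPL 0 (in the kernel). `invlpg` drops the TLB entry for a single page, and reloading CR3 flushes all non-global entries.

```cpp
#include <cstdint>

constexpr std::uintptr_t kPageSize = 4096; // assumption: 4 KiB pages

// Round an address down to the start of the page that contains it.
constexpr std::uintptr_t page_base(std::uintptr_t address) {
    return address & ~(kPageSize - 1);
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
// Drop the TLB entry for the page containing `address`.
// Privileged instruction: call it after changing that page's PTE/PDE.
inline void invalidate_page(std::uintptr_t address) {
    asm volatile("invlpg (%0)" : : "r"(address) : "memory");
}

// Flush all non-global TLB entries by writing CR3 back to itself.
// Heavier than invlpg, but simple when many mappings changed at once.
inline void flush_tlb() {
    std::uintptr_t cr3;
    asm volatile("mov %%cr3, %0" : "=r"(cr3));
    asm volatile("mov %0, %%cr3" : : "r"(cr3) : "memory");
}
#endif
```

The usual pattern is to call something like `invalidate_page` every time an existing mapping is changed or removed, so the CPU cannot keep using a stale translation for that page.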

The first step would be to determine which of these is happening.


Cheers,

Brendan

Re: Spurious Page Fault

Posted: Fri May 06, 2011 11:58 am
by ghartshaw
It's really spurious (and I realized that I wasn't invalidating the TLB). Unfortunately, when I added that code in, QEMU stopped booting my kernel (hangs at main GRUB screen, see my other thread).

Re: Spurious Page Fault

Posted: Fri May 06, 2011 5:16 pm
by Yargh
Is it actually hanging, or is the screen not updating? Because, from many other threads I have seen, if you do a

Code:

hlt
then QEMU will not update the screen anymore.

Also, I had the same exact problem where it hung at the GRUB screen, which was resolved by switching to syslinux which loaded the same code correctly.

Re: Spurious Page Fault

Posted: Fri May 06, 2011 10:41 pm
by ghartshaw
Turns out it triple faulted, and that's what QEMU displayed on the screen when it reset (but it's not what was last printed by my OS, strange). Now I just need to track down why it's triple faulting (I thought I had my IDT installed correctly, guess not).