Triple fault after enabling paging

Posted: Tue Apr 04, 2017 1:15 pm
by 0xBADC0DE
I've been following the higher half kernel tutorial. Everything works fine, except for the part where paging is enabled. This is the specific code that causes a triple fault:

Code:

mov ecx, cr0
or  ecx, 0x80000000
mov cr0, ecx
I'm confused about what to set ECX to, because in the higher half kernel x86 bare bones tutorial (which is what I have followed), the code

Code:

or  ecx, 0x80000000
is used, but in the http://wiki.osdev.org/Paging#Enabling article it says to use the code

Code:

or eax, 0x00000010
The paging article says that a GPF will be generated if the PG bit is set and the PE bit is cleared. I have tried both

Code:

0x80000000
and

Code:

0x80000001
and I am still getting a triple fault.

My linker script is the same as in the http://wiki.osdev.org/Higher_Half_x86_Bare_Bones tutorial.

I have a working GDT and IDT, but I'm not sure if there's maybe a page fault that occurs and my code isn't handling it correctly and causing a triple fault.

I am using GRUB legacy and qemu. Running qemu with the -d option set to cpu_reset produces the following output:

Code:

Triple fault
CPU Reset (CPU 0)
EAX=000e8e68 EBX=00033b40 ECX=80000011 EDX=00000000
ESI=0005a7d3 EDI=0005a7cd EBP=00067e7c ESP=00067e5c
EIP=0010023c EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     000090fc 00000027
IDT=     00000000 000003ff
CR0=80000011 CR2=c0104000 CR3=00104000 CR4=00000010
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
DR6=ffff0ff0 DR7=00000400
CCS=00070120 CCD=000e8e68 CCO=LOGICL  
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
There are a few CPU resets, but the above is the only one with a triple fault.

As you can see, the PSE bit has been set in CR4 and the page directory has been loaded at a physical address just above 1MB, as shown in CR3. The paging bit has been set in CR0.

If there is any other information/code that is needed, let me know and I will add it.

Re: Triple fault after enabling paging

Posted: Tue Apr 04, 2017 2:03 pm
by Ch4ozz
You're mixing up a lot of stuff, re-read the wiki article again.
0x10 is there to enable 4MiB pages; it does not activate paging itself, plus it's set in cr4 and not cr0.
Also have a good read about PSE and PAE here: https://en.wikipedia.org/wiki/Page_Size_Extension
TL;DR: You don't need it, don't set cr4 to 0x10
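
To make the CR0/CR4 distinction concrete, here is a minimal C sketch that decodes the control register values from the register dump in the first post. The macro and function names are made up for illustration; only the bit positions (CR0.PE = bit 0, CR0.ET = bit 4, CR0.PG = bit 31, CR4.PSE = bit 4) are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions; the names are this sketch's own. */
#define CR0_PE  (1u << 0)    /* protected mode enable */
#define CR0_ET  (1u << 4)    /* extension type, reads as 1 on modern CPUs */
#define CR0_PG  (1u << 31)   /* paging enable */
#define CR4_PSE (1u << 4)    /* page size extension: allows 4 MiB pages */

/* Paging is active only when both PE and PG are set in CR0. */
static bool paging_enabled(uint32_t cr0)
{
    return (cr0 & (CR0_PE | CR0_PG)) == (CR0_PE | CR0_PG);
}

/* 4 MiB pages are only honoured when CR4.PSE is set. */
static bool large_pages_allowed(uint32_t cr4)
{
    return (cr4 & CR4_PSE) != 0;
}
```

With CR0=80000011 from the dump, paging_enabled() returns true: PE and PG are both set, and the 0x10 visible in CR0 is the ET bit, not PSE. The PSE bit is the 0x10 in CR4=00000010.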

Now talking about qemu crash dumps.
I hope you know what the EIP register is.
Now with that info, you load your kernel binary in a disassembler (IDA Pro, onlinedisassembler, .....) and take a look at the address:
EIP=0010023c
This is the address where the code halted.

Now take a look at cr2: that is the address which triggered a page fault. If handling that page fault faults again (for example because the interrupt handler touches the same unmapped address), you get a double fault and then a triple fault.
CR2=c0104000
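
To see which page directory entry that address depends on, the faulting address can be split by hand. This is a small C sketch, assuming plain 32-bit paging without PAE; the function names are hypothetical.

```c
#include <stdint.h>

/* Bits 31..22 of a linear address select the page directory entry. */
static uint32_t pd_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* For a 4 MiB page, the remaining 22 bits are the offset inside the page. */
static uint32_t offset_in_4m_page(uint32_t vaddr)
{
    return vaddr & 0x003FFFFFu;
}
```

For CR2=c0104000 this gives directory index 768 with offset 0x104000, so the access relies on entry 768 of the page directory being present. EIP=0010023c, by contrast, falls under entry 0.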

Now read up more on paging to understand every small detail about it.
It's one of the most needed and most used functions of an OS.

Re: Triple fault after enabling paging

Posted: Tue Apr 04, 2017 2:50 pm
by 0xBADC0DE
I understand the part about setting 4MB pages using CR4 and using CR0 to enable paging. Paging is still a relatively new concept to me, so I will have to read up some more on it.

I opened up the binary in IDA Pro and EIP points to HigherHalf, where the following code is:

Code:

mov dword [BootPageDirectory], 0
invlpg [0]

mov esp, (stack+STACK_SIZE)
push eax
push ebx

call k_main
hlt
and CR2 points to the label BootPageDirectory, which has the code

Code:

dd 0x00000083
times (KERNEL_PAGE_NUMBER - 1) dd 0
; this PD entry defines a 4MB page containing the kernel
times (1024 - KERNEL_PAGE_NUMBER - 1) dd 0	; number of pages after the kernel image ??
HigherHalf has a hlt instruction, so I don't think that the problem is in HigherHalf. As for BootPageDirectory, I don't know why that would be causing a page fault. Could it be an invalid page directory, or the page directory not being set up correctly?
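
For reference, the value 0x00000083 in that listing can be decoded bit by bit. The sketch below assumes 32-bit paging without PAE and ignores the PAT and PSE-36 bits; the macro and function names are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

#define PDE_PRESENT  (1u << 0)   /* entry is valid */
#define PDE_WRITABLE (1u << 1)   /* writes are allowed */
#define PDE_PS       (1u << 7)   /* entry maps a 4 MiB page directly */

/* True if the entry is present and maps a 4 MiB page. */
static bool pde_is_4m_page(uint32_t pde)
{
    return (pde & (PDE_PRESENT | PDE_PS)) == (PDE_PRESENT | PDE_PS);
}

/* Physical base of the 4 MiB frame (bits 31..22 of the entry). */
static uint32_t pde_4m_frame(uint32_t pde)
{
    return pde & 0xFFC00000u;
}
```

0x83 is present, writable, and has the PS bit set, so it maps the 4 MiB frame starting at physical address 0. The PS bit is only honoured when CR4.PSE is set; an entry of 0 is simply not present.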

Thanks

Re: Triple fault after enabling paging

Posted: Tue Apr 04, 2017 2:54 pm
by Ch4ozz

Code:

mov dword [BootPageDirectory], 0
This is the exact location which caused the triple fault.

Furthermore, we know that cr2 is BootPageDirectory.

So it's obvious that you didn't map the memory around BootPageDirectory correctly.
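
One way to sanity-check a boot page directory is to walk it in software on the host. The following C sketch assumes a directory made only of 4 MiB pages, and assumes KERNEL_PAGE_NUMBER is 768 (0xC0000000 >> 22). That value is consistent with CR2=c0104000 and CR3=00104000 in the dump, but it is not shown anywhere in the thread. All helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PD_ENTRIES         1024u
#define KERNEL_PAGE_NUMBER 768u          /* assumption: 0xC0000000 >> 22 */
#define PDE_4M_RW_PRESENT  0x00000083u   /* present | writable | 4 MiB page */
#define NO_MAPPING         0xFFFFFFFFu   /* sentinel used by this sketch only */

/* Software walk of a page directory that contains only 4 MiB pages. */
static uint32_t translate_4m(const uint32_t *pd, uint32_t vaddr)
{
    uint32_t pde = pd[vaddr >> 22];
    if ((pde & 0x81u) != 0x81u)          /* not present, or not a 4 MiB page */
        return NO_MAPPING;               /* the CPU would page fault here */
    return (pde & 0xFFC00000u) | (vaddr & 0x003FFFFFu);
}

/* Build a boot directory with an identity entry and, optionally, a
 * higher-half entry for the same frame, then translate one address. */
static uint32_t translate_with_boot_pd(uint32_t vaddr, bool map_higher_half)
{
    uint32_t pd[PD_ENTRIES] = {0};
    pd[0] = PDE_4M_RW_PRESENT;
    if (map_higher_half)
        pd[KERNEL_PAGE_NUMBER] = PDE_4M_RW_PRESENT;
    return translate_4m(pd, vaddr);
}
```

With the higher-half entry present, c0104000 resolves to physical 00104000. Without it the lookup fails, which is the situation in which the CPU would report a page fault with CR2=c0104000.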

Re: Triple fault after enabling paging

Posted: Thu Apr 06, 2017 9:47 am
by 0xBADC0DE
OK, I managed to fix the problem by following the Setting up paging tutorial, but now I have another problem with my higher half kernel.

Code:

lea ecx, [HigherHalf]
jmp ecx
The above code is not working properly. When I try to load the address of HigherHalf into ecx and then jump to it, the operating system just crashes (not even a triple fault).

Re: Triple fault after enabling paging

Posted: Thu Apr 13, 2017 2:58 am
by osdever
<offtopic>
Hey, I am osdever!
</offtopic>