Paging Problem on some HW: triple fault on enabling
Posted: Wed Apr 25, 2018 3:22 am
Hi everyone, I'm an Italian student (so excuse me if my English is poor). I'm rewriting my very basic OS and trying to improve it a lot, and I decided to implement x86_64 for the first time.
I'm currently working on the bootloader, which I'm writing in assembly. My environment consists of NASM, Bochs, and other common tools such as a hex editor, running on Linux Mint 18.3 'Sylvia' on an HP P6 Pavilion (2011) laptop. Everything was working very well: I was able to enter long mode, and I identity-mapped the first 16 GB of RAM using PSE (to give myself access to the whole memory until I write a proper memory manager). I then kept programming and started setting up the rest of my OS (IRQ handlers, exception handlers, my system call, basic I/O), and at that point I decided to test it on real hardware (the same laptop I'm working on), booting from a USB stick seen as a hard disk. It triple-faulted. I tried a different laptop of the same series, with the same result. Then I tried different PCs, and everything worked fine.
I started narrowing down which instruction was causing trouble, and I found that it was the MOV CR0,EAX that enables paging. So I started debugging my paging setup and rewrote it as a very simple identity mapping of the first 2 MB, without PSE, as seen in tutorials, but the problem persists. I tried a lot of small changes, even flipping the page global enable (PGE) bit, but I can't find the problem. I also cannot catch and handle the exception, as my handlers never seem to get called (to be sure of this I put hangs in the exception handlers) and the CPU resets immediately, as it would if I set a CR3 that is not page-aligned.
This is driving me mad, and I'm running out of ideas on how to debug this, so I decided to ask for help.
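One way to look inside the tables without a working exception handler is to validate them in software right before the MOV CR0,EAX. The following is only a sketch under stated assumptions: CHECK_TBL is a hypothetical name, the tables are assumed to map only memory below 4 GiB and not to use the NX bit (so the high dword of every entry is expected to be zero), and DS is assumed to be a flat segment.

```nasm
BITS 32
; Hypothetical helper, not part of the original bootloader.
; In:  ESI = address of the first 8-byte entry, ECX = number of entries (non-zero)
; Out: CF clear if every high dword is zero, CF set at the first one that is not
;      (ESI then points at the offending entry)
CHECK_TBL:  CMP DWORD [ESI+4],0     ;High dword of the current entry
            JNE .BAD
            ADD ESI,8               ;Next 8-byte entry
            LOOP CHECK_TBL
            CLC                     ;All entries passed
            RET
.BAD:       STC                     ;Unexpected bits in the high dword
            RET
```

For the four tables used in this post it could be called with ESI = 2000h and ECX = 4*512, hanging or printing a debug message when CF comes back set. Comparing the result on Bochs and on the failing laptop would show whether the tables really contain the same data on both.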
I know that my original paging setup code (the first listing below, now disabled) is dodgy; it was only a temporary solution, but it worked. The second listing is what I'm currently doing.
It crashes on the marked instruction (MOV CR0,EAX) in both listings. I should say that I have exception handlers for both 32 and 64 bits, and they are tested; that the OS checks that the CPU is capable of everything it uses (long mode, PAE, PSE, extended CPUID, etc.); that I'm sure the problem is that instruction, because I put a hang before it and then after it; and that the PCs I'm testing this on are theoretically capable of doing these things. I really don't understand why this problem exists only on some HP laptops. I really hope that you guys can help me sort it out. Any kind of suggestion is welcome.
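For reference, the kind of capability check mentioned above can be sketched as follows. This is not the bootloader's actual routine: CHECK_LM is a hypothetical name, only long mode and PAE are tested, and the CPUID instruction itself is assumed to be available.

```nasm
BITS 32
; Hypothetical helper. Clobbers EAX, EBX, ECX, EDX.
; Out: CF clear if long mode and PAE are reported, CF set otherwise.
CHECK_LM:   MOV EAX,80000000h       ;Ask for the highest extended CPUID leaf
            CPUID
            CMP EAX,80000001h
            JB  .NO_LM              ;Leaf 80000001h is not implemented
            MOV EAX,80000001h
            CPUID
            BT  EDX,29              ;EDX bit 29 = long mode available
            JNC .NO_LM
            MOV EAX,1
            CPUID
            BT  EDX,6               ;EDX bit 6 = PAE available
            JNC .NO_LM
            CLC
            RET
.NO_LM:     STC
            RET
```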
Here's my original paging setup code (now disabled):
TABLES CREATION:
; MOV EDI,2000h ;Setup PAGINING
; MOV CR3,EDI
; MOV ECX,4000h
; XOR EAX,EAX
;PGT_CLR_LP: MOV [EDI],AL ;Cleaning pages i'm going to use
; LOOP PGT_CLR_LP
; MOV EDI,2000h
; MOV DWORD [EDI],3003h ;Setting PDPT0 in PML4T[0]
; ADD EDI,1000h
; MOV DWORD [EDI],4003h ;Setting PDT0 in PDPT0[0]
;ADD EDI,08h
;MOV DWORD [EDI],5003h ;Setting PDT1 in PDPT0[1]
;ADD EDI,08h
;MOV DWORD [EDI],6003h ;Setting PDT2 in PDPT0[2]
;ADD EDI,08h
;MOV DWORD [EDI],101003h ;Setting PDT3 in PDPT0[3]
;ADD EDI,08h
;MOV DWORD [EDI],102003h ;Setting PDT4 in PDPT0[4]
;ADD EDI,08h
;MOV DWORD [EDI],103003h ;Setting PDT5 in PDPT0[5]
;ADD EDI,08h
;MOV DWORD [EDI],104003h ;Setting PDT6 in PDPT0[6]
;ADD EDI,08h
;MOV DWORD [EDI],105003h ;Setting PDT7 in PDPT0[7]
;ADD EDI,08h
;MOV DWORD [EDI],106003h ;Setting PDT8 in PDPT0[8]
;ADD EDI,08h
;MOV DWORD [EDI],107003h ;Setting PDT9 in PDPT0[9]
;ADD EDI,08h
;MOV DWORD [EDI],108003h ;Setting PDTA in PDPT0[A]
;ADD EDI,08h
;MOV DWORD [EDI],109003h ;Setting PDTB in PDPT0[B]
;ADD EDI,08h
;MOV DWORD [EDI],10A003h ;Setting PDTC in PDPT0[C]
;ADD EDI,08h
;MOV DWORD [EDI],10B003h ;Setting PDTD in PDPT0[D]
;ADD EDI,08h
;MOV DWORD [EDI],10C003h ;Setting PDTE in PDPT0[E]
;ADD EDI,08h
;MOV DWORD [EDI],10D003h ;Setting PDTF in PDPT0[F]
; MOV EDI,4000h
; MOV EBX,00000083h
; MOV ECX,00000600h
; PUSH DBG32_4
; CALL DEBUG
;PGT_IDENTITY: MOV [EDI],EBX ;Setting up tables to identity map
; ADD EBX,200000h
; ADD EDI,0008h
; LOOP PGT_IDENTITY;
; MOV EDI,101000h
; MOV EBX,0C0000083h
; MOV ECX,1A00h
; MOV EDX,00h
;PGT_ID2: MOV [EDI],EBX
; ADD EDI,04h
; MOV [EDI],EDX
; ADD EBX,200000h
; JO PGT_IDO
;PGT_ID2C: ADD EDI,0004h
; LOOP PGT_ID2
; JMP LONGYS
;PGT_IDO: ADD EDX,01h
; MOV EBX,00h
; JMP PGT_ID2C
SWITCHING TO LONG MODE:
LONGYS: PUSH DBG32_5
CALL DEBUG
PUSH DBG32_6
CALL DEBUG
MOV EAX,CR4 ;Switch to LONG MODE
OR EAX,110000b ;Setting PAE (bit 5) and PSE (bit 4)
MOV CR4,EAX
PUSH DBG32_7
CALL DEBUG
MOV ECX,0C0000080h ;Asking for EFER MSR (0xC0000080h)
RDMSR
OR EAX,100000000b ;Setting LM-bit (bit 8).
WRMSR
PUSH DBG32_8
CALL DEBUG
PUSH DBG32_9
CALL DEBUG
MOV EAX,CR0
OR EAX,80000001h ;Setting Paging(bit31)
MOV CR0,EAX ;<--- CRASHES HERE
LGDT [GDT64] ;LOADING GDT
LIDT [IDT64] ;Loading IDT
JMP 08h:LONG_MODE ;Jumping to a 64 bit segment
JMP 64_ERR
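For comparison, a 16 GiB identity map with 2 MiB pages can be written as a single loop that carries into the high dword of each entry. This is a minimal sketch and not the code that was running: PDPT_BASE reuses the 3000h address from the listing above, while PD_BASE, MAP_16G and the assumption that all 16 page directories are contiguous are hypothetical. Note that when ADD wraps past 4 GiB it sets the carry flag (the overflow flag only signals signed overflow), which is why ADC is used here.

```nasm
BITS 32
PDPT_BASE   EQU 3000h               ;PDPT0, as in the listing above
PD_BASE     EQU 100000h             ;Hypothetical: 16 contiguous, page-aligned PDs
PD_COUNT    EQU 16                  ;16 PDs * 512 entries * 2 MiB = 16 GiB

MAP_16G:    MOV EDI,PDPT_BASE       ;Link the 16 PDs into PDPT0[0..15]
            MOV EAX,PD_BASE+3       ;Present + writable
            MOV ECX,PD_COUNT
.LINK:      MOV [EDI],EAX
            MOV DWORD [EDI+4],0     ;High dword written explicitly
            ADD EAX,1000h           ;Next PD
            ADD EDI,8
            LOOP .LINK

            MOV EDI,PD_BASE         ;Fill all PD entries in one pass
            MOV EAX,00000083h       ;Low dword: present, writable, PS (2 MiB page)
            XOR EDX,EDX             ;High dword: physical address bits 32 and up
            MOV ECX,PD_COUNT*512
.FILL:      MOV [EDI],EAX           ;Both halves of the 64-bit entry
            MOV [EDI+4],EDX
            ADD EAX,200000h         ;Next 2 MiB frame
            ADC EDX,0               ;Carry out of bit 31 goes into the high dword
            ADD EDI,8
            LOOP .FILL
            RET
```

The flag bits (83h) stay in the low dword for the whole loop, because adding a multiple of 200000h never touches bits 0 to 20.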
This is how I'm currently doing it:
Setting up tables:
MOV EDI,2000h ;PML4 at 0x2000
MOV CR3,EDI ;Setting CR3
XOR EAX,EAX
MOV ECX,4000h
CLR_PAG: MOV [EDI],AL ;Wiping pages 2000h,3000h,4000h,5000h
LOOP CLR_PAG
MOV EDI,CR3
MOV DWORD [EDI],3003h ;Setting PDPT0 in PML4[0]
ADD EDI,1000h
MOV DWORD [EDI],4003h ;Setting PDT0 in PDPT0[0]
ADD EDI,1000h
MOV DWORD [EDI],5003h ;Setting PT0 in PDT0[0]
MOV EBX,0003h ;Identity mapping first 512 pages in PT0
MOV ECX,0200h
MOV EDI,5000h
FILL_PT: MOV [EDI],EBX
ADD EBX,1000h
ADD EDI,08h
LOOP FILL_PT
Switching to long mode:
LONGYS: PUSH DBG32_5
CALL DEBUG
PUSH DBG32_6
CALL DEBUG
MOV EAX,CR4 ;Switch to LONG MODE
OR EAX,10100000b ;Setting PGE(bit7) and PAE(bit5)
MOV CR4,EAX
PUSH DBG32_7
CALL DEBUG
MOV ECX,0C0000080h ;Asking for EFER MSR (0xC0000080h)
RDMSR
OR EAX,100000000b ;Setting LM-bit (bit 8).
WRMSR
PUSH DBG32_8
CALL DEBUG
PUSH DBG32_9
CALL DEBUG
MOV EAX,CR0
OR EAX,80000001h ;Setting Paging(bit31)
MOV CR0,EAX ;<--- CRASHES HERE
LGDT [GDT64] ;Loading GDT
LIDT [IDT64] ;Loading IDT
JMP 08h:LONG_MODE ;Jumping to a 64 bit segment
JMP NO_64
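One difference between an emulator and real hardware that may be worth ruling out is the initial content of RAM: emulators typically start with zeroed memory, while firmware on a real machine can leave arbitrary data in low memory. In the listing above the CLR_PAG loop stores to [EDI] without advancing EDI, and each entry is written as a DWORD, so the upper half of every entry keeps whatever was in memory before. The following is only a sketch of a setup that does not depend on the previous contents, under these assumptions: the same table addresses as above, a flat ES and DS, and SETUP_PT as a hypothetical name.

```nasm
BITS 32
PML4        EQU 2000h               ;Same addresses as the listing above
PDPT0       EQU 3000h
PDT0        EQU 4000h
PT0         EQU 5000h

SETUP_PT:   CLD                     ;Make STOSD count upwards
            MOV EDI,PML4
            XOR EAX,EAX
            MOV ECX,4000h/4         ;4 pages = 4000h bytes = 1000h dwords
            REP STOSD               ;Stores EAX at ES:EDI and advances EDI by 4
            MOV DWORD [PML4],PDPT0+3    ;PML4[0]  -> PDPT0 (present, writable)
            MOV DWORD [PDPT0],PDT0+3    ;PDPT0[0] -> PDT0
            MOV DWORD [PDT0],PT0+3      ;PDT0[0]  -> PT0
            MOV EDI,PT0
            MOV EAX,00000003h       ;Frame 0, present, writable
            MOV ECX,512             ;512 entries * 4 KiB = 2 MiB
.FILL:      MOV [EDI],EAX           ;Low dword of the entry
            MOV DWORD [EDI+4],0     ;High dword written explicitly
            ADD EAX,1000h           ;Next 4 KiB frame
            ADD EDI,8
            LOOP .FILL
            MOV EAX,PML4
            MOV CR3,EAX             ;Load CR3 once the tables are complete
            RET
```

If the triple fault disappears with a setup like this, leftover data in the table pages would be a likely explanation for why only some machines are affected; if it does not, the tables can at least be excluded as the cause.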