Show exact cause of an exception under Bochs?

torshie
Member
Posts: 89
Joined: Sun Jan 11, 2009 7:41 pm

Show exact cause of an exception under Bochs?

Post by torshie »

Is it possible to make Bochs show me the exact cause of an exception? I have been hunting a page fault for about a day and still cannot find its cause. The error code is 0x15 :cry:

Thanks
-torshie
immibis
Posts: 19
Joined: Fri Dec 18, 2009 12:38 am

Post by immibis »

If it helps, you can look at CR2 to find out what address was accessed.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
torshie wrote: Is it possible to make Bochs show me the exact cause of an exception? I have been hunting a page fault for about a day and still cannot find its cause. The error code is 0x15 :cry:
For page fault, error code 0x15 means:
  • fault caused by "page not present"
  • the access was a read
  • the access was made by code running at CPL = 3 (user mode)
  • fault was not caused by reserved bits being set
  • fault caused by instruction fetch
Basically, user mode code tried to execute something in a "not present" page.

CR2 will contain the virtual address in the "not present" page.

I'd guess the problem is either in your page tables (a page which should be present isn't), or your code is dodgy (e.g. "call NULL" or "jmp bad_address" or something).


Cheers,

Brendan
torshie
Member
Posts: 89
Joined: Sun Jan 11, 2009 7:41 pm

Post by torshie »

The situation is like this:
I set %r11 = 0x200 and %rcx = 0x1006, then use "sysretq" to jump to 64-bit user code. At the very first instruction of the user code, I get the page fault.
The problem is that the page IS present and readable/writable/executable.

Here is the console output of Bochs, with some inline comments added:

(0) Magic breakpoint
Next at t=372514338
(0) [0x00000000002044ad] 0008:ffffffff802044ad (unk. ctxt): sysret                    ; 480f07  <<<<<<< the instruction to be executed
<bochs:2> show dbg-all
Turned ON all bx_dbg flags
<bochs:3> regs
CPU0:
rax: 0x00007fff:bfffe000 rcx: 0x00000000:00001006     <<<<<<<< RCX would be loaded into RIP
rdx: 0x00000000:00001006 rbx: 0xffffffff:80226070
rsp: 0x00007fff:bfffe000 rbp: 0x00000000:00000000
rsi: 0x00000000:012bf000 rdi: 0xffffffff:80226070
r8 : 0x00000000:00000000 r9 : 0x00000000:00000000
r10: 0x00000000:00000000 r11: 0x00000000:00000200  <<<<<<< R11 would be loaded into RFLAGS
r12: 0x00000000:00000000 r13: 0x00000000:00000000
r14: 0x00000000:00000000 r15: 0x00000000:00000000
rip: 0xffffffff:802044ad
eflags 0x00200282: ID vip vif ac vm rf nt IOPL=0 of df IF tf SF zf af pf cf
<bochs:4> s
Next at t=372514339
(0) [0x00000000012ba006] 002b:0000000000001006 (unk. ctxt): call .-11 (0x0000000000001000) ; e8f5ffffff
<bochs:5> regs
CPU0:
rax: 0x00007fff:bfffe000 rcx: 0x00000000:00001006
rdx: 0x00000000:00001006 rbx: 0xffffffff:80226070
rsp: 0x00007fff:bfffe000 rbp: 0x00000000:00000000
rsi: 0x00000000:012bf000 rdi: 0xffffffff:80226070
r8 : 0x00000000:00000000 r9 : 0x00000000:00000000
r10: 0x00000000:00000000 r11: 0x00000000:00000200
r12: 0x00000000:00000000 r13: 0x00000000:00000000
r14: 0x00000000:00000000 r15: 0x00000000:00000000
rip: 0x00000000:00001006                     <<<<<<<<<< RIP loaded as expected
eflags 0x00000202: id vip vif ac vm rf nt IOPL=0 of df IF tf sf zf af pf cf    <<<<<<< EFLAGS also loaded as expected
<bochs:6> creg
CR0=0x80000011: PG cd nw ac wp ne ET ts em mp PE
CR2=page fault laddr=0x0000000000000000
CR3=0x00000000012b6000
    PCD=page-level cache disable=0
    PWT=page-level write-through=0
CR4=0x000002a0: osxsave pcid fsgsbase smx vmx osxmmexcpt OSFXSR pce PGE mce PAE pse de tsd pvi vme
EFER=0x00000d01: ffxsr NXE LMA LME SCE
<bochs:7> sreg                          <<<<<<<<< CS & SS loaded as expected, I'm not sure about the other segment registers, should I load them manually ?
es:0x0010, dh=0x00009300, dl=0x00000000, valid=1
        Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
cs:0x002b, dh=0x00affb00, dl=0x0000ffff, valid=7
        Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, Accessed, 16-bit
ss:0x0023, dh=0x0000f300, dl=0x00000000, valid=7
        Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
ds:0x0010, dh=0x00009300, dl=0x00000000, valid=1
        Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
fs:0x0010, dh=0x00009300, dl=0x00000000, valid=1
        Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
gs:0x0010, dh=0x00009300, dl=0x00000000, valid=1
        Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
ldtr:0x0000, dh=0x00008200, dl=0x0000ffff, valid=0
tr:0x0030, dh=0x80008b22, dl=0x14600067, valid=1
gdtr:base=0xffffffff80221400, limit=0x3f
idtr:base=0xffffffff802214e0, limit=0x3ff
<bochs:8> page 0x1000                 <<<<<<<<<<<     see, the page does exist, U/S bit is set on PTE.
PML4: 0x00000000012b7063 ps         A pcd pwt S W P
PDPE: 0x00000000012b8063 ps         A pcd pwt S W P
 PDE: 0x00000000012b9063 ps         A pcd pwt S W P
 PTE: 0x00000000012ba067    g pat D A pcd pwt U W P
linear page 0x0000000000001000 maps to physical page 0x00000000012ba000
<bochs:9> u /5 0x1000
00001000: (                    ): mov eax, 0x00000000       ; b800000000
00001005: (                    ): ret                       ; c3
00001006: (                    ): call .-11                 ; e8f5ffffff                  <<<<<<<<< Current RIP
0000100b: (                    ): jmp .-2                   ; ebfe
0000100d: (                    ): add byte ptr ds:[rax], al ; 0000
<bochs:10> s
CPU 0: Exception 0x0e - (#PF) page fault occured (error_code=0x0015)
CPU 0: Interrupt 0x0e occured (error_code=0x0015)
Next at t=372514340
(0) [0x0000000000204870] 0008:ffffffff80204870 (unk. ctxt): push rax                  ; 50         <<<<<<<<< The first instruction of page fault handler.
<bochs:11> creg
CR0=0x80000011: PG cd nw ac wp ne ET ts em mp PE
CR2=page fault laddr=0x0000000000001006                    <<<<<<<<<<< Page fault address is 0x1006
CR3=0x00000000012b6000
    PCD=page-level cache disable=0
    PWT=page-level write-through=0
CR4=0x000002a0: osxsave pcid fsgsbase smx vmx osxmmexcpt OSFXSR pce PGE mce PAE pse de tsd pvi vme
EFER=0x00000d01: ffxsr NXE LMA LME SCE
<bochs:12> page 0x00007fffbfffd000                       <<<<<<< This is the top of user stack, user stack is set to 0x00007fffbfffe000
PML4: 0x00000000012bb063 ps         A pcd pwt S W P
PDPE: 0x00000000012bc063 ps         A pcd pwt S W P
 PDE: 0x00000000012bd063 ps         A pcd pwt S W P
 PTE: 0x80000000012bf007    g pat d a pcd pwt U W P
linear page 0x00007fffbfffd000 maps to physical page 0x00000000012bf000
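
The register values in the dump above can be cross-checked against what SYSRET does in 64-bit mode. The following is a minimal sketch in Python that follows the SYSRET pseudocode in Intel's manual: RIP comes from RCX, RFLAGS comes from R11 (masked, with bit 1 forced to 1), and only CS and SS are loaded, both derived from STAR[63:48]. The function name `sysret64_state` and the STAR[63:48] value 0x18 are assumptions made for illustration; the thread never shows the MSR contents.

```python
# Mask applied to R11 when SYSRET loads RFLAGS (per Intel's SYSRET pseudocode).
RFLAGS_SYSRET_MASK = 0x3C7FD7


def sysret64_state(rcx, r11, star_63_48):
    """Model the registers SYSRET (64-bit operand size) loads on return to user mode.

    star_63_48 is the selector base stored in bits 63:48 of the STAR MSR.
    DS, ES, FS and GS are not touched by SYSRET, so they are not modelled here.
    """
    return {
        "rip": rcx,
        "rflags": (r11 & RFLAGS_SYSRET_MASK) | 0x2,  # bit 1 of RFLAGS always reads as 1
        "cs": (star_63_48 + 16) | 3,                 # RPL forced to 3
        "ss": (star_63_48 + 8) | 3,                  # RPL forced to 3
    }


# Values from the post; 0x18 for STAR[63:48] is a hypothetical value.
state = sysret64_state(rcx=0x1006, r11=0x200, star_63_48=0x18)
for name, value in state.items():
    print(f"{name} = {value:#x}")
```

With these inputs the sketch reproduces the values Bochs reports after the `sysret` step (RIP 0x1006, EFLAGS 0x202, CS 0x2b, SS 0x23), which suggests the return itself worked and the fault lies elsewhere.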
torshie
Member
Posts: 89
Joined: Sun Jan 11, 2009 7:41 pm

Post by torshie »

BTW, error code 0x15 means the #PF is caused by a page-protection violation, not by a not-present page.

This is from AMD's system programming manual:
P—Bit 0. If this bit is cleared to 0, the page fault was caused by a not-present page. If this bit is set
to 1, the page fault was caused by a page-protection violation.
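
To make the bit layout concrete, here is a small Python sketch that decodes the low five bits of an x86-64 page-fault error code. The table and the function name `decode_pf_error` are illustrative and not part of Bochs or any kernel; only the bit meanings come from the architecture manuals.

```python
# (bit index, name, meaning when set, meaning when clear)
PF_BITS = [
    (0, "P", "page-protection violation", "page not present"),
    (1, "R/W", "write access", "read access"),
    (2, "U/S", "user-mode access", "supervisor-mode access"),
    (3, "RSV", "reserved bit set in a paging entry", "no reserved-bit violation"),
    (4, "I/D", "instruction fetch", "not an instruction fetch"),
]


def decode_pf_error(code):
    """Return one human-readable line per error-code bit."""
    lines = []
    for bit, name, if_set, if_clear in PF_BITS:
        value = (code >> bit) & 1
        lines.append(f"{name}={value}: {if_set if value else if_clear}")
    return lines


for line in decode_pf_error(0x15):
    print(line)
```

For 0x15 (binary 10101) this reports a protection violation on a user-mode instruction fetch, which matches the reading quoted from the AMD manual.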
jnc100
Member
Posts: 775
Joined: Mon Apr 09, 2007 12:10 pm
Location: London, UK

Post by jnc100 »

The first bit in the error code is set if the fault was caused by a page level protection violation and clear if it is caused by a non-present page. In your case it is set, so the issue is not that the page is missing but with the access rights to it. Essentially for user mode accesses, the U/S flag must be 1 in _every_ paging structure entry controlling the translation (i.e. throughout the hierarchy rather than just at page table level). What you need to do is set all PML4T, PDPT and PD entries to be user mode then use the flag in the PT to set the actual access rights.
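
The rule can be checked against the entries printed by the `page 0x1000` command in the Bochs session earlier in the thread. This is a rough sketch in Python, not kernel code: the helper name `effective_rights` is made up for illustration, and it assumes the standard long-mode flag positions (P = bit 0, R/W = bit 1, U/S = bit 2, NX = bit 63, with NX honoured because EFER.NXE is set in the dump).

```python
PRESENT = 1 << 0
WRITABLE = 1 << 1
USER = 1 << 2
NO_EXECUTE = 1 << 63


def effective_rights(entries):
    """Combine the flags of every entry on one translation path (PML4E down to PTE).

    A user-mode access is allowed only if the permission is granted at every level.
    """
    return {
        "present": all(e & PRESENT for e in entries),
        "writable": all(e & WRITABLE for e in entries),
        "user": all(e & USER for e in entries),
        "executable": not any(e & NO_EXECUTE for e in entries),
    }


# PML4E, PDPE, PDE, PTE as shown by `page 0x1000` in the Bochs output.
code_page = [0x12B7063, 0x12B8063, 0x12B9063, 0x12BA067]
print(effective_rights(code_page))

# Setting U/S at every level, as suggested above, makes the page user-accessible.
fixed = [entry | USER for entry in code_page]
print(effective_rights(fixed))
```

Only the PTE (0x...067) has bit 2 set; the three upper-level entries end in 0x063, so the combined translation is supervisor-only even though the page is present.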

Regards,
John.
torshie
Member
Posts: 89
Joined: Sun Jan 11, 2009 7:41 pm

Post by torshie »

jnc100 wrote: Essentially for user mode accesses, the U/S flag must be 1 in _every_ paging structure entry controlling the translation (i.e. throughout the hierarchy rather than just at page table level). What you need to do is set all PML4T, PDPT and PD entries to be user mode then use the flag in the PT to set the actual access rights.

Regards,
John.
Thank you for your help. You are right: I set the U/S bit at every level, and it works now :D
I was misled by the "info mem" command of QEMU. The command shows the "u" flag as long as the U/S flag is set at the page table level!

Thanks again

-torshie