Debug question
Posted: Thu Dec 22, 2016 2:08 am
OK, in short: I've messed something up in my APIC+IOAPIC code, and it now generates a double fault in qemu as well as in bochs. I wanted to see why, so I recompiled bochs to print out every interrupt call. This is what I got:
As you can see, everything seems to be okay:
Code:
(0) [0x0000001057a6] 0008:ffffffffffe077a6 (unk. ctxt): iret ; 48cf
<bochs:2> s
Next at t=16454120
(0) [0x0000003180e8] 0023:00000000002000e8 (unk. ctxt): ret ; c3
<bochs:3> print-stack
Stack address size 8
| STACK 0x00000000001fffa0 [0x00000000:0x002080e9]
| STACK 0x00000000001fffa8 [0x00000000:0x002060e9]
| STACK 0x00000000001fffb0 [0x00000000:0x002040e9]
| STACK 0x00000000001fffb8 [0x00000000:0x002020e8]
| STACK 0x00000000001fffc0 [0x00000000:0x002000f9]
| STACK 0x00000000001fffc8 [0x00000000:0x00000001]
| STACK 0x00000000001fffd0 [0x00000000:0x001fffd8]
| STACK 0x00000000001fffd8 [0x00000000:0x001fffe8]
| STACK 0x00000000001fffe0 [0x00000000:0x00000000]
| STACK 0x00000000001fffe8 [0x00006d65:0x74737973]
| STACK 0x00000000001ffff0 [0x00000000:0x00000000]
| STACK 0x00000000001ffff8 [0x00000000:0x00000000]
| STACK 0x0000000000200000 [0x00010102:0x464c457f]
| STACK 0x0000000000200008 [0x00000000:0x00000000]
| STACK 0x0000000000200010 [0x00000001:0x003e0003]
| STACK 0x0000000000200018 [0x00000000:0x000000f9]
<bochs:4> s
00016454120e[CPU0 ] interrupt(): vector = 08, TYPE = 0, EXT = 1
Next at t=16454121
(0) [0x000000103081] 0008:ffffffffffe05081 (unk. ctxt): lock bts qword ptr ds:0xffffffffffe1605c, 0x00 ; f0480fba2c255c60e1ff00
<bochs:5>
bochs:2 - the iret in kernel space is executed, and control transfers to user space.
bochs:3 - in userspace the stack is OK and mapped, with a valid return address on top of the stack, etc.
bochs:4 - but when I step through the "ret", a double fault is raised IMMEDIATELY!
bochs:5 - my double fault handler starts
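To check the return address without reading the dump by eye, here is a minimal Python sketch. It assumes each `print-stack` line has the form `| STACK <address> [<high dword>:<low dword>]`, as in the dump above; the helper names `parse_stack_line` and `is_canonical` are made up for illustration and are not part of bochs.

```python
import re

# Assumed line format: "| STACK <address> [<high dword>:<low dword>]"
STACK_LINE = re.compile(
    r"STACK (0x[0-9a-fA-F]+) \[(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)\]"
)

def parse_stack_line(line):
    """Return (stack address, 64-bit value stored there) for one line."""
    m = STACK_LINE.search(line)
    if m is None:
        raise ValueError("not a print-stack line: " + line)
    addr, high, low = (int(g, 16) for g in m.groups())
    return addr, (high << 32) | low

def is_canonical(addr):
    """True if bits 63..47 are all equal (48-bit canonical address)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF

rsp, ret_addr = parse_stack_line(
    "| STACK 0x00000000001fffa0 [0x00000000:0x002080e9]"
)
print(hex(rsp), hex(ret_addr), is_canonical(ret_addr))
```

In the PIC trace the same `ret` does land on 0x2080e9 (the `push rbp`), so the value on top of the stack is usable as a return address.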
How is that possible? It's not a pending interrupt or NMI; that would have been printed out as well! It's not a page fault either! And it's definitely not a problem in one of my ISRs, as a) they work just fine with the PIC, and b) the first ISR that gets called is the double fault's handler, right away!
I've tried to mask NMI to be sure, and printed out bochs' signal_event() calls as well, but nothing, the double fault is the first exception. According to the AMD and Intel specs, that should never ever happen! And it's not a problem with bochs, as it's also raised in qemu...
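For sifting through a long log of these `interrupt()` lines, a small parser can help. This is only a sketch: it assumes the line format shown in the trace above and assumes the vector is printed in hexadecimal. The name table covers the architecturally defined x86 exception vectors; `parse_interrupt_line` is a hypothetical helper, not a bochs function. TYPE and EXT are returned as raw numbers, since what they encode is not stated here and would need to be checked against the bochs source.

```python
import re

# Architecturally defined x86 exception vectors; anything else is unnamed.
EXCEPTION_NAMES = {
    0: "#DE", 1: "#DB", 2: "NMI", 3: "#BP", 4: "#OF", 5: "#BR",
    6: "#UD", 7: "#NM", 8: "#DF", 10: "#TS", 11: "#NP", 12: "#SS",
    13: "#GP", 14: "#PF", 16: "#MF", 17: "#AC", 18: "#MC", 19: "#XM",
}

LOG_LINE = re.compile(
    r"interrupt\(\): vector = ([0-9a-fA-F]+), TYPE = (\d+), EXT = (\d+)"
)

def parse_interrupt_line(line):
    """Return the fields of one interrupt() log line, or None if no match."""
    m = LOG_LINE.search(line)
    if m is None:
        return None
    vector = int(m.group(1), 16)  # assumption: vector is logged in hex
    return {
        "vector": vector,
        "name": EXCEPTION_NAMES.get(vector, "unnamed"),
        "type": int(m.group(2)),  # raw value, meaning not interpreted here
        "ext": int(m.group(3)),   # raw value, meaning not interpreted here
    }

line = "00016454120e[CPU0 ] interrupt(): vector = 08, TYPE = 0, EXT = 1"
print(parse_interrupt_line(line))
```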
So my question is: has anybody seen something like this before? How can I debug this? Any ideas?
EDIT: here's the output with PIC:
Code:
(0) [0x0000003180e8] 0023:00000000002000e8 (unk. ctxt): ret ; c3
<bochs:3> print-stack
Stack address size 8
| STACK 0x00000000001fffa0 [0x00000000:0x002080e9]
| STACK 0x00000000001fffa8 [0x00000000:0x002060e9]
| STACK 0x00000000001fffb0 [0x00000000:0x002040e9]
| STACK 0x00000000001fffb8 [0x00000000:0x002020e8]
| STACK 0x00000000001fffc0 [0x00000000:0x002000f9]
| STACK 0x00000000001fffc8 [0x00000000:0x00000001]
| STACK 0x00000000001fffd0 [0x00000000:0x001fffd8]
| STACK 0x00000000001fffd8 [0x00000000:0x001fffe8]
| STACK 0x00000000001fffe0 [0x00000000:0x00000000]
| STACK 0x00000000001fffe8 [0x00006d65:0x74737973]
| STACK 0x00000000001ffff0 [0x00000000:0x00000000]
| STACK 0x00000000001ffff8 [0x00000000:0x00000000]
| STACK 0x0000000000200000 [0x00010102:0x464c457f]
| STACK 0x0000000000200008 [0x00000000:0x00000000]
| STACK 0x0000000000200010 [0x00000001:0x003e0003]
| STACK 0x0000000000200018 [0x00000000:0x000000f9]
<bochs:4> page 0x1fffa0
PML4: 0x000000000001b027 ps A pcd pwt U W P
PDPE: 0x000000000001c027 ps A pcd pwt U W P
PDE: 0x800000000001d027 XD ps A pcd pwt U W P
PTE: 0x8000000000020007 XD g pat d a pcd pwt U W P
linear page 0x00000000001ff000 maps to physical page 0x000000020000
<bochs:5> s
Next at t=16452006
(0) [0x0000003470e9] 0023:00000000002080e9 (unk. ctxt): push rbp ; 55
<bochs:6>
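The `page` output can be cross-checked by decoding the raw entries by hand. Below is a minimal Python sketch, assuming the standard x86-64 4-level paging layout (flag bits 0..8, XD in bit 63, physical address in bits 12..51); `decode_entry` and `split_address` are illustrative names only.

```python
# Flag bits of an x86-64 paging-structure entry (4-level paging).
# Bit 7 is PS in a PDPE/PDE and PAT in a PTE, hence the combined name.
FLAG_BITS = {
    0: "P", 1: "W", 2: "U", 3: "PWT", 4: "PCD",
    5: "A", 6: "D", 7: "PS/PAT", 8: "G", 63: "XD",
}
ADDR_MASK = 0x000FFFFFFFFFF000  # physical address, bits 12..51

def decode_entry(entry):
    """Return (physical address, list of flags that are set)."""
    flags = [name for bit, name in FLAG_BITS.items() if (entry >> bit) & 1]
    return entry & ADDR_MASK, flags

def split_address(va):
    """Split a linear address into (PML4, PDPT, PD, PT index, page offset)."""
    return (
        (va >> 39) & 0x1FF,
        (va >> 30) & 0x1FF,
        (va >> 21) & 0x1FF,
        (va >> 12) & 0x1FF,
        va & 0xFFF,
    )

phys, flags = decode_entry(0x8000000000020007)  # the PTE from the dump
print(hex(phys), flags)
print(split_address(0x1FFFA0))
```

For the PTE above this gives physical page 0x20000 with P, W, U and XD set, which agrees with what bochs prints for the stack page.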