Code:
Next at t=31506443
(0) [0x0000000000206cf7] 0008:0000000000206cf7 (unk. ctxt): rep stosq qword ptr es:[rdi], rax ; f348ab
<bochs:5> print-stack
Stack address size 8
| STACK 0x0000000000a34d40 [0x00000000:0x002019a8]
| STACK 0x0000000000a34d48 [0x00000040:0x00000270]
| STACK 0x0000000000a34d50 [0x00000000:0x00000040]
| STACK 0x0000000000a34d58 [0x00000000:0x002171e4]
| STACK 0x0000000000a34d60 [0x00000000:0x00000020]
| STACK 0x0000000000a34d68 [0x00000000:0x002171f0]
| STACK 0x0000000000a34d70 [0x00000000:0x002171e0]
| STACK 0x0000000000a34d78 [0x00000000:0x00a34db8]
| STACK 0x0000000000a34d80 [0x00000000:0x0020ae84]
| STACK 0x0000000000a34d88 [0x00000020:0x00210b00]
| STACK 0x0000000000a34d90 [0x00000000:0x00000020]
| STACK 0x0000000000a34d98 [0x00000000:0x00210700]
| STACK 0x0000000000a34da0 [0x00000000:0x0000000a]
| STACK 0x0000000000a34da8 [0x00000000:0x00000000]
| STACK 0x0000000000a34db0 [0x00000000:0x0000000a]
| STACK 0x0000000000a34db8 [0x00000000:0x00a34ef8]
Looking at the stack (you didn't tell me %rip before the call to memset, so I shall have to infer it from the stack):
Code:
| STACK 0x0000000000a34d78 [0x00000000:0x00a34db8]
This appears to be the 'saved RBP' slot of the stack; the qword at +8 should be the return address: 0x0020ae84, which happens to be some CPUID stuff.
That code, I must admit, isn't mine, but it hasn't broken before...
And it's tempting to want to see what's below:
Code:
| STACK 0x0000000000a34db8 [0x00000000:0x00a34ef8]
...
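The frame walk above can be sketched in a few lines. This is only an illustration that assumes the usual `push rbp; mov rbp, rsp` prologue; `StackSlot`, `kDump`, `read_qword`, `caller_frame_of` and `return_address_of` are hypothetical names, and the table holds just the three qwords quoted from the dump.

```cpp
#include <cassert>
#include <cstdint>

// One qword from the print-stack output: where it lives and what it holds.
struct StackSlot {
    std::uint64_t address;
    std::uint64_t value;
};

// Slots copied from the dump above (only the ones the walk needs).
static const StackSlot kDump[] = {
    {0x00a34d78, 0x00a34db8},  // saved RBP
    {0x00a34d80, 0x0020ae84},  // return address stored right above it
    {0x00a34db8, 0x00a34ef8},  // saved RBP of the next frame up
};

// Returns the qword stored at `address`, or 0 if the dump does not cover it.
inline std::uint64_t read_qword(std::uint64_t address) {
    for (const StackSlot& slot : kDump) {
        if (slot.address == address) return slot.value;
    }
    return 0;
}

// With a standard prologue, [rbp] holds the caller's RBP...
inline std::uint64_t caller_frame_of(std::uint64_t rbp) {
    return read_qword(rbp);
}

// ...and [rbp + 8] holds the return address into the caller.
inline std::uint64_t return_address_of(std::uint64_t rbp) {
    return read_qword(rbp + 8);
}
```

Starting from 0xa34d78 this yields the return address 0x0020ae84 and the next frame at 0xa34db8; the return address of that next frame lies outside the quoted dump, so the sketch reports 0 for it.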
Some questions:
1. Is there any output on the screen?
2. If so, what output? If things go as expected, you should at least see a version number and the GRUB memory map (if the fault is indeed in the CPUID code).
3. If not, I think bochsrc may need some adjustment :p
Next...
Code:
<bochs:6> info tab
cr3: 0x0000000000001000
0x00000000-0x007dafff -> 0x0000000000000000-0x00000000007dafff
0x007db000-0x007dbfff -> 0x0000000000207000-0x0000000000207fff
0x007e1000-0x007e1fff -> 0x000000000020a000-0x000000000020afff
0x007ed000-0x007edfff -> 0x0000000000000000-0x0000000000000fff
0x007ef000-0x007effff -> 0x0000000000000000-0x0000000000000fff
0x007f1000-0x007f1fff -> 0x0000000000205000-0x0000000000205fff
0x007f6000-0x007f6fff -> 0x0000001f007f6000-0x0000001f007f6fff
0x007fd000-0x007fdfff -> 0x0000000000204000-0x0000000000204fff
0x00a33000-0x00a3afff -> 0x0000000000a33000-0x0000000000a3afff
0xfffff000-0xffffffff -> 0x00000000fffff000-0x00000000ffffffff
Looks normal.
Code:
<bochs:9> info gdt
Global Descriptor Table (base=0x00000000002007b0, limit=55):
GDT[0x00]=??? descriptor hi=0x00000000, lo=0x00000000
GDT[0x01]=Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, [b]Conforming, Accessed, 64-bit (Conforming, why?!)[/b]
GDT[0x02]=Data segment, base=0x00000000, limit=0xffffffff, Read/Write, Accessed
GDT[0x03]=Code segment, base=0x00000000, limit=0xffffffff, Execute/Read, [b]Conforming[/b], 64-bit
GDT[0x04]=Data segment, base=0x00000000, limit=0xffffffff, Read/Write
GDT[0x05]=32-Bit TSS (Available) at 0x00500000, length 0x00068
GDT[0x06]=??? descriptor hi=0x00000000, lo=0x00000000
I'm not sure what caused me to choose 'conforming' over non-conforming... I think it was something I read in the manual about switching to conforming/non-conforming segments from a different privilege level... But it escapes me now.
Could this be the cause of CS being 0xB?
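To make the CS question concrete, here is a small sketch that splits a selector into its fields. `Selector` and `decode_selector` are hypothetical names for illustration; the bit layout (RPL in bits 0-1, table indicator in bit 2, index in bits 3-15) is the standard x86 one.

```cpp
#include <cassert>
#include <cstdint>

// Fields of an x86 segment selector.
struct Selector {
    std::uint16_t index;  // descriptor index into the GDT or LDT
    bool local;           // table indicator: false = GDT, true = LDT
    std::uint8_t rpl;     // requested privilege level (0..3)
};

// Splits a raw 16-bit selector into index, table indicator and RPL.
constexpr Selector decode_selector(std::uint16_t raw) {
    return Selector{
        static_cast<std::uint16_t>(raw >> 3),
        (raw & 0x4) != 0,
        static_cast<std::uint8_t>(raw & 0x3),
    };
}

// CS = 0xB is GDT[1] with RPL 3.
static_assert(decode_selector(0x0B).index == 1, "0xB refers to entry 1");
static_assert(decode_selector(0x0B).rpl == 3, "0xB carries RPL 3");
```

So 0xB is selector 0x08 (the conforming code segment in GDT[0x01]) with the low two bits set to 3. That is at least consistent with ring 3 code running through a conforming segment, though the decode alone doesn't prove it.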
Code:
<bochs:10> info idt 0 63
Interrupt Descriptor Table (base=0x0000000000216080, limit=4095):
[b]IDT[0x00]=64-Bit Interrupt Gate target=0x0008:0000000000204724, DPL=3[/b] (Wrong! This allows any task to call INT 0 rather than being exclusively for #DE)
Correct! I intentionally set it as such; it's easy to identify. I just use INT $0x00 to test if I can return to Ring0 with interrupts.
Code:
<bochs:16> x /40x 0x216080
[bochs]:
0x0000000000216080 <bogus+ 0>: 0x00084724 0x0020[b][u]e[/u][/b]e[b][u]01[/u][/b] 0x00000000 0x00000000 (Why the IST?!)
0x0000000000216090 <bogus+ 16>: 0x0008472d 0x00208e[b]01[/b] 0x00000000 0x00000000
0x00000000002160a0 <bogus+ 32>: 0x00084736 0x00208e[b]01[/b] 0x00000000 0x00000000
0x00000000002160b0 <bogus+ 48>: 0x0008473f 0x00208e[b]01[/b] 0x00000000 0x00000000
0x00000000002160c0 <bogus+ 64>: 0x00084748 0x00208e[b]01[/b] 0x00000000 0x00000000
0x00000000002160d0 <bogus+ 80>: 0x00084751 0x00208e[b]01[/b] 0x00000000 0x00000000
0x00000000002160e0 <bogus+ 96>: 0x0008475a 0x00208e[b]01[/b] 0x00000000 0x00000000
0x00000000002160f0 <bogus+ 112>: 0x00084763 0x00208e[b]01[/b] 0x00000000 0x00000000
0x0000000000216100 <bogus+ 128>: 0x0008476c 0x00208e[b]01[/b] 0x00000000 0x00000000
0x0000000000216110 <bogus+ 144>: 0x00084773 0x00208e[b]01[/b] 0x00000000 0x00000000
I'm going to assume this is a dump of the IDT in-memory...
Why not the IST?
Again: 0xEE instead of 0x8E sets DPL=3 so I can INT $0x00 from Ring3.
The IST is supposed to set the interrupt handler to a known good stack... 0x9000 to be precise. Could it be Bochs having something there?
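For reference, a sketch that decodes one 16-byte long-mode gate from the dwords in the dump. `Gate` and `decode_gate` are hypothetical names; the field positions follow the 64-bit interrupt gate layout (IST in bits 0-2 of the second dword, type in bits 8-11, DPL in bits 13-14, present in bit 15).

```cpp
#include <cassert>
#include <cstdint>

// Decoded view of a 64-bit IDT gate descriptor.
struct Gate {
    std::uint64_t offset;    // handler address
    std::uint16_t selector;  // code segment selector
    std::uint8_t ist;        // interrupt stack table index (0 = none)
    std::uint8_t type;       // 0xE = interrupt gate, 0xF = trap gate
    std::uint8_t dpl;        // privilege needed to invoke via INT n
    bool present;
};

// d0..d2 are the first three little-endian dwords of the descriptor,
// in the order Bochs prints them; the fourth dword is reserved.
inline Gate decode_gate(std::uint32_t d0, std::uint32_t d1, std::uint32_t d2) {
    Gate g{};
    g.offset = (static_cast<std::uint64_t>(d2) << 32) |
               (d1 & 0xFFFF0000u) | (d0 & 0x0000FFFFu);
    g.selector = static_cast<std::uint16_t>(d0 >> 16);
    g.ist = static_cast<std::uint8_t>(d1 & 0x7);
    g.type = static_cast<std::uint8_t>((d1 >> 8) & 0xF);
    g.dpl = static_cast<std::uint8_t>((d1 >> 13) & 0x3);
    g.present = ((d1 >> 15) & 0x1) != 0;
    return g;
}
```

Feeding it the first entry (0x00084724, 0x0020ee01, 0) gives selector 0x0008, offset 0x204724, IST 1, type 0xE and DPL 3, which matches what `info idt` reports; the remaining entries differ only in offset and in having DPL 0.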
Code:
<bochs:17> info tss
tr:[b]s=0x0, base=0x0000000000000000[/b], valid=1 (no TSS?!)
Yes, there is a TSS. Again, this makes me think you're crashing somewhere before the attempt to enter user mode.
This should be it:
Code:
tr:s=0x28, base=0x0000000000500000, valid=1
Again, Bochs insists on using the legacy-mode structure to display information, and I get register values, not IST values. But it doesn't matter, the TSS should be there.
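For comparison with what Bochs prints, this is roughly what the long-mode TSS looks like. It is a sketch, not my actual source: `Tss64` and `make_tss` are made-up names, and `__attribute__((packed))` assumes GCC or Clang. The size works out to the 0x68 bytes shown in the GDT entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 64-bit TSS: no saved registers, only stack pointers and the I/O map base.
struct __attribute__((packed)) Tss64 {
    std::uint32_t reserved0;
    std::uint64_t rsp[3];      // RSP0..RSP2, loaded on privilege change
    std::uint64_t reserved1;
    std::uint64_t ist[7];      // IST1..IST7 (ist[0] is IST1)
    std::uint64_t reserved2;
    std::uint16_t reserved3;
    std::uint16_t iomap_base;  // offset of the I/O permission bitmap
};

static_assert(sizeof(Tss64) == 0x68, "TSS must be 104 bytes");
static_assert(offsetof(Tss64, ist) == 0x24, "IST1 lives at offset 0x24");

// Builds a zeroed TSS whose IST1 points at the given stack top.
inline Tss64 make_tss(std::uint64_t ist1_stack_top) {
    Tss64 tss{};
    tss.ist[0] = ist1_stack_top;
    tss.iomap_base = sizeof(Tss64);  // past the limit: no I/O bitmap
    return tss;
}
```

With the gates above using IST index 1, `make_tss(0x9000)` would describe the setup I mentioned: every handler starts on the stack at 0x9000 regardless of where RSP was.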
Finally: You shouldn't need to recompile anything. If you can mount the disk image (and you should be able to), go into
And set all the PRINTXXX things to true.
That should give a rough estimate of where the crash is occurring.
Bochs should change to 1024x640, 32bpp (don't change the resolution; there are serious bugs in my scrolling code, and it somehow only works at this resolution).
You will always see the version number.
If you set PRINTMEM to true, you should see the GRUB memory map.
That shouldn't cause a crash, so you should see a printout of the total amount of memory.
If it crashes now (i.e. after displaying total memory), it's definitely the CPUID code.
If not, then you should see PCI information, followed by ATA drive information. Neither of which should cause a crash; both are simply displaying what was already probed previously.
After that, you should see something similar to READELF's output: ELF signature, etc., then Program Headers and Section Headers.
Right after, I init the TSS and jump to user mode. That works flawlessly.
In usermode [src/userspace/hello/hello.cpp] I simply do INT $0x00.
That concludes my summary of what I'm doing (:
PS: What are the 'edge case stunts' you're talking about? My settings are fairly straightforward with the exception of multiple HDD images being used (which are there to test some FS things)...
Thanks Combuster!