[Solved] VirtualBox "mov es, ax" or "mov ss, ax" faults

sounds
Member
Member
Posts: 112
Joined: Sat Feb 04, 2012 5:03 pm

Post by sounds »

Solved. I was using VirtualBox 6.1.38 which apparently had broken VT-D virtualization. Switching to VirtualBox 6.1.6 solved it.

This code has worked fine in VirtualBox for a long time, but recently started causing a GPF. It's aborting the VM in the very first function after the jump to long mode. I'm testing on VirtualBox 6.1.38_Ubuntu r153438. I believe this problem started with VirtualBox 6.0.

It runs fine in bochs and qemu and on real hardware. VirtualBox doesn't log any error, it just shuts down the VM with "aborted."

Code: Select all

push r15            ; 4157 (gcc-generated prolog)
push r14            ; 4156 (gcc-generated prolog)
push r13            ; 4155 (gcc-generated prolog)
push r12            ; 4154 (gcc-generated prolog)
push rbp            ; 55 (gcc-generated prolog)
push rbx            ; 53 (gcc-generated prolog)
sub rsp, 0x000000d8 ; 4881ecd8000000 (gcc-generated prolog)

xor rax, rax        ; 4831c0 (handwritten assembly begins)
push rax            ; 50
popf                ; 9d
mov ax, 0x0010      ; 66b81000
mov ss, ax          ; 8ed0
mov rsp, 0x00007dea ; 48c7c4ea7d0000
mov ds, ax          ; 8ed8
mov es, ax          ; 8ec0 (VirtualBox aborts)
mov fs, ax          ; 8ee0
mov gs, ax          ; 8ee8
xor eax, eax        ; 31c0
This is (most likely) a VirtualBox bug, but I'm wondering if there's something I could try to just get the code to run - I'm looking for a workaround.

Switching the order gives a clue - it seems to be triggered when SS and RSP are set:

Code: Select all

xor rax, rax        ; 4831c0 (handwritten assembly begins)
push rax            ; 50
popf                ; 9d
mov ax, 0x0010      ; 66b81000
mov ds, ax          ; 8ed8 (Rearranged this, now DS and ES come before SS)
mov es, ax          ; 8ec0
mov ss, ax          ; 8ed0 (VirtualBox aborts)
mov rsp, 0x00007dea ; 48c7c4ea7d0000
mov fs, ax          ; 8ee0
mov gs, ax          ; 8ee8
xor eax, eax        ; 31c0
To be clear: VirtualBox doesn't log any information, and the 'p' single-step command in the VirtualBox debugger causes VirtualBox to freeze, so I'm using "1: jmp 1b" as suggested on the wiki to bisect which instruction is causing the abort.

While I'm here, I'm looking for ways to convince gcc 9.4.0-1ubuntu1 to omit the prolog on this function. The push and sub rsp instructions are relatively harmless but pointless.
Last edited by sounds on Thu Feb 02, 2023 11:08 pm, edited 1 time in total.

Re: VirtualBox "mov es, ax" (or maybe "mov ss, ax"?) faults

Post by sounds »

I browsed the forums looking for clues, and found some other posts:
I also searched for ways to get more info from VirtualBox:
Octocontrabass
Member
Member
Posts: 5581
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

sounds wrote:This is (most likely) a VirtualBox bug, but I'm wondering if there's something I could try to just get the code to run - I'm looking for a workaround.
What are the contents of GDTR? What are the contents of the GDT that GDTR points to? What are the attributes of the page(s) that map your GDT? What is the current privilege level? Make sure you're checking these values immediately before loading SS - if you have a memory corruption bug, the problem may not appear if you check earlier than that. If it is a VirtualBox bug, it'll be easier to find a workaround if we know exactly what it is you're trying to do that doesn't work.
sounds wrote:While I'm here, I'm looking for ways to convince gcc 9.4.0-1ubuntu1 to omit the prolog on this function. The push and sub rsp instructions are relatively harmless but pointless.
You can use the "naked" attribute to tell GCC to skip generating those sequences, but it won't work properly if you put anything other than a basic asm() statement in the function. Also, it's a good idea to use a cross-compiler instead of your host compiler.
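A minimal sketch of the "naked" attribute, assuming GCC 8 or later (or Clang) targeting x86-64. The function name `answer` is hypothetical and not from the original code. The body is a single basic asm() statement, so the compiler emits no push or sub rsp prologue and the assembly has to return by itself. On other targets the sketch falls back to plain C so that it still compiles:

```c
/* Hypothetical example: a function with no compiler-generated
 * prologue or epilogue. Only a basic asm() statement goes inside. */
#if defined(__GNUC__) && defined(__x86_64__)
__attribute__((naked)) int answer(void)
{
    /* No frame is set up, so the asm must load the return value
     * into EAX and execute RET itself. */
    __asm__("movl $42, %eax\n\t"
            "ret");
}
#else
/* Fallback for non-x86-64 targets: ordinary C, prologue included. */
int answer(void)
{
    return 42;
}
#endif
```

Disassembling the x86-64 variant should show only the two instructions from the asm() statement, which is an easy way to confirm the prologue is gone.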
sounds wrote:I found some examples of VirtualBox guests that logged something before they were terminated (example) but the logs I have don't have anything.
Can you share the logs anyway?
nullplan
Member
Member
Posts: 1798
Joined: Wed Aug 30, 2017 8:24 am

Post by nullplan »

sounds wrote:While I'm here, I'm looking for ways to convince gcc 9.4.0-1ubuntu1 to omit the prolog on this function. The push and sub rsp instructions are relatively harmless but pointless.
The single best way to do that is to not use C at that point. Write the initial entry point in assembler (which you are already doing, but do so in its own function), then call a C function to go on. You can use inline assembler on file scope or an assembler source file. E.g. in your case, something like

Code: Select all

.global _start
_start:
  movw $0x10, %ax
  movw %ax, %ds
  movw %ax, %es
  movw %ax, %fs
  movw %ax, %gs
  movw %ax, %ss
  movl $stack_end, %esp
  xorl %ebp, %ebp
  andl $-16, %esp
  callq _start_c
  ud2 #should not return.
As per the ABI, on entry to a function the stack pointer has to be 8 bytes off from being 16-byte aligned (the call instruction pushes the 8-byte return address onto an aligned stack), which is why you cannot just use a jump instruction.

Of course the push instructions hurt something. They write to stack before you have the stack set up. Then you change the stack pointer as part of the assembler snippet, which is also not going to work with the compiler long-term. Just use well-defined unchanging interfaces, like the ABI.
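The alignment arithmetic can be checked with a short C sketch. The helper names are hypothetical, and 0x7dea is simply the stack value from the original post, used here as sample input. It mirrors the "andl $-16, %esp" followed by "callq" sequence above:

```c
#include <stdint.h>

/* Round a proposed stack top down to a 16-byte boundary,
 * which is what "andl $-16, %esp" does. */
uint64_t align_stack_16(uint64_t top)
{
    return top & ~(uint64_t)15;
}

/* Stack pointer as the callee sees it: CALL has pushed an
 * 8-byte return address onto the aligned stack. */
uint64_t rsp_at_entry(uint64_t aligned_top)
{
    return aligned_top - 8;
}
```

For 0x7dea this gives an aligned top of 0x7de0 and an entry value of 0x7dd8, which is 8 bytes off a 16-byte boundary as the ABI expects. A plain jump would leave the callee with 0x7de0 instead.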

Post by sounds »

Thanks, everyone, for the helpful responses. :D
nullplan wrote:You can use inline assembler on file scope
Thanks, I will try following this example: https://elixir.bootlin.com/linux/latest ... ioscalls.c
nullplan wrote:Of course the push instructions hurt something. They write to stack before you have the stack set up. Then you change the stack pointer as part of the assembler snippet, which is also not going to work with the compiler long-term. Just use well-defined unchanging interfaces, like the ABI.
I already set the stack pointer while in real mode to the top of a 30KB stack, and this is the first time that stack is getting used. Could VirtualBox be breaking due to the push instructions, even though real hardware works fine?

...
Octocontrabass wrote:What are the contents of GDTR? What are the contents of the GDT that GDTR points to? What are the attributes of the page(s) that map your GDT? What is the current privilege level? Make sure you're checking these values immediately before loading SS

Code: Select all

Bochs connected to screen "/dev/pts/2"
Next at t=0
(0) [0x0000fffffff0] f000:fff0 (no symbol): jmpf 0xf000:e05b          ; ea5be000f0
<bochs:1> c
^CNext at t=948344304
(0) [0x00000000c01e] 0008:000000000000c01e (code+1e): jmp .-2 (0x000000000000c01e) ; ebfe
<bochs:2> disasm 0xc000 0xc030
000000000000c000: (              code+0): push r15                  ; 4157
000000000000c002: (              code+2): push r14                  ; 4156
000000000000c004: (              code+4): push r13                  ; 4155
000000000000c006: (              code+6): push r12                  ; 4154
000000000000c008: (              code+8): push rbp                  ; 55
000000000000c009: (              code+9): push rbx                  ; 53
000000000000c00a: (              code+a): sub rsp, 0x00000000000000d8 ; 4881ecd8000000
000000000000c011: (             code+11): xor rax, rax              ; 4831c0
000000000000c014: (             code+14): push rax                  ; 50
000000000000c015: (             code+15): popf                      ; 9d
000000000000c016: (             code+16): mov ax, 0x0010            ; 66b81000
000000000000c01a: (             code+1a): mov ds, ax                ; 8ed8
000000000000c01c: (             code+1c): mov es, ax                ; 8ec0
000000000000c01e: (             code+1e): jmp .-2 (0x000000000000c01e) ; ebfe
000000000000c020: (             code+20): mov ss, ax                ; 8ed0
000000000000c022: (             code+22): mov rsp, 0x0000000000007dea ; 48c7c4ea7d0000
000000000000c029: (             code+29): mov fs, ax                ; 8ee0
000000000000c02b: (             code+2b): mov gs, ax                ; 8ee8
000000000000c02d: (             code+2d): xor eax, eax              ; 31c0
000000000000c02f: (             code+2f): xor ebx, ebx              ; 31db
<bochs:3> creg
CR0=0xe0040033: PG CD NW AC wp NE ET ts em MP PE
CR2=page fault laddr=0x0000000000000000
CR3=0x000000008000
    PCD=page-level cache disable=0
    PWT=page-level write-through=0
CR4=0x000007a6: cet pke smap smep osxsave pcid fsgsbase smx vmx OSXMMEXCPT umip OSFXSR PCE PGE mce PAE pse de TSD PVI vme
CR8: 0x0
EFER=0x00000501: ffxsr nxe LMA LME SCE
XCR0=0x00000001: pkru hi_zmm zmm_hi256 opmask bndcfg bndregs ymm sse FPU
<bochs:4> sreg
es:0x0010, dh=0x00009300, dl=0x00000000, valid=1
	Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
cs:0x0008, dh=0x00209900, dl=0x00000000, valid=1
	Code segment, base=0x00000000, limit=0x00000000, Execute-Only, Non-Conforming, Accessed, 64-bit
ss:0x0002, dh=0x00009300, dl=0x0020ffff, valid=7
	Data segment, base=0x00000020, limit=0x0000ffff, Read/Write, Accessed
ds:0x0010, dh=0x00009300, dl=0x00000000, valid=1
	Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
fs:0x0000, dh=0x00009300, dl=0x0000ffff, valid=1
	Data segment, base=0x00000000, limit=0x0000ffff, Read/Write, Accessed
gs:0x0000, dh=0x00009300, dl=0x0000ffff, valid=1
	Data segment, base=0x00000000, limit=0x0000ffff, Read/Write, Accessed
ldtr:0x0000, dh=0x00008200, dl=0x0000ffff, valid=1
tr:0x0000, dh=0x00008b00, dl=0x0000ffff, valid=1
gdtr:base=0x0000000000008010, limit=0x1d
idtr:base=0x0000000000000000, limit=0x0
<bochs:5> x /16bx 0x8000
[bochs]:
0x0000000000008000 <bogus+       0>:	0x23	0x90	0x00	0x00	0x00	0x00	0x00	0x00
0x0000000000008008 <bogus+       8>:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
<bochs:6> x /16bx 0x9000
[bochs]:
0x0000000000009000 <bogus+       0>:	0x23	0xa0	0x00	0x00	0x00	0x00	0x00	0x00
0x0000000000009008 <bogus+       8>:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
<bochs:7> x /16bx 0xa000
[bochs]:
0x000000000000a000 <bogus+       0>:	0xe3	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x000000000000a008 <bogus+       8>:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
<bochs:8> x /24bx 0x8010
[bochs]:
0x0000000000008010 <bogus+       0>:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x0000000000008018 <bogus+       8>:	0x00	0x00	0x00	0x00	0x00	0x99	0x20	0x00
0x0000000000008020 <bogus+      16>:	0x00	0x00	0x00	0x00	0x00	0x93	0x00	0x00
<bochs:9> r
CPU0:
rax: 00000000_00000010
rbx: 00000000_00000000
rcx: 00000000_c0000080
rdx: 00000000_00000000
rsp: 00000000_00007ad6
rbp: 00000000_00000000
rsi: 00000000_000e7d54
rdi: 00000000_0000800e
r8 : 00000000_00000000
r9 : 00000000_00000000
r10: 00000000_00000000
r11: 00000000_00000000
r12: 00000000_00000000
r13: 00000000_00000000
r14: 00000000_00000000
r15: 00000000_00000000
rip: 00000000_0000c01e
eflags 0x00000002: id vip vif ac vm rf nt IOPL=0 of df if tf sf zf af pf cf
VirtualBox logs attached.
Attachment: VBox.log.gz (VirtualBox log, 16.38 KiB)

Post by Octocontrabass »

sounds wrote:I already set the stack pointer while in real mode to the top of a 30KB stack, and this is the first time that stack is getting used. Could VirtualBox be breaking due to the push instructions, even though real hardware works fine?
It's possible. You're using a real mode stack segment outside of real mode, and VirtualBox may not be handling it gracefully even though segment descriptors should be mostly ignored in 64-bit mode.
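To make the descriptor contents easier to read, here is a minimal C sketch (the struct and function names are hypothetical) that decodes the high dword of a segment descriptor, which is the value Bochs prints in the `dh=` field. It covers only the attribute bits relevant to this thread:

```c
#include <stdint.h>

/* Attribute bits from the high 32 bits of a segment descriptor
 * (the "dh=" value in the Bochs sreg output). Hypothetical helper. */
struct seg_attrs {
    unsigned type;      /* bits 8-11: segment type */
    unsigned s;         /* bit 12: 1 = code/data, 0 = system */
    unsigned dpl;       /* bits 13-14: descriptor privilege level */
    unsigned present;   /* bit 15: P */
    unsigned long_mode; /* bit 21: L, 64-bit code segment */
};

struct seg_attrs decode_descriptor_high(uint32_t dh)
{
    struct seg_attrs a;
    a.type      = (dh >> 8)  & 0xF;
    a.s         = (dh >> 12) & 1;
    a.dpl       = (dh >> 13) & 3;
    a.present   = (dh >> 15) & 1;
    a.long_mode = (dh >> 21) & 1;
    return a;
}
```

With the values from the dump, `decode_descriptor_high(0x00209900)` describes a present ring 0 code segment with L set, and `decode_descriptor_high(0x00009300)` describes a present ring 0 read/write data segment.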

Out of curiosity, does VirtualBox still crash if you load the segment registers with null selectors? SS requires a valid segment in ring 3 and when performing some stack switching operations, but all data segments may be null outside those cases.

Code: Select all

CR0=0xe0040033: PG CD NW AC wp NE ET ts em MP PE
That is a strange value to have in CR0. Which of those bits did you set intentionally?

Code: Select all

0x000000000000a000 <bogus+       0>:	0xe3	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x000000000000a008 <bogus+       8>:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
It's generally a bad idea to have pages that span effective cache type boundaries. Intel CPUs automatically split 2MB and 4MB pages within the first 4MB to ensure the TLB contains consistent memory type information, but this doesn't apply to larger pages and I'm not sure if AMD CPUs provide the same compatibility hack.
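For reference, a minimal C sketch (macro and function names are hypothetical) that decodes the low flag bits of the paging entries shown in the dump. It assumes 0x8000, 0x9000 and 0xa000 are the PML4, PDPT and page directory respectively, which matches CR3=0x8000 and the addresses stored in the entries:

```c
#include <stdint.h>

/* Low flag bits shared by x86-64 paging-structure entries. */
#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_WRITABLE (UINT64_C(1) << 1)
#define PTE_PWT      (UINT64_C(1) << 3)  /* page-level write-through */
#define PTE_PCD      (UINT64_C(1) << 4)  /* page-level cache disable */
#define PTE_ACCESSED (UINT64_C(1) << 5)
#define PTE_DIRTY    (UINT64_C(1) << 6)
#define PTE_PS       (UINT64_C(1) << 7)  /* entry maps a large page */

/* Returns 1 if the given flag bit is set in the entry. */
int pte_has(uint64_t entry, uint64_t flag)
{
    return (entry & flag) != 0;
}

/* Physical address of the next-level table (bits 12-51).
 * Only meaningful when PS is clear. */
uint64_t pte_table_address(uint64_t entry)
{
    return entry & UINT64_C(0x000FFFFFFFFFF000);
}
```

Under that assumption, 0x9023 and 0xa023 point at the tables at 0x9000 and 0xa000, while 0xe3 has PS set with PWT and PCD clear, so it maps a single 2MB page starting at physical address 0 with the default page-level cache settings.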

Code: Select all

gdtr:base=0x0000000000008010, limit=0x1d
That's a really weird limit. What's going on there?
sounds wrote:VirtualBox logs attached.
Wow, the log just ends at boot.

Post by sounds »

I'll just note that VirtualBox ran this fine until version 6.0, and tests on an AMD Athlon64 handle this sequence fine.

A small correction: I do make BIOS calls in real mode with the stack set to that 30KB region, so the stack is empty when the long jump to 64-bit mode is done, but it was inaccurate to say this is the first time the stack is used. My bad.

Post by sounds »

Octocontrabass wrote:

Code: Select all

gdtr:base=0x0000000000008010, limit=0x1d
That's a really weird limit. What's going on there?
I've fixed the limit to be 0x1f (i.e. a table size of 0x20 bytes). That was indeed a bug. Thanks!
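The GDTR limit is the table size in bytes minus one, and each descriptor is 8 bytes, so a sane limit has the form 8*N - 1. A minimal C sketch of that check (function names are hypothetical):

```c
#include <stdint.h>

/* GDTR limit for a table of `entries` 8-byte descriptors. */
uint16_t gdt_limit_for(unsigned entries)
{
    return (uint16_t)(entries * 8 - 1);
}

/* Returns 1 if the limit covers a whole number of descriptors. */
int gdt_limit_is_whole_entries(uint16_t limit)
{
    return (limit + 1) % 8 == 0;
}
```

The old value 0x1d fails this check, while 0x1f corresponds to four descriptors and 0x17 to three.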

Code: Select all

<bochs:1> b 0xc01e
<bochs:2> c
(0) Breakpoint 1, 0x000000000000c01e in ?? ()
Next at t=908928882
(0) [0x00000000c01e] 0008:000000000000c01e (code+1e): mov ss, ax                ; 8ed0
<bochs:3> disasm 0xc000 0xc02d
000000000000c000: (              code+0): push r15                  ; 4157
000000000000c002: (              code+2): push r14                  ; 4156
000000000000c004: (              code+4): push r13                  ; 4155
000000000000c006: (              code+6): push r12                  ; 4154
000000000000c008: (              code+8): push rbp                  ; 55
000000000000c009: (              code+9): push rbx                  ; 53
000000000000c00a: (              code+a): sub rsp, 0x00000000000000d8 ; 4881ecd8000000
000000000000c011: (             code+11): xor rax, rax              ; 4831c0
000000000000c014: (             code+14): push rax                  ; 50
000000000000c015: (             code+15): popf                      ; 9d
000000000000c016: (             code+16): mov ax, 0x0010            ; 66b81000
000000000000c01a: (             code+1a): mov ds, ax                ; 8ed8
000000000000c01c: (             code+1c): mov es, ax                ; 8ec0
000000000000c01e: (             code+1e): mov ss, ax                ; 8ed0
000000000000c020: (             code+20): mov rsp, 0x0000000000007dea ; 48c7c4ea7d0000
000000000000c027: (             code+27): mov fs, ax                ; 8ee0
000000000000c029: (             code+29): mov gs, ax                ; 8ee8
000000000000c02b: (             code+2b): xor eax, eax              ; 31c0
<bochs:4> creg
CR0=0xe0040033: PG CD NW AC wp NE ET ts em MP PE
CR2=page fault laddr=0x0000000000000000
CR3=0x000000008000
    PCD=page-level cache disable=0
    PWT=page-level write-through=0
CR4=0x000007a6: cet pke smap smep osxsave pcid fsgsbase smx vmx OSXMMEXCPT umip OSFXSR PCE PGE mce PAE pse de TSD PVI vme
CR8: 0x0
EFER=0x00000501: ffxsr nxe LMA LME SCE
XCR0=0x00000001: pkru hi_zmm zmm_hi256 opmask bndcfg bndregs ymm sse FPU
<bochs:5> sreg
es:0x0010, dh=0x00009300, dl=0x00000000, valid=1
	Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
cs:0x0008, dh=0x00209900, dl=0x00000000, valid=1
	Code segment, base=0x00000000, limit=0x00000000, Execute-Only, Non-Conforming, Accessed, 64-bit
ss:0x0002, dh=0x00009300, dl=0x0020ffff, valid=7
	Data segment, base=0x00000020, limit=0x0000ffff, Read/Write, Accessed
ds:0x0010, dh=0x00009300, dl=0x00000000, valid=1
	Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
fs:0x0000, dh=0x00009300, dl=0x0000ffff, valid=1
	Data segment, base=0x00000000, limit=0x0000ffff, Read/Write, Accessed
gs:0x0000, dh=0x00009300, dl=0x0000ffff, valid=1
	Data segment, base=0x00000000, limit=0x0000ffff, Read/Write, Accessed
ldtr:0x0000, dh=0x00008200, dl=0x0000ffff, valid=1
tr:0x0000, dh=0x00008b00, dl=0x0000ffff, valid=1
gdtr:base=0x0000000000008010, limit=0x1f
idtr:base=0x0000000000000000, limit=0x0
Then I tried it in VirtualBox, and it still aborts. Here are the last 5 lines of VBox.log again, showing it ends at boot:

Code: Select all

00:00:01.122588 VMMDev: Guest Log: BIOS: ata0-0: PCHS=20/16/63 LCHS=20/16/63
00:00:01.125089 PIT: mode=2 count=0x48d3 (18643) - 64.00 Hz (ch=0)
00:00:01.135825 PIT: mode=2 count=0x10000 (65536) - 18.20 Hz (ch=0)
00:00:01.136004 VMMDev: Guest Log: BIOS: Boot : bseqnr=1, bootseq=0002
00:00:01.136282 VMMDev: Guest Log: BIOS: Booting from Hard Disk...
Octocontrabass wrote:Out of curiosity, does VirtualBox still crash if you load the segment registers with null selectors? SS requires a valid segment in ring 3 and when performing some stack switching operations, but all data segments may be null outside those cases.
I'm uncomfortable going that route at the moment, maybe I'm not desperate enough yet. I'm curious if this technique - loading SS with a null segment - has been tested on a wide range of hardware.

I like the idea of using inline assembler at file scope. I want to try that next.

Post by Octocontrabass »

sounds wrote:I'm curious if this technique - loading SS with a null segment - has been tested on a wide range of hardware.
It's a fundamental part of the architecture: some control transfers will set SS to a null selector, so it has to work.
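For anyone unsure what counts as a null selector, here is a minimal C sketch (function names are hypothetical) that splits a selector into its fields. A selector is null when it refers to entry 0 of the GDT, whatever its RPL, so the values 0 through 3 are all null:

```c
#include <stdint.h>

/* Bits 3-15: index into the descriptor table. */
unsigned selector_index(uint16_t sel)
{
    return sel >> 3;
}

/* Bit 2: table indicator, 0 = GDT, 1 = LDT. */
unsigned selector_ti(uint16_t sel)
{
    return (sel >> 2) & 1;
}

/* Bits 0-1: requested privilege level. */
unsigned selector_rpl(uint16_t sel)
{
    return sel & 3;
}

/* Null selector: index 0 in the GDT, RPL ignored. */
int selector_is_null(uint16_t sel)
{
    return (sel & 0xFFFC) == 0;
}
```

The 0x0010 used in this thread decodes to GDT index 2 with RPL 0, and 0x0008 to GDT index 1.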

There's a fun edge case where using SYSRET to return to 32-bit compatibility mode on an AMD CPU while SS is null results in an invalid segment in SS. This only affects 32-bit programs on a 64-bit OS on an AMD CPU, so it's easy to miss, but it might not even be a concern for your OS.
MichaelPetch
Member
Member
Posts: 797
Joined: Fri Aug 26, 2016 1:41 pm
Libera.chat IRC: mpetch

Post by MichaelPetch »

Do you have your code on GitHub or a similar service? Just an observation: some hobby OSes stop working because they aren't robust enough to handle cases that some hardware or virtual machines might throw at them. For example, QEMU 7.0+ recently added a 12 GiB reserved memory hole (HyperTransport-related) to its memory map. Some hobby OSes failed because they didn't expect to map such a large area and accidentally allowed other data structures (page tables, interrupt tables, descriptor tables, etc.) to become corrupt. The failures can manifest themselves in mysterious ways. Since I haven't seen your code I can't say why you are experiencing problems, but sometimes it comes down to edge cases that hobby code did not anticipate.

Post by sounds »

MichaelPetch wrote:Do you have your code on GitHub or a similar service? Just an observation: some hobby OSes stop working because they aren't robust enough to handle cases that some hardware or virtual machines might throw at them. For example, QEMU 7.0+ recently added a 12 GiB reserved memory hole (HyperTransport-related) to its memory map. Some hobby OSes failed because they didn't expect to map such a large area and accidentally allowed other data structures (page tables, interrupt tables, descriptor tables, etc.) to become corrupt. The failures can manifest themselves in mysterious ways. Since I haven't seen your code I can't say why you are experiencing problems, but sometimes it comes down to edge cases that hobby code did not anticipate.
The fault happens early:
sounds wrote:It's aborting the VM in the very first function after the jump to long mode.
No, I'm not putting it on GitHub; the code is in this thread. Thank you though for suggesting other interesting areas to check!
Last edited by sounds on Sun Dec 25, 2022 11:31 pm, edited 1 time in total.

Post by sounds »

Octocontrabass wrote:It's a fundamental part of the architecture: some control transfers will set SS to a null selector, so it has to work.

There's a fun edge case where using SYSRET to return to 32-bit compatibility mode on an AMD CPU while SS is null results in an invalid segment in SS. This only affects 32-bit programs on a 64-bit OS on an AMD CPU, so it's easy to miss, but it might not even be a concern for your OS.
Interesting, today I learned something. See below.

VirtualBox still aborts. Here's bochs showing what is executed:

Code: Select all

<bochs:1> b 0xc021
<bochs:2> c
(0) Breakpoint 1, 0x000000000000c021 in ?? ()
Next at t=908916730
(0) [0x00000000c021] 0008:000000000000c021 (code+21): mov ss, ax                ; 8ed0
<bochs:3> disasm 0xc000 0xc02e
000000000000c000: (              code+0): push r15                  ; 4157
000000000000c002: (              code+2): push r14                  ; 4156
000000000000c004: (              code+4): push r13                  ; 4155
000000000000c006: (              code+6): push r12                  ; 4154
000000000000c008: (              code+8): push rbp                  ; 55
000000000000c009: (              code+9): push rbx                  ; 53
000000000000c00a: (              code+a): sub rsp, 0x00000000000000d8 ; 4881ecd8000000
000000000000c011: (             code+11): xor rax, rax              ; 4831c0
000000000000c014: (             code+14): push rax                  ; 50
000000000000c015: (             code+15): popf                      ; 9d
000000000000c016: (             code+16): mov ax, 0x0010            ; 66b81000
000000000000c01a: (             code+1a): mov ds, ax                ; 8ed8
000000000000c01c: (             code+1c): mov es, ax                ; 8ec0
000000000000c01e: (             code+1e): xor ax, ax                ; 6631c0
000000000000c021: (             code+21): mov ss, ax                ; 8ed0
000000000000c023: (             code+23): mov rsp, 0x0000000000007dea ; 48c7c4ea7d0000
000000000000c02a: (             code+2a): mov fs, ax                ; 8ee0
000000000000c02c: (             code+2c): mov gs, ax                ; 8ee8
<bochs:4> creg
CR0=0xe0040033: PG CD NW AC wp NE ET ts em MP PE
CR2=page fault laddr=0x0000000000000000
CR3=0x000000008000
    PCD=page-level cache disable=0
    PWT=page-level write-through=0
CR4=0x000007a6: cet pke smap smep osxsave pcid fsgsbase smx vmx OSXMMEXCPT umip OSFXSR PCE PGE mce PAE pse de TSD PVI vme
CR8: 0x0
EFER=0x00000501: ffxsr nxe LMA LME SCE
XCR0=0x00000001: pkru hi_zmm zmm_hi256 opmask bndcfg bndregs ymm sse FPU
<bochs:5> sreg
es:0x0010, dh=0x00009300, dl=0x00000000, valid=1
	Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
cs:0x0008, dh=0x00209900, dl=0x00000000, valid=1
	Code segment, base=0x00000000, limit=0x00000000, Execute-Only, Non-Conforming, Accessed, 64-bit
ss:0x0002, dh=0x00009300, dl=0x0020ffff, valid=7
	Data segment, base=0x00000020, limit=0x0000ffff, Read/Write, Accessed
ds:0x0010, dh=0x00009300, dl=0x00000000, valid=1
	Data segment, base=0x00000000, limit=0x00000000, Read/Write, Accessed
fs:0x0000, dh=0x00009300, dl=0x0000ffff, valid=1
	Data segment, base=0x00000000, limit=0x0000ffff, Read/Write, Accessed
gs:0x0000, dh=0x00009300, dl=0x0000ffff, valid=1
	Data segment, base=0x00000000, limit=0x0000ffff, Read/Write, Accessed
ldtr:0x0000, dh=0x00008200, dl=0x0000ffff, valid=1
tr:0x0000, dh=0x00008b00, dl=0x0000ffff, valid=1
gdtr:base=0x0000000000008010, limit=0x1f
idtr:base=0x0000000000000000, limit=0x0
So I tried a bunch of tests to break something, to see if booting with SS set to the null segment for the early bootup caused bugs, and everything ... worked fine. =D>

I mean, everything except VirtualBox.

Post by Octocontrabass »

sounds wrote:VirtualBox still aborts. Here's bochs showing what is executed
You should find a way to check this stuff from within VirtualBox instead of using Bochs. There could be a bug elsewhere in your code that only causes problems in VirtualBox.

Code: Select all

CR0=0xe0040033: PG CD NW AC wp NE ET ts em MP PE
Why is this the value of CR0?

Post by sounds »

Octocontrabass wrote:
sounds wrote:VirtualBox still aborts. Here's bochs showing what is executed
You should find a way to check this stuff from within VirtualBox instead of using Bochs. There could be a bug elsewhere in your code that only causes problems in VirtualBox.
I already answered this (the VirtualBox debugger freezes).
Octocontrabass wrote:

Code: Select all

CR0=0xe0040033: PG CD NW AC wp NE ET ts em MP PE
Why is this the value of CR0?
I don't believe it's the cause of the bug, but I'm happy to hear your suggestions on what bits you would change and why.

Post by Octocontrabass »

sounds wrote:
Octocontrabass wrote:

Code: Select all

CR0=0xe0040033: PG CD NW AC wp NE ET ts em MP PE
Why is this the value of CR0?
I don't believe it's the cause of the bug, but I'm happy to hear your suggestions on what bits you would change and why.
I think you should flip CD, NW, and WP.

You should clear CD to enable caches. If you need to disable caches, you can use page-level cache controls or MTRRs instead.

You should clear NW to enable cache coherency. I can't think of any reason why any OS would ever need incoherent caches, but this bit is often set alongside CD when caches are disabled.

You should set WP to enable page write protection in ring 0. Typically you want a page fault when your kernel writes a read-only page; that way you can perform automatic copy-on-write in ring 0 without constantly looking at your page tables, and possibly also catch bugs that cause stray writes to read-only pages.
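As a worked example of those three changes, here is a minimal C sketch (macro and function names are hypothetical) that computes the new CR0 value from the one in the dump. It only does the bit arithmetic; actually writing CR0 has to happen in ring 0 with a mov to cr0:

```c
#include <stdint.h>

#define CR0_WP (UINT32_C(1) << 16)  /* write protect in ring 0 */
#define CR0_NW (UINT32_C(1) << 29)  /* not write-through */
#define CR0_CD (UINT32_C(1) << 30)  /* cache disable */

/* Clear CD and NW, set WP, leave every other bit unchanged. */
uint32_t cr0_apply_suggestion(uint32_t cr0)
{
    return (cr0 & ~(CR0_CD | CR0_NW)) | CR0_WP;
}
```

Applied to 0xe0040033 this yields 0x80050033, with PG, AM, NE, ET, MP and PE left as they were.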

I don't see how these could fix the bug, but with nothing useful in the VirtualBox log, it could really be anything you're doing that doesn't line up with an ordinary OS. (And speaking of an ordinary OS, have you checked to make sure VirtualBox can run something like Windows or Linux?)