SYSENTER causing invalid opcode fault in VirtualBox

rwosdev
Member
Posts: 49
Joined: Thu Apr 26, 2018 11:21 pm

Post by rwosdev »

First off, my computer's old and doesn't support hardware virtualization, so I assume VirtualBox is using software virtualization/emulation (and I don't have plans to upgrade soon).

The issue is that my sysenter handler works and returns an unlimited number of times without issue in QEMU and Bochs, but is never even called in VirtualBox. I tried panicking in the sysenter handler to show that it's called at least once: that works in QEMU and Bochs, but still nothing happens in VirtualBox, indicating the handler is never reached.

Is this a known issue? I can't find any info suggesting it is.

Thanks

Code: Select all

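UserTask:
	mov eax, 0              ; reset the busy-loop counter
.loop:
	inc eax
	cmp eax, 8000000
	jl .loop                ; spin until the counter reaches 8,000,000
	
	mov ecx, esp            ; ECX = user stack pointer that SYSEXIT will restore
	mov edx, .ok            ; EDX = user return address that SYSEXIT will jump to
	sysenter                ; enter the kernel (this is what faults in VirtualBox)
.ok:
	cmp eax, 9000000
	jne .loop               ; keep counting (and calling) until 9,000,000
	
	jmp UserTask            ; then start over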

Code: Select all

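SysEnterHandler:
	push ebx
	push ecx                ; saved user ESP (SYSEXIT loads ESP from ECX)
	push edx                ; saved user EIP (SYSEXIT loads EIP from EDX)
	
	PANIC "HELLO"           ; debug marker: shows whether the handler is ever reached
	
	pop edx
	pop ecx
	pop ebx

	sti                     ; SYSENTER clears IF, so re-enable interrupts before returning
	sysexit                 ; return to user mode at EDX with stack ECX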

Post by rwosdev »

UPDATE: I ran OllyDbg in a Windows XP VM, opened a program, set EAX = 1, and changed the first instruction to CPUID. When run, bit 11 (SEP) in EDX is actually 0, indicating that SYSENTER is not enabled/supported on VirtualBox. EDX = 0xBFEBFBFF for those interested.
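
For anyone repeating this check outside a debugger, here is a minimal C sketch of the bit test. It assumes the EDX value from CPUID leaf 1 has already been captured somehow (as with OllyDbg above); `CPUID_EDX_SEP` and `edx_has_sep` are illustrative names, not part of any existing API.

```c
#include <stdint.h>

/* CPUID leaf 1 reports SEP (SYSENTER/SYSEXIT present) in EDX bit 11,
 * counting from bit 0 as the least significant bit, i.e. mask 0x800. */
#define CPUID_EDX_SEP (UINT32_C(1) << 11)

/* Returns 1 if the SEP flag is set in a raw leaf-1 EDX value, else 0. */
int edx_has_sep(uint32_t edx)
{
    return (edx & CPUID_EDX_SEP) != 0;
}
```

Bits are counted from 0 at the least significant end. Applied to the EDX value quoted above (0xBFEBFBFF), this test returns 1; the clear bit in that nibble is bit 10, which is reserved, so the bit position may be worth recounting before concluding anything from the flag alone.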

I can use OllyDbg to assemble and call int 0x2E in XP fine, but using OllyDbg to assemble sysenter results in an invalid opcode fault.

Is it possible to force enable it?
iansjack
Member
Posts: 4706
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Post by iansjack »

If the software emulator in VirtualBox doesn't support SYSENTER then I don't see that there is any way to force it to do so (other than rewriting the emulator). It's open source, so you could always have a go.

Fortunately you have two other solutions, so I think you have to accept that your code won't run on VirtualBox.

Post by rwosdev »

Those solutions being A) only use interrupts, or B) use a stub that checks CPUID to decide whether to issue an interrupt or a SYSENTER, with both handlers calling the same code to process the syscall, correct?

Btw, do you think performance would be better running CPUID on every call or by storing the result locally and checking that each time?
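
To make the "store the result" variant of option B concrete, here is a rough C sketch. Everything in it is hypothetical: `syscall_via_interrupt` and `syscall_via_sysenter` are placeholders for the real entry stubs, and `probe_sysenter` stands in for whatever CPUID-based check ends up being used (here it just reads a test knob and counts how often it runs).

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*syscall_fn)(uint32_t number, uint32_t arg);

/* Placeholder for the `int`-based entry path (hypothetical). */
int32_t syscall_via_interrupt(uint32_t number, uint32_t arg)
{
    (void)number; (void)arg;
    return 1;
}

/* Placeholder for the `sysenter`-based entry path (hypothetical). */
int32_t syscall_via_sysenter(uint32_t number, uint32_t arg)
{
    (void)number; (void)arg;
    return 2;
}

unsigned probe_calls;          /* how many times the probe has run */
int fake_sysenter_supported;   /* test knob standing in for the CPUID result */

/* Stand-in for the expensive feature check. */
int probe_sysenter(void)
{
    probe_calls++;
    return fake_sysenter_supported;
}

static syscall_fn cached_entry = NULL;   /* chosen on the first call only */

/* Probes once, remembers the chosen entry path, then reuses it. */
int32_t do_syscall(uint32_t number, uint32_t arg)
{
    if (cached_entry == NULL) {
        cached_entry = probe_sysenter() ? syscall_via_sysenter
                                        : syscall_via_interrupt;
    }
    return cached_entry(number, arg);
}
```

With this shape the per-call cost is one load, one compare, and an indirect call, and the probe runs a single time no matter how many syscalls follow.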

Post by iansjack »

Or C) use qemu rather than VirtualBox.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
rwosdev wrote:Btw, do you think performance would be better running CPUID on every call or by storing the result locally and checking that each time?
It depends on the situation. CPUID is serialising and wipes out 4 registers (so in some cases you might need to save these registers and reload them after); but in extreme cases a memory access can be a TLB miss (involving many cache misses) followed by a cache miss.

However; the information returned by CPUID is full of quirks and bugs and missing information. For example, for SYSENTER alone, there's an Intel CPU that says it's supported when it isn't, and if the flag is set then it could mean "only supported in 32-bit code" (AMD CPUs) or "supported in both 32-bit and 64-bit code" (Intel CPUs). If you care about more than one specific little thing it becomes a nightmare.
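
As a sketch of what "correcting" the SEP flag could look like in C: the Intel check below follows the test described in Intel's documentation for SYSENTER (family 6, model below 3, stepping below 3 report SEP without supporting the instruction), and the AMD handling follows the 32-bit-only description above. The names `sep_vendor`, `sysenter_flags` and `correct_sep`, and the `has_long_mode` parameter, are assumptions made up for this example.

```c
#include <stdint.h>

enum sep_vendor { SEP_VENDOR_OTHER = 0, SEP_VENDOR_INTEL, SEP_VENDOR_AMD };

struct sysenter_flags {
    int sysenter32;   /* usable from 32-bit code */
    int sysenter64;   /* usable from 64-bit code */
};

/* leaf1_eax / leaf1_edx are the raw values returned by CPUID with EAX = 1. */
struct sysenter_flags correct_sep(enum sep_vendor vendor, uint32_t leaf1_eax,
                                  uint32_t leaf1_edx, int has_long_mode)
{
    struct sysenter_flags flags = { 0, 0 };
    uint32_t stepping = leaf1_eax & 0xF;
    uint32_t model    = (leaf1_eax >> 4) & 0xF;
    uint32_t family   = (leaf1_eax >> 8) & 0xF;

    /* Families 6 and 15 extend the model number with bits 19:16. */
    if (family == 6 || family == 15)
        model |= ((leaf1_eax >> 16) & 0xF) << 4;

    /* SEP is EDX bit 11; without it nothing is supported. */
    if ((leaf1_edx & (UINT32_C(1) << 11)) == 0)
        return flags;

    /* Early Intel family 6 parts set SEP without having SYSENTER. */
    if (vendor == SEP_VENDOR_INTEL && family == 6 && model < 3 && stepping < 3)
        return flags;

    flags.sysenter32 = 1;
    /* Conservative assumption: only Intel is treated as 64-bit capable here. */
    flags.sysenter64 = (vendor == SEP_VENDOR_INTEL) && has_long_mode;
    return flags;
}
```

Unknown vendors get the conservative answer for 64-bit code; a real table would presumably need more entries than this.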

For this reason, during boot I'd have code that builds your own "CPU information structure" from the raw/dodgy data that CPUID returns (with a large quantity of checks, plus static tables to obtain information that CPUID lacks from the "vendor:family:model:stepping" information). This might include:
  • Getting feature flags from all over the place (from Intel's area, from AMD's area, from Centaur/VIA's area, ...), correcting them, splitting them up where necessary (e.g. have a "SYSENTER32" and a "SYSENTER64" flag instead of a single flag), etc.
  • Getting cache characteristics and topology information into a nice sanity checked standard format (rather than the "multiple different methods for each different vendor" nonsense)
  • Converting "family:model:stepping" into a form that isn't an extremely stupid joke ("all CPUs from the last 2 whole decades are the same family=6 because of anti-competitive hacks we put in our compiler")
  • Getting a brand string, including determining what it might be from "vendor:family:model:stepping" tables on older CPUs that don't give you one, including replacing worthless information with useful information (e.g. replace "xeon" with "xeon (sandy bridge)" so that people have some clue what the CPU actually is without searching the Internet for "E3-1234"), and including stripping any ugly white-space/padding.
  • Converting "vendor string" into something readable (e.g. "Intel Corporation" rather than "GenuineIntel", "Rise Technologies" rather than "RiseRiseRise", ...), plus maybe an integer/enum that software can use efficiently (e.g. 0 = unknown, 1 = Intel, 2 = AMD, 3 = VIA, ...)
  • Building a set of flags for things that kernel needs to work around (F00F, coma, spectre, meltdown, accessed/dirty flags in page tables, ...)
  • Building a set of flags for potential problems that kernel can't work around (as a way to warn user)
  • Adding any information you might want that CPUID never returns (e.g. maybe socket type, TDP, year of first release)
Then; after you've spent months writing and testing all of this, you'd ban the use of the CPUID instruction so that software/applications written today don't fail next week when Intel release a CPU that says something else is supported when it isn't.
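
Two of the smaller items in that list (vendor string conversion and brand string clean-up) can be sketched in C as below. This is only an illustration under assumptions: the table is deliberately tiny, the readable names are my own choices, and `vendor_id_from_regs`, `lookup_vendor` and `tidy_brand` are hypothetical helper names. The register order (EBX, EDX, ECX) is how CPUID leaf 0 lays out the vendor string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum cpu_vendor {
    CPU_VENDOR_UNKNOWN = 0,
    CPU_VENDOR_INTEL   = 1,
    CPU_VENDOR_AMD     = 2,
    CPU_VENDOR_VIA     = 3
};

struct vendor_entry {
    const char *id;          /* raw 12-character CPUID vendor string */
    enum cpu_vendor vendor;  /* compact value for software to test */
    const char *name;        /* readable name for display */
};

static const struct vendor_entry vendor_table[] = {
    { "GenuineIntel", CPU_VENDOR_INTEL, "Intel Corporation" },
    { "AuthenticAMD", CPU_VENDOR_AMD,   "Advanced Micro Devices" },
    { "CentaurHauls", CPU_VENDOR_VIA,   "Centaur/VIA" },
};

static const struct vendor_entry unknown_vendor =
    { "", CPU_VENDOR_UNKNOWN, "Unknown vendor" };

/* Builds the vendor string from the leaf-0 registers, low byte first. */
void vendor_id_from_regs(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
    out[12] = '\0';
}

/* Never returns NULL; unrecognised strings map to the "unknown" entry. */
const struct vendor_entry *lookup_vendor(const char *id)
{
    for (size_t i = 0; i < sizeof vendor_table / sizeof vendor_table[0]; i++)
        if (strcmp(vendor_table[i].id, id) == 0)
            return &vendor_table[i];
    return &unknown_vendor;
}

/* Strips leading/trailing spaces and collapses runs of spaces, in place. */
void tidy_brand(char *s)
{
    char *dst = s;
    int pending_space = 0;
    for (const char *src = s; *src != '\0'; src++) {
        if (*src == ' ') {
            pending_space = (dst != s);   /* ignore spaces before any text */
            continue;
        }
        if (pending_space) {
            *dst++ = ' ';
            pending_space = 0;
        }
        *dst++ = *src;
    }
    *dst = '\0';
}
```

The rest of the structure (corrected feature flags, cache and topology data, errata flags) would hang off the same idea: raw CPUID values go in once at boot, and everything else reads the cleaned-up result.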


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by rwosdev »

Wow Brendan! It's a shame how they could mess up CPUID so badly
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Post by Schol-R-LEA »

rwosdev wrote:Wow Brendan! It's a shame how they could mess up CPUID so badly
Well, one thing to keep in mind is that CPUID is something which you would normally only run once, at boot time (or better still, IMAO, at installation time, with it being only re-run at boot if a boot-time sanity check indicates that the hardware has changed), and for the reasons given by Brendan, it is never taken as definitive on its own.

The solution I would recommend is to wrap the system calls in a userland shared/dynamic library, which the system would configure once and only once (OaOO) to use the correct form as needed. The library routines wouldn't need to repeatedly test the CPU type, either; rather, the correct version of the library would be selected at boot time (or, again, even at install time). I don't know if this would be practical for you at this stage, though.

A simpler way might be for the kernel to pass a flag or struct of flags to each process when they are launched containing the information needed to use the right system calls (among other things). Since the OS needs to be able to pass some data to the process anyway (e.g., PID, shell arguments, etc.) this should be a straightforward extension.
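
A rough C sketch of that flag-passing idea, purely as an illustration: the layout of `process_start_info`, the `syscall_method` values, and both helper functions are invented for this example and would need to match whatever the kernel's real process start-up ABI looks like.

```c
#include <stddef.h>
#include <stdint.h>

enum syscall_method {
    SYSCALL_METHOD_INT      = 0,   /* software interrupt */
    SYSCALL_METHOD_SYSENTER = 1    /* SYSENTER/SYSEXIT */
};

/* Hypothetical block the kernel hands to every new process. */
struct process_start_info {
    uint32_t pid;
    uint32_t syscall_method;       /* one of enum syscall_method */
    uint32_t argc;
    const char *const *argv;
};

/* Kernel side: filled in once per process from the boot-time CPU data. */
struct process_start_info make_start_info(uint32_t pid, int cpu_has_sysenter32,
                                          uint32_t argc, const char *const *argv)
{
    struct process_start_info info;
    info.pid = pid;
    info.syscall_method = cpu_has_sysenter32 ? SYSCALL_METHOD_SYSENTER
                                             : SYSCALL_METHOD_INT;
    info.argc = argc;
    info.argv = argv;
    return info;
}

/* User side: the runtime reads the flag instead of running CPUID itself. */
int should_use_sysenter(const struct process_start_info *info)
{
    return info->syscall_method == SYSCALL_METHOD_SYSENTER;
}
```

The process never touches CPUID here; it trusts whatever decision the kernel made at boot.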

You mentioned that your current dev/testing host doesn't support hardware virtualization (whether Intel's VT-x or AMD's AMD-V, and the various sub-categories of each). If you don't mind me asking, what hardware are you using (CPU, specifically, though mobo/system might also help)? It might give us a better idea of what to suggest.

Note that 'hardware virtualization' is itself a bit of a can of worms. The original Intel VT-x and AMD-V virtualization (from 2005 and 2006, respectively) only apply to virtualizing the CPU instructions, and not all CPU models since have supported it (according to Wikipedia, both companies only put it in top-tier CPUs at first, but by 2015 all but some low-end Atom processors had it). Different extensions applying to interrupts, GPU access, I/O paths, etc. were added later, and they keep changing them over time even now.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.

Post by Brendan »

Hi,
rwosdev wrote:Wow Brendan! It's a shame how they could mess up CPUID so badly
It's a combination of 3 or 4 different problems:
  • Human error. Everyone makes mistakes (including CPU manufacturers), so sometimes what should happen doesn't. One example of this is the "physical address bits" field - Intel knew exactly what it should be (because this came from AMD and there were specifications before Intel adopted it), but they still managed to get it wrong in at least one of their CPUs.
  • Lack of foresight. A good example of this is the way CPUID reports cache characteristics on Intel CPUs - they failed to realise that eventually people would want cache information (and started with no information); then failed to realise that eventually there'd be multi-core CPUs that share caches (and used an awful enumeration approach in "CPUID, eax=0x00000002"); and then added "CPUID, eax=0x00000004" after.
  • Lack of standardisation and cooperation between vendors. The way CPUID reports cache characteristics is a good example of this too - AMD refused to cooperate and use Intel's awful enumeration thing so they used "CPUID, eax=0x80000005 and eax=0x80000006" (which is a lot less sucky, but still failed to foresee shared caches in multi-core), then Intel refused to cooperate and support AMD's "CPUID, eax=0x80000005 and eax=0x80000006". Ideally you'd have a "trusted neutral third party" organisation creating standards that all vendors follow (instead of "tug of war with consumers/developers as the rope").
  • Backward compatibility with no "deprecation policy". Essentially, they keep adding new stuff and aren't able to remove old stuff, so the amount of junk increases forever. If they had some kind of official policy; like "new features are guaranteed to be supported for a minimum of 10 years before being deprecated/superseded; and will continue to be supported for a minimum of 5 years after they have been deprecated"; then they'd get almost all of the backward compatibility advantages but would also be able to prevent "amount of junk increases forever".
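
To illustrate the cache example, here is a C sketch that normalises two of those formats into one structure. It assumes the raw register values have already been read; the field layouts are those of "CPUID, eax=0x00000004" (EBX and ECX) and of the L1 data cache descriptor in ECX of "CPUID, eax=0x80000005", while `cache_info` and the two function names are made up for the example.

```c
#include <stdint.h>

/* One common format, whichever vendor-specific leaf the data came from. */
struct cache_info {
    uint32_t line_size;    /* bytes per cache line */
    uint32_t ways;         /* raw associativity value */
    uint64_t size_bytes;   /* total size in bytes */
};

/* Intel-style leaf 4: EBX[11:0] = line size - 1, EBX[21:12] = partitions - 1,
 * EBX[31:22] = ways - 1, ECX = number of sets - 1. */
struct cache_info cache_from_leaf4(uint32_t ebx, uint32_t ecx)
{
    struct cache_info c;
    uint64_t partitions = ((ebx >> 12) & 0x3FF) + 1;
    uint64_t sets = (uint64_t)ecx + 1;

    c.line_size = (ebx & 0xFFF) + 1;
    c.ways = ((ebx >> 22) & 0x3FF) + 1;
    c.size_bytes = (uint64_t)c.ways * partitions * c.line_size * sets;
    return c;
}

/* AMD-style 0x80000005 ECX (L1 data cache): [31:24] = size in KiB,
 * [23:16] = associativity, [15:8] = lines per tag, [7:0] = line size.
 * Special associativity encodings are passed through uninterpreted. */
struct cache_info l1d_from_amd_80000005(uint32_t ecx)
{
    struct cache_info c;
    c.line_size = ecx & 0xFF;
    c.ways = (ecx >> 16) & 0xFF;
    c.size_bytes = (uint64_t)((ecx >> 24) & 0xFF) * 1024;
    return c;
}
```

Neither decoder says anything about which cores share the cache; that information has to come from elsewhere and be sanity checked separately.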

Cheers,

Brendan