nullplan wrote:Shouldn't you have 32-bit and 64-bit code in different places? In different tasks with different stacks?
thewrongchristian wrote:What are you actually trying to achieve? Are you mixing 32-bit and 64-bit kernel code?
Octocontrabass wrote:Why do you want to switch between the two modes?
Long story short, my kernel is 32-bit and builds with a 32-bit toolchain. For experimental long mode support, I added a thin shim that mostly just lets the kernel handle events vectored through the IDT.
I also plan to use the same shim to support a few more instructions, such as wider integer mul/div. That should be a nice speedup over software emulation of these operations.
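To illustrate the kind of software emulation a single 64-bit MUL would replace, here is a minimal sketch of a 64x64 -> 128 multiply built from 32-bit partial products. The names `u128_parts` and `mul64x64_soft` are made up for this example and are not from my kernel.

```c
#include <stdint.h>

/* Hypothetical result type: a 128-bit product split into two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_parts;

/* Schoolbook multiply: four 32x32 -> 64 partial products, then sum the
 * columns. This is what a 32-bit build has to do by hand; in 64-bit mode
 * a single MUL leaves the same result in RDX:RAX. */
static u128_parts mul64x64_soft(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint64_t p0 = (uint64_t)a_lo * b_lo;   /* weight 2^0  */
    uint64_t p1 = (uint64_t)a_lo * b_hi;   /* weight 2^32 */
    uint64_t p2 = (uint64_t)a_hi * b_lo;   /* weight 2^32 */
    uint64_t p3 = (uint64_t)a_hi * b_hi;   /* weight 2^64 */

    /* Sum of everything landing in bits 32..63; it is below 2^34, so it
     * cannot overflow, and its upper bits are the carry into the high half. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    u128_parts r;
    r.lo = (mid << 32) | (uint32_t)p0;
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

Whether the shim actually beats this depends on how expensive the mode switch turns out to be, which I have not measured.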
That's why I need to switch modes, and hopefully quickly. This is not for changing privilege level or task switching.
Octocontrabass wrote:You're doing something wrong if you need hacks like this to get your assembler to generate the opcodes you want. Perhaps you forgot a .code64 directive or "q" suffix?
Yes, this is not how it's supposed to be done; I had to bend the toolchain slightly backwards to generate the needed code. I know it doesn't support the q suffix, but I still need to check whether it accepts .code64. Not holding my breath, though.
Octocontrabass wrote: inconvenient limitations (as I'm sure you've already noticed), including one limitation that's specific to AMD CPUs.
Could you share the limitations? I'm very new to 64-bit, and aside from having to switch modes (and restore SS for IDT events), the only obvious limitation I've noticed is that VMX instructions cause #UD in compatibility mode. They do seem to work if I wrap them in the shim, so I doubt there's a good reason for the limitation.
thewrongchristian wrote:If this is to handle 32-bit processes, I don't understand why you'd need 32-bit kernel code, other than some glue code to translate 32-bit syscalls to native 64-bit handlers (the glue itself can be 64-bit native code.)
I might attempt the reverse later. It seems feasible if I set up a 64-bit user CS and iret to it. That way, user space apps could access the extra GPRs and XMM registers, even though they'd still be limited to a 4 GB address space.
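For the descriptor side of that idea, a rough sketch of how a 64-bit user code segment could be encoded. The helper names and the GDT index in the usage note are hypothetical; only the bit positions come from the architecture manuals. It covers just the encoding, not the iret frame itself.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions within an 8-byte code segment descriptor. */
#define DESC_TYPE_CODE_XR (0xAull << 40)          /* code, execute/read   */
#define DESC_S            (1ull << 44)            /* code/data, not system */
#define DESC_DPL(n)       ((uint64_t)(n) << 45)   /* privilege level      */
#define DESC_P            (1ull << 47)            /* present              */
#define DESC_L            (1ull << 53)            /* 64-bit code segment  */

/* Hypothetical helper: build a 64-bit code descriptor for the given DPL.
 * The D bit (bit 54) is left clear, as required when L is set; base and
 * limit are ignored for code segments in 64-bit mode, so they stay zero. */
static uint64_t make_code64_desc(unsigned dpl)
{
    assert(dpl <= 3);
    return DESC_TYPE_CODE_XR | DESC_S | DESC_DPL(dpl) | DESC_P | DESC_L;
}

/* Hypothetical helper: selector for a GDT entry (TI = 0) with the given RPL. */
static uint16_t make_selector(unsigned index, unsigned rpl)
{
    assert(rpl <= 3);
    return (uint16_t)((index << 3) | rpl);
}
```

With these, `make_code64_desc(3)` gives 0x0020FA0000000000, and if that descriptor were placed at, say, GDT index 5, the CS value to put in the iret frame would be `make_selector(5, 3)`, i.e. 0x2B.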