Running 64-bit code in 32-bit x86 OS

Post by rdos »

Is there a convenient way of running some 64-bit executables with a 32-bit kernel? When AMD designed x86-64, they assumed that the kernel is 64-bit, and that it uses a sub-mode for running 32-bit code. However, that is not a possible option in this case as I don't want a 64-bit kernel that runs protected mode applications in an "emulation mode".

IOW, when kernel code or 32-bit applications are running, the system should be in 32-bit mode, and syscalls should enter a 32-bit kernel. When a 64-bit application is running, the system must be set up in 64-bit mode, and needs to switch to 32-bit mode when syscalls are executed (in addition to some pointer relocations that might be needed and that can be handled with paging).

The issue is if it is feasible to switch between 32-bit and 64-bit mode relatively frequently? A special problem is IRQs, that in 64-bit mode will need to execute some dummy 64-bit IRQ stub which switches to 32-bit mode and executes the correct 32-bit IRQ (later some IRQs might provide bimodal handlers, but initially switching to 32-bit mode will be needed).

So, what is the overhead of switching from 64-bit mode to 32-bit mode and the reverse? What exactly would need to be reloaded? The IDTR would be obvious, but how about GDTR (shouldn't be needed as 64-bit mode is non-segmented)? Also, there would be a need to reload CR3 to change paging, but 32-bit and 64-bit paging could live more or less side-by-side, with 64-bit paging just mapping the kernel like the 32-bit paging does (which would be static), and mapping its own userland space.

Post by linguofreak »

I don't know details, but OS X 10.5 apparently introduced general 64-bit application support, whereas a 64-bit kernel did not appear before 10.6.

Also, although it's probably not quite what you're looking for, CPUs with VT-x can run 64-bit kernels in virtualization under a 32-bit host kernel.

Post by turdus »

To support 64 bit userspace, you'll need a 64 bit kernel, that's for sure. Good news is, it can run 32 bit apps as well, with a really minimal overhead. With a little precaution, the same syscall can work with both 32 bit and 64 bit applications (kernel is in long mode, sysret knows whether to switch to compatibility mode on return or not). You can make it work with int syscalls too, but you have to code a lot: before iret, you have to examine the stack, get the cs selector and look it up in the GDT to see if it's a 32 bit or 64 bit segment, and return accordingly. In this case overhead can be notable.
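
The selector lookup described above comes down to testing the L bit (bit 53) of the code-segment descriptor. A minimal sketch in C, assuming the GDT can be read as an array of raw 8-byte entries; the helper name `saved_cs_is_64bit` is made up for illustration, and LDT selectors are simply rejected:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions inside an 8-byte code-segment descriptor. */
#define DESC_L_BIT (1ULL << 53) /* 1 = 64-bit code segment (long mode only) */
#define DESC_D_BIT (1ULL << 54) /* 1 = 32-bit default operand size */

/* Hypothetical helper: returns true if the CS selector saved on the
 * interrupt stack refers to a 64-bit code segment in the GDT.
 * Selectors that point into the LDT (TI bit set) are not handled. */
static bool saved_cs_is_64bit(const uint64_t *gdt, size_t gdt_entries,
                              uint16_t cs_selector)
{
    size_t index = cs_selector >> 3; /* drop RPL (bits 0-1) and TI (bit 2) */

    if ((cs_selector & 0x4) || index >= gdt_entries)
        return false;
    return (gdt[index] & DESC_L_BIT) != 0;
}
```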

Post by bluemoon »

rdos wrote:Is there a convenient way of running some 64-bit executables with a 32-bit kernel? When AMD designed x86-64, they assumed that the kernel is 64-bit, and that it uses a sub-mode for running 32-bit code. However, that is not a possible option in this case as I don't want a 64-bit kernel that runs protected mode applications in an "emulation mode".
Compatibility mode is not emulation: instructions are executed as-is. However, the backing structures for paging, the GDT/IDT/TSS, etc. are all in 64-bit long mode format.
rdos wrote:IOW, when kernel code or 32-bit applications are running, the system should be in 32-bit mode, and syscalls should enter a 32-bit kernel. When a 64-bit application is running, the system must be set up in 64-bit mode, and needs to switch to 32-bit mode when syscalls are executed (in addition to some pointer relocations that might be needed and that can be handled with paging).
True. However, things may be sped up by using the SYSCALL instruction, where the kernel CS (32-bit) is set in an MSR and doesn't need to be validated every time.
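
For reference, the MSR in question is STAR (0xC0000081): bits 47:32 hold the selector base used by SYSCALL and bits 63:48 the base used by SYSRET. A rough sketch of how the selectors are derived from it, as I understand the architecture manuals; the function names are invented for this example, and the WRMSR that would actually program the register is left out:

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_STAR 0xC0000081u /* SYSCALL/SYSRET selector bases + legacy EIP */

/* Compose a STAR value from the two selector bases and the 32-bit
 * legacy-mode SYSCALL entry point. */
static uint64_t make_star(uint16_t syscall_base, uint16_t sysret_base,
                          uint32_t legacy_eip)
{
    return ((uint64_t)sysret_base << 48) |
           ((uint64_t)syscall_base << 32) |
           legacy_eip;
}

/* SYSCALL loads CS from STAR[47:32] (RPL forced to 0) and SS from +8. */
static uint16_t star_syscall_cs(uint64_t star)
{
    return (uint16_t)((star >> 32) & 0xFFFC);
}

static uint16_t star_syscall_ss(uint64_t star)
{
    return (uint16_t)(star_syscall_cs(star) + 8);
}

/* SYSRET loads CS from STAR[63:48] (+16 when returning to 64-bit code)
 * and SS from STAR[63:48] + 8, both with RPL forced to 3. */
static uint16_t star_sysret_cs(uint64_t star, bool to_64bit)
{
    uint16_t base = (uint16_t)(star >> 48);
    return (uint16_t)((base + (to_64bit ? 16 : 0)) | 3);
}

static uint16_t star_sysret_ss(uint64_t star)
{
    uint16_t base = (uint16_t)(star >> 48);
    return (uint16_t)((base + 8) | 3);
}
```

The fixed +8/+16 offsets are why the GDT layout has to be planned around these instructions.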
rdos wrote:The issue is if it is feasible to switch between 32-bit and 64-bit mode relatively frequently? A special problem is IRQs, that in 64-bit mode will need to execute some dummy 64-bit IRQ stub which switches to 32-bit mode and executes the correct 32-bit IRQ (later some IRQs might provide bimodal handlers, but initially switching to 32-bit mode will be needed).
I couldn't guess the performance, but I think IRQs are not triggered too frequently, and the core IRQ routine is short and can be written natively in 64-bit anyway.
rdos wrote:So, what is the overhead of switching from 64-bit mode to 32-bit mode and the reverse? What exactly would need to be reloaded? The IDTR would be obvious, but how about GDTR (shouldn't be needed as 64-bit mode is non-segmented)? Also, there would be a need to reload CR3 to change paging, but 32-bit and 64-bit paging could live more or less side-by-side, with 64-bit paging just mapping the kernel like the 32-bit paging does (which would be static), and mapping its own userland space.
You just need to reload CS to enter/leave compatibility mode. The GDT, IDT and page tables (maybe more) are always in 64-bit format in both long mode and compatibility mode. This means you would need a 64-bit wrapper (or a rewrite) for some critical code (i.e. ISRs, paging code).

Post by rdos »

Reading up a little more about paging, I think it might be more convenient to use PAE paging on systems with 64-bit support. That's because the page tables of PAE are similar (virtually identical) to the IA32e version, so this means 64-bit code can reference data in the 32-bit kernel without having two copies of the page tables. IOW, the first step would be to add support for PAE paging to the existing 32-bit kernel.
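
To make the similarity concrete, here is a small sketch of how a virtual address splits into table indices under PAE (2+9+9+12 bits) and under 4-level IA32e paging (9+9+9+9+12 bits). The struct and function names are my own; the point is only that for addresses below 4 GiB the PDPT, PD and PT indices come out the same in both schemes.

```c
#include <stdint.h>

struct pae_indices  { unsigned pdpt, pd, pt, offset; };
struct ia32e_indices { unsigned pml4, pdpt, pd, pt, offset; };

/* PAE: 2-bit PDPT index, 9-bit PD index, 9-bit PT index, 12-bit offset. */
static struct pae_indices split_pae(uint32_t va)
{
    struct pae_indices ix;
    ix.pdpt   = (va >> 30) & 0x3;
    ix.pd     = (va >> 21) & 0x1FF;
    ix.pt     = (va >> 12) & 0x1FF;
    ix.offset = va & 0xFFF;
    return ix;
}

/* IA32e: one extra level (PML4) and a full 9-bit PDPT index. */
static struct ia32e_indices split_ia32e(uint64_t va)
{
    struct ia32e_indices ix;
    ix.pml4   = (unsigned)((va >> 39) & 0x1FF);
    ix.pdpt   = (unsigned)((va >> 30) & 0x1FF);
    ix.pd     = (unsigned)((va >> 21) & 0x1FF);
    ix.pt     = (unsigned)((va >> 12) & 0x1FF);
    ix.offset = (unsigned)(va & 0xFFF);
    return ix;
}
```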

Next, with the ability of 64-bit code to access data in the 32-bit kernel, it might be more convenient to just switch to "legacy mode" from 64-bit IRQs rather than do the full mode switch. Legacy mode runs with the 64-bit page tables, so the only thing needed is new 64-bit IRQ stubs that do far calls to legacy interrupt handlers, much the same as the 32-bit version. That should take care of the IRQ issue.

Handling syscalls shouldn't be too problematic either. It would use the SYSCALL mechanism, which already has support in my 32-bit kernel (with SYSENTER). Once in the kernel, the correct handler is invoked with a far jmp to "legacy mode", much the same way it already works.

With this new design, a 64-bit process could always run in 64-bit mode, with syscalls and IRQs being chained to legacy-mode with minimal overhead. The switch between modes could instead be done in the scheduler. The scheduler will set the correct mode based on the process bitness. This makes the switch less critical.

The most interesting aspect is that I can still target V86 mode without emulation, as V86 mode is available only when the processor is in 32-bit mode.

Post by Brendan »

Hi,
rdos wrote:Is there a convenient way of running some 64-bit executables with a 32-bit kernel?
That depends on how you define "convenient"... ;)
rdos wrote:IOW, when kernel code or 32-bit applications are running, the system should be in 32-bit mode, and syscalls should enter a 32-bit kernel. When a 64-bit application is running, the system must be set up in 64-bit mode, and needs to switch to 32-bit mode when syscalls are executed (in addition to some pointer relocations that might be needed and that can be handled with paging).
For long mode, all interrupt handlers (including IRQ handlers, IPI handlers, exception handlers and software interrupt handlers) must point to 64-bit code (or 64-bit stubs that internally call 32-bit handlers); and the IDT and IDT descriptors must also be different to suit long mode. Despite this; you could have a set of 64-bit "stubs" that do nothing more than call 32-bit code to do the real work.
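
For illustration, a long mode IDT entry is 16 bytes (versus 8 in protected mode) and carries a 64-bit handler offset. A minimal sketch of filling one in; the struct layout follows the architecture manuals, while the names `idt_gate64` and `make_interrupt_gate64` are just placeholders for this example:

```c
#include <stdint.h>

/* One long mode IDT entry (16 bytes). */
struct idt_gate64 {
    uint16_t offset_low;  /* handler address bits 0-15 */
    uint16_t selector;    /* must select a 64-bit code segment */
    uint8_t  ist;         /* bits 0-2: IST index, 0 = use current stack rules */
    uint8_t  type_attr;   /* P (bit 7), DPL (bits 5-6), type (0xE = interrupt gate) */
    uint16_t offset_mid;  /* handler address bits 16-31 */
    uint32_t offset_high; /* handler address bits 32-63 */
    uint32_t reserved;    /* must be zero */
};

/* Build a present, DPL 0 interrupt gate pointing at a 64-bit stub. */
static struct idt_gate64 make_interrupt_gate64(uint64_t handler,
                                               uint16_t code_selector)
{
    struct idt_gate64 g;
    g.offset_low  = (uint16_t)(handler & 0xFFFF);
    g.selector    = code_selector;
    g.ist         = 0;
    g.type_attr   = 0x8E; /* P=1, DPL=0, 64-bit interrupt gate */
    g.offset_mid  = (uint16_t)((handler >> 16) & 0xFFFF);
    g.offset_high = (uint32_t)(handler >> 32);
    g.reserved    = 0;
    return g;
}
```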

The other problem with running a 32-bit kernel (via 64-bit stubs) in long mode is that paging may be "different". However, if the existing 32-bit kernel uses PAE, paging will only be a little bit different (e.g. most of the time the 32-bit kernel could just ignore the PML4 and most of one PDPT). There's no reason that 32-bit code can't change CR3 (as long as the physical address of the PML4 is in the first 4 GiB of the physical address space).

In addition to this, you'd need to have an extra "64-bit code" descriptor in the GDT. This shouldn't be hard to add to a 32-bit kernel's GDT anyway.
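
As a sketch of what that extra descriptor looks like next to an ordinary flat 32-bit one: the only differences are in the flags nibble (L=1, D=0 for 64-bit code; G=1, D=1 for flat 32-bit code). The helper `make_descriptor` is a made-up name, not anything from an existing kernel.

```c
#include <stdint.h>

/* Flags nibble (descriptor bits 52-55). */
#define FLAG_L 0x2 /* 64-bit code segment */
#define FLAG_D 0x4 /* 32-bit default operand size */
#define FLAG_G 0x8 /* limit counted in 4 KiB units */

#define ACCESS_KERNEL_CODE 0x9A /* present, DPL 0, code, readable */

/* Pack base, limit, access byte and flags into a legacy 8-byte descriptor. */
static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);             /* limit bits 0-15   */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;      /* base bits 0-23    */
    d |= (uint64_t)access << 40;                 /* access byte       */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;  /* limit bits 16-19  */
    d |= (uint64_t)(flags & 0xF) << 52;          /* AVL, L, D/B, G    */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;  /* base bits 24-31   */
    return d;
}
```

For example, `make_descriptor(0, 0, ACCESS_KERNEL_CODE, FLAG_L)` gives the 64-bit code descriptor, and `make_descriptor(0, 0xFFFFF, ACCESS_KERNEL_CODE, FLAG_G | FLAG_D)` the usual flat 32-bit one. Base and limit are ignored for 64-bit code segments, so they can be left at zero.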

There are also some restrictions - mostly it won't work if the 32-bit kernel wants to use virtual80x86 mode or hardware task switching.
rdos wrote:The issue is if it is feasible to switch between 32-bit and 64-bit mode relatively frequently?
In protected mode, the CS descriptor in the GDT or LDT determines if the code is 16-bit or 32-bit. In long mode it's very similar - the CS descriptor in the GDT or LDT determines if the code is 16-bit, 32-bit or 64-bit. Switching from 64-bit to 32-bit (or 16-bit) can be done simply by changing CS. Segment loads aren't fast (compared to normal instructions like "ADD", etc), so it may or may not be feasible depending on how frequently you do it and what sort of performance is acceptable.

Note: You would *not* reload IDTR when switching between 64-bit and 32-bit. You'd always use the same 64-bit IDT (but might have a "32-bit interrupt handler table" in RAM that the 64-bit interrupts handlers/stubs use to determine which 32-bit code to pass control to).
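
A very rough model of that "32-bit interrupt handler table" idea, in plain C so it can be compiled and run anywhere. In a real kernel the 64-bit stub would do a far call into a 32-bit code segment rather than call through a C function pointer; everything named here (`legacy_handlers`, `dispatch_from_stub`, `timer_irq`) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_VECTORS 256

typedef void (*legacy_handler_t)(uint8_t vector);

/* One slot per vector; NULL means no 32-bit handler is installed. */
static legacy_handler_t legacy_handlers[NUM_VECTORS];

/* Returns 0 on success, -1 if the vector number is out of range. */
static int register_legacy_handler(unsigned vector, legacy_handler_t handler)
{
    if (vector >= NUM_VECTORS)
        return -1;
    legacy_handlers[vector] = handler;
    return 0;
}

/* What every 64-bit stub would do after saving state: look up the
 * 32-bit handler for its vector and pass control to it.
 * Returns 1 if a handler ran, 0 if none was registered. */
static int dispatch_from_stub(uint8_t vector)
{
    legacy_handler_t handler = legacy_handlers[vector];

    if (handler == NULL)
        return 0;
    handler(vector);
    return 1;
}

/* Example handler standing in for an existing 32-bit IRQ routine. */
static unsigned timer_ticks;

static void timer_irq(uint8_t vector)
{
    (void)vector;
    timer_ticks++;
}
```

The IDT itself stays untouched when handlers change; only the table in RAM is updated.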

Of course all of the above assumes that you want to switch between 64-bit and 32-bit (which is what you asked), while staying in long mode all the time. If you actually want to switch between long mode and protected mode (which is not what you asked), then it's a bad idea (too many unsolvable race conditions - e.g. there's no way to change CPU modes, GDT, IDT and NMI handler in a safe/atomic way).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by Brendan »

Hi,
rdos wrote:The most interesting aspect is that I can still target V86 mode without emulation, as V86 mode is available only when the processor is in 32-bit mode.
Erm, no.

V86 is only available in protected mode (and it doesn't matter if the protected mode code is 16-bit or 32-bit). V86 is not available in long mode (and it doesn't matter if the long mode code is 16-bit or 32-bit or 64-bit).


Cheers,

Brendan

Post by linguofreak »

turdus wrote:To support 64 bit userspace, you'll need a 64 bit kernel, that's for sure.
From looking around on Google, I'm pretty sure that OS X is a counterexample to this, being able to run a userspace of either address-width under a kernel of either address-width.

That said, I'm pretty sure it remains in long mode the entire time, switching between the 64-bit and compatibility sub-modes. From what I know of rdos's project, though, I think it's fairly likely he'll need full protected mode (ISTR that he uses non-zero-based segments), and I don't think he'll get acceptable performance switching between legacy mode and long mode without virtualization. (Even with virtualization, I'm not familiar enough with the subject to have a good handle on how much overhead you get task-switching into and out of virtualization).

Post by rdos »

turdus wrote:To support 64 bit userspace, you'll need a 64 bit kernel, that's for sure.
I'm not so sure about that. :mrgreen:
turdus wrote:Good news is, it can run 32 bit apps as well, with a really minimal overhead.
Not at all. My future dream-model of applications is to switch from flat to a small 32-bit compact memory model, which has superior protection mechanisms. The alternative model is a 64-bit application that allocates random virtual addresses spread all around the address-space.

With a 64-bit kernel, segment protection cannot be enforced in kernel, so it is not a feasible solution for me.
turdus wrote:With a little precaution, the same syscall can work with both 32 bit and 64 bit applications (kernel is in long mode, sysret knows whether to switch to compatibility mode on return or not). You can make it work with int syscalls too, but you have to code a lot: before iret, you have to examine the stack, get the cs selector and look it up in the GDT to see if it's a 32 bit or 64 bit segment, and return accordingly. In this case overhead can be notable.
With a register-based syscall convention, the only issue is with pointers. Those either need to be converted to linear addresses (64-bit kernel), or be mapped into a 32-bit address space (32-bit kernel). The penalties of the two seem rather similar. If PAE paging is used, full support for 52-bit physical addresses is present.

Post by rdos »

Brendan wrote:For long mode, all interrupt handlers (including IRQ handlers, IPI handlers, exception handlers and software interrupt handlers) must point to 64-bit code (or 64-bit stubs that internally call 32-bit handlers); and the IDT and IDT descriptors must also be different to suit long mode. Despite this; you could have a set of 64-bit "stubs" that do nothing more than call 32-bit code to do the real work.
The most problematic "feature" of long mode is that you cannot use call gates to directly transfer control from a 32-bit application to the destination handler in the 32-bit kernel. This is still the most efficient method on many processors. That's why I want the processor to run in protected mode, and not in long mode, when the application is 32-bit. IRQs actually seem to incur no penalty at all since they can chain to 32-bit code much the same way existing code already does.
Brendan wrote:The other problem with running a 32-bit kernel (via 64-bit stubs) in long mode is that paging may be "different". However, if the existing 32-bit kernel uses PAE, paging will only be a little bit different (e.g. most of the time the 32-bit kernel could just ignore the PML4 and most of one PDPT). There's no reason that 32-bit code can't change CR3 (as long as the physical address of the PML4 is in the first 4 GiB of the physical address space).
It doesn't use PAE, but I could probably add this to the existing code. I still want to support non-PAE mode though, as not all processors support PAE. An additional benefit of PAE is that I could use more than 4G of physical memory.
Brendan wrote:There are also some restrictions - mostly it won't work if the 32-bit kernel wants to use virtual80x86 mode or hardware task switching.
V86 mode actually works since I plan to run non-64-bit applications in protected mode, and not in long mode. I no longer use hardware task switching other than for double fault. BTW, V86 mode support is essential for being able to switch video-modes, which is a feature I like to keep. At least until all video BIOSes support something better than VBE.
Brendan wrote:Note: You would *not* reload IDTR when switching between 64-bit and 32-bit. You'd always use the same 64-bit IDT (but might have a "32-bit interrupt handler table" in RAM that the 64-bit interrupts handlers/stubs use to determine which 32-bit code to pass control to).
I would when I run 32-bit applications because of the lack of proper functionality of call-gates and interrupts in long mode. The 64-bit application must use SYSCALL to enter kernel, but the 32-bit (or 16-bit) application can still use call-gates if that is faster on the particular processor. As you might know, there is no real advantage in my kernel of not reloading segment registers as the kernel ultimately needs to reload these anyway.

Post by rdos »

Brendan wrote:Of course all of the above assumes that you want to switch between 64-bit and 32-bit (which is what you asked), while staying in long mode all the time. If you actually want to switch between long mode and protected mode (which is not what you asked), then it's a bad idea (too many unsolvable race conditions - e.g. there's no way to change CPU modes, GDT, IDT and NMI handler in a safe/atomic way).
Why would there be a need to change CPU modes in an atomic way? If this is done by the scheduler as it schedules a new task, there are no atomic issues. NMI might be a problem, but as it is infrequently used for anything significant, I'll just ignore that issue. The mode switch would simply execute with interrupts disabled.

But I really don't like that AMD defined the state diagram for paging so that a switch between PAE and IA32e mode needs to go through disabling paging, but that is a solvable issue if the code is copied to an identity-mapped region, and the mode switch is carried out there. I've solved similar issues in the boot-strap code of AP cores.
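
To spell out the ordering constraint, here is a toy model of the relevant control bits (CR0.PG, CR4.PAE, EFER.LME, EFER.LMA). It is only a simulation in ordinary C, not kernel code: the `cpu_model` struct and the functions are invented for this sketch, a `false` return stands in for the #GP real hardware would raise, and details such as reloading CR3 and running from a compatibility-mode code segment in the identity-mapped region are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PG   (1u << 31)
#define CR4_PAE  (1u << 5)
#define EFER_LME (1u << 8)  /* long mode enable (software-controlled) */
#define EFER_LMA (1u << 10) /* long mode active (set by the CPU) */

struct cpu_model { uint32_t cr0, cr4, efer; };

/* Model of a CR0 write: LMA follows LME whenever paging is switched. */
static bool write_cr0(struct cpu_model *c, uint32_t value)
{
    bool enabling = (value & CR0_PG) && !(c->cr0 & CR0_PG);

    /* Long mode cannot be activated without PAE paging. */
    if (enabling && (c->efer & EFER_LME) && !(c->cr4 & CR4_PAE))
        return false;

    c->cr0 = value;
    if ((value & CR0_PG) && (c->efer & EFER_LME))
        c->efer |= EFER_LMA;
    else
        c->efer &= ~EFER_LMA;
    return true;
}

/* Model of changing EFER.LME: not allowed while paging is enabled. */
static bool write_efer_lme(struct cpu_model *c, bool lme)
{
    bool current = (c->efer & EFER_LME) != 0;

    if ((c->cr0 & CR0_PG) && lme != current)
        return false;
    if (lme)
        c->efer |= EFER_LME;
    else
        c->efer &= ~EFER_LME;
    return true;
}

/* Long mode -> PAE protected mode, in the only order the model accepts. */
static bool long_mode_to_pae(struct cpu_model *c)
{
    return write_cr0(c, c->cr0 & ~CR0_PG)  /* 1. paging off, LMA clears   */
        && write_efer_lme(c, false)        /* 2. clear LME                */
        && write_cr0(c, c->cr0 | CR0_PG);  /* 3. paging back on, PAE mode */
}
```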

Also, AFAIK, there should be no issue with one core running in long mode and another in protected mode.

Post by rdos »

An additional complication is to provide new fault-handlers for long mode that can handle faults in kernel in a transparent way. That doesn't look like a trivial issue given the differences in the stack-layout between long mode and protected mode handlers. The most attractive option probably is to recreate the stack-layout of the protected mode handlers, and then just chain to them. The only exception being the pagefault handler which needs to be truly bimodal.
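
A sketch of what recreating the stack layout could look like, assuming the protected mode handlers expect the five-dword frame pushed on a privilege change. In long mode the CPU always pushes SS:RSP, RFLAGS, CS and RIP as 8-byte slots; in protected mode the slots are 4 bytes and SS:ESP is only present when the privilege level changes. The struct and function names are illustrative only.

```c
#include <stdint.h>

/* Frame pushed by the CPU in long mode (lowest address first). */
struct frame64 { uint64_t rip, cs, rflags, rsp, ss; };

/* Frame a protected mode handler sees after a privilege change. */
struct frame32 { uint32_t eip, cs, eflags, esp, ss; };

/* Rebuild a protected mode style frame from a long mode one.
 * Returns 0 on success, -1 if RIP or RSP does not fit in 32 bits
 * (i.e. the interrupted code was not running below 4 GiB). */
static int frame64_to_frame32(const struct frame64 *in, struct frame32 *out)
{
    if (in->rip > UINT32_MAX || in->rsp > UINT32_MAX)
        return -1;

    out->eip    = (uint32_t)in->rip;
    out->cs     = (uint32_t)(in->cs & 0xFFFF);
    out->eflags = (uint32_t)in->rflags;
    out->esp    = (uint32_t)in->rsp;
    out->ss     = (uint32_t)(in->ss & 0xFFFF);
    return 0;
}
```

The -1 case is where a fault taken in 64-bit application code would have to be handled by a bimodal handler instead of being chained.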

Post by Brendan »

Hi,
rdos wrote:
Brendan wrote:For long mode, all interrupt handlers (including IRQ handlers, IPI handlers, exception handlers and software interrupt handlers) must point to 64-bit code (or 64-bit stubs that internally call 32-bit handlers); and the IDT and IDT descriptors must also be different to suit long mode. Despite this; you could have a set of 64-bit "stubs" that do nothing more than call 32-bit code to do the real work.
The most problematic "feature" of long mode is that you cannot use call gates to directly transfer control from a 32-bit application to the destination handler in the 32-bit kernel. This is still the most efficient method on many processors. That's why I want the processor to run in protected mode, and not in long mode, when the application is 32-bit. IRQs actually seem to incur no penalty at all since they can chain to 32-bit code much the same way existing code already does.
For long mode; call gates must switch to 64-bit code; interrupts must switch to 64-bit code and instructions like SYSCALL/SYSENTER must switch to 64-bit code. The penalty of having stubs that switch back to 32-bit code is identical in all of these cases. It's a generic penalty (e.g. the penalty of not having a 64-bit kernel) that can't be avoided regardless of what you do (unless you have a 64-bit kernel).
rdos wrote:
Brendan wrote:There are also some restrictions - mostly it won't work if the 32-bit kernel wants to use virtual80x86 mode or hardware task switching.
V86 mode actually works since I plan to run non-64-bit applications in protected mode, and not in long mode. I no longer use hardware task switching other than for double fault. BTW, V86 mode support is essential for being able to switch video-modes, which is a feature I like to keep. At least until all video BIOSes support something better than VBE.
Being able to switch video modes (after boot) is not essential, and no amount of V86 is going to help for modern UEFI systems anyway.

Of course nothing prevents you from implementing or porting a real mode emulator and continuing to use VBE (if it's actually available) in long mode.
rdos wrote:
Brendan wrote:Note: You would *not* reload IDTR when switching between 64-bit and 32-bit. You'd always use the same 64-bit IDT (but might have a "32-bit interrupt handler table" in RAM that the 64-bit interrupts handlers/stubs use to determine which 32-bit code to pass control to).
I would when I run 32-bit applications because of the lack of proper functionality of call-gates and interrupts in long mode. The 64-bit application must use SYSCALL to enter kernel, but the 32-bit (or 16-bit) application can still use call-gates if that is faster on the particular processor. As you might know, there is no real advantage in my kernel of not reloading segment registers as the kernel ultimately needs to reload these anyway.
Switching between long mode and protected mode means completely destroying TLBs and reloading almost everything (TSS, IDT, all segment registers, etc). Switching back again is equally expensive. For 64-bit applications; the total cost of this (including TLB misses, etc and not just the switch itself) is going to be several thousand cycles for every system call, IRQ and exception. Running 32-bit code (e.g. the legacy kernel) in long mode and completely avoiding the massive amount of overhead involved with switching CPU modes is likely to be many orders of magnitude faster.
rdos wrote:
turdus wrote:To support 64 bit userspace, you'll need a 64 bit kernel, that's for sure.
I'm not so sure about that. :mrgreen:
Strangely, RDOS is right - it is entirely possible to have a 64-bit user space without a 64-bit kernel. However...

There are 2 main reasons for applications to use 64-bit. The first reason is that the application needs (or perhaps only benefits from rather than needing) the extra virtual address space. RDOS's 32-bit kernel probably won't be able to handle "greater than 4 GiB" virtual address spaces so he'll probably completely destroy this advantage. The other reason for applications to use 64-bit is that the extra registers and the extra width of registers makes code run faster. RDOS will probably also completely destroy the performance advantages too.

Basically, it is entirely possible to have a 64-bit user space without a 64-bit kernel; but after RDOS has completely destroyed any/all advantages of 64-bit it's going to be utterly pointless.
rdos wrote:
Brendan wrote:Of course all of the above assumes that you want to switch between 64-bit and 32-bit (which is what you asked), while staying in long mode all the time. If you actually want to switch between long mode and protected mode (which is not what you asked), then it's a bad idea (too many unsolvable race conditions - e.g. there's no way to change CPU modes, GDT, IDT and NMI handler in a safe/atomic way).
Why would there be a need to change CPU modes in an atomic way? If this is done by the scheduler as it schedules a new task, there are no atomic issues. NMI might be a problem, but as it is infrequently used for anything significant, I'll just ignore that issue. The mode switch would simply execute with interrupts disabled.
There's 3 problems - NMIs (which you're dodgy enough to ignore), Machine check exceptions (which you're probably dodgy enough to have never supported anyway), and IRQ latency (e.g. having IRQs disabled for *ages* while you switch CPU modes). I'm guessing you're dodgy enough to ignore the IRQ latency problem too.
rdos wrote:But I really don't like that AMD defined the state diagram for paging so that a switch between PAE and IA32e mode needs to go through disabling paging, but that is a solvable issue if the code is copied to an identity-mapped region, and the mode switch is carried out there. I've solved similar issues in the boot-strap code of AP cores.
CPU designers have a tendency to assume that only new code will use new CPU modes. I doubt AMD expected anyone to want to switch from long mode back to protected mode often. To be honest, I see it as a curiosity with dubious practical applications myself. They intended for new 64-bit kernels (that are capable of supporting old 32-bit and 16-bit applications), not the other way around.


Cheers,

Brendan

Post by rdos »

Brendan wrote:For long mode; call gates must switch to 64-bit code; interrupts must switch to 64-bit code and instructions like SYSCALL/SYSENTER must switch to 64-bit code. The penalty of having stubs that switch back to 32-bit code is identical in all of these cases. It's a generic penalty (e.g. the penalty of not having a 64-bit kernel) that can't be avoided regardless of what you do (unless you have a 64-bit kernel).
They can be avoided with mode switches. If 32-bit applications run in protected mode and 64-bit applications run in long mode, both types of applications can use the fastest syscalls available on a particular processor and don't need to pay a penalty for stubs (well, both SYSENTER and SYSCALL do have penalties for stubs, but those are unavoidable and part of the design).
Brendan wrote:Being able to switch video modes (after boot) is not essential, and no amount of V86 is going to help for modern UEFI systems anyway.
It does work on all the EFI/UEFI systems I've tested, but I haven't tested Macs and similar.
Brendan wrote:Switching between long mode and protected mode means completely destroying TLBs and reloading almost everything (TSS, IDT, all segment registers, etc).
1. CR3 will always be reloaded when switching from a 32-bit application to a 64-bit application (or the reverse), because they will not use the same page tables (different applications), and thus the TLB flush is inevitable

2. TR register is always reloaded with every thread-switch (per thread SS0 and IO-bitmaps)

3. Segment registers will always be reloaded on thread switches.

The only additional thing that normally won't need to be reloaded is IDTR, and the changes to CR0.
Brendan wrote:Switching back again is equally expensive. For 64-bit applications; the total cost of this (including TLB misses, etc and not just the switch itself) is going to be several thousand cycles for every system call, IRQ and exception.
IRQs and syscalls will not switch mode. Only the scheduler will switch mode as it switches between a 32-bit and 64-bit process or the reverse.
Brendan wrote:Running 32-bit code (e.g. the legacy kernel) in long mode and completely avoiding the massive amount of overhead involved with switching CPU modes is likely to be many orders of magnitude faster.
Absolutely, but by switching mode in the scheduler, the fast paths for 32-bit applications can be maintained and no stubs will be needed for syscalls.
Brendan wrote:There are 2 main reasons for applications to use 64-bit. The first reason is that the application needs (or perhaps only benefits from rather than needing) the extra virtual address space. RDOS's 32-bit kernel probably won't be able to handle "greater than 4 GiB" virtual address spaces so he'll probably completely destroy this advantage.
Buffers in syscalls will need to be memmapped into the 32-bit address space. Other than that, 64-bit applications are free to use the entire address space with no penalties. Compared to the cost of syscalls, remapping buffers is a minor overhead.
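
A small sketch of the bookkeeping this remapping implies, assuming 4 KiB pages and some window in the 32-bit kernel address space where the buffer's pages get aliased. `pages_spanned` and `alias_address` are made-up names; the actual page table updates are not shown.

```c
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/* Number of pages that must be aliased for a user buffer.
 * Returns 0 for an empty buffer or if addr + len would wrap around. */
static uint64_t pages_spanned(uint64_t addr, uint64_t len)
{
    if (len == 0 || len - 1 > UINT64_MAX - addr)
        return 0;

    uint64_t first = addr / PAGE_SIZE;
    uint64_t last  = (addr + len - 1) / PAGE_SIZE;
    return last - first + 1;
}

/* 32-bit address of the buffer once its first page is mapped at
 * window_base (which is assumed to be page aligned). */
static uint32_t alias_address(uint32_t window_base, uint64_t addr)
{
    return window_base + (uint32_t)(addr % PAGE_SIZE);
}
```

A 32-byte buffer that straddles a page boundary needs two pages mapped, which is why the page count is computed from the first and last byte rather than from the length alone.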
Brendan wrote:The other reason for applications to use 64-bit is that the extra registers and the extra width of registers makes code run faster. RDOS will probably also completely destroy the performance advantages too.
Why? The application is free to use as many of the 64-bit registers it wants. The scheduler will need to save/restore additional state for 64-bit threads, but that overhead is required in any design.
Brendan wrote:There's 3 problems - NMIs (which you're dodgy enough to ignore),
Yes, I ignore them. I even set up NMI as a crash handler. :mrgreen:
Brendan wrote:Machine check exceptions (which you're probably dodgy enough to have never supported anyway),
Exactly :mrgreen:
Brendan wrote:and IRQ latency (e.g. having IRQs disabled for *ages* while you switch CPU modes). I'm guessing you're dodgy enough to ignore the IRQ latency problem too.
A change from PAE to IA32e (or the reverse) involves no larger IRQ latency than flushing the TLB by reloading CR3. Both flush the TLB. It requires a few more instructions to change mode and reload CR0, CR3 and IDTR, but I suspect this time is minor compared to the effects of flushing the TLB.
Brendan wrote:CPU designers have a tendency to assume that only new code will use new CPU modes. I doubt AMD expected anyone to want to switch from long mode back to protected mode often. To be honest, I see it as a curiosity with dubious practical applications myself. They intended for new 64-bit kernels (that are capable of supporting old 32-bit and 16-bit applications), not the other way around.
Erm. If they had avoided breaking existing modes the above would be logical, but this is not the case. I see the 32-bit segmented mode as the "super mode" of the processor, and 64-bit, 16-bit and V86 as modes that are better run as sub-modes. Again, the penalty of translating segment:offset pointers from 16/32 bit protected mode (and ensuring no buffer overruns occur) is larger than the penalty of remapping 64-bit addresses into a 32-bit address space (and ensuring no buffer overruns occur).

Post by bluemoon »

rdos wrote:2. TR register is always reloaded with every thread-switch (per thread SS0 and IO-bitmaps)
No, you don't need to reload TR (i.e. the LTR instruction); you may just modify the contents of the TSS (by MOV, and remap a different page for the I/O map).
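
A minimal sketch of that approach for a 32-bit TSS, assuming one TSS per CPU that stays loaded in TR. The field layout is the standard 104-byte one; the function name `switch_kernel_stack` is hypothetical, and the I/O bitmap remapping is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit TSS as defined by the architecture (104 bytes). */
struct tss32 {
    uint32_t prev_task;
    uint32_t esp0, ss0;   /* stack used on entry to ring 0 */
    uint32_t esp1, ss1;
    uint32_t esp2, ss2;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt;
    uint16_t trap;
    uint16_t iomap_base;  /* offset of the I/O permission bitmap */
};

/* On a thread switch, point the ring 0 stack at the new thread's
 * kernel stack by plain stores. TR itself is left alone. */
static void switch_kernel_stack(struct tss32 *tss, uint32_t new_esp0,
                                uint16_t new_ss0)
{
    tss->esp0 = new_esp0;
    tss->ss0  = new_ss0;
}
```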