Re: Errors about long mode
Posted: Thu Jan 03, 2013 11:37 am
by Brendan
Hi,
rdos wrote: Does that mean that loading FS and GS in long mode directly updates the base? Does it mean that if wrmsr is used to load FS.base or GS.base, it becomes possible for compatibility mode to access memory above 4G through FS and GS? I think that might be worth testing.
I'd assume that (in compatibility mode) the CPU converts the virtual address into a linear address, then masks off (or ignores) the highest 32-bits to form a 32-bit linear address, then converts the resulting 32-bit linear address into a physical address.
Note that this doesn't apply to FS and GS alone. For example, you could set ES base to 0xFFFFFFF0 so that "ES:0x55555565" refers to the linear address 0x0000000155555555; and I'd assume the effective linear address that is actually used is 0x55555555.
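To make the arithmetic concrete, here is a minimal sketch in C of the assumed behaviour. The function names are made up for illustration, and the truncation in compatibility mode is the assumption being discussed, not something taken from a manual:

```c
#include <stdint.h>

/* 64-bit mode: the full 64-bit sum is used (no wrap at 4G).
   Only the FS and GS bases are honoured in this mode. */
uint64_t linear_long_mode(uint64_t base, uint64_t offset)
{
    return base + offset;
}

/* Compatibility mode (assumed): the sum is formed, then everything
   above bit 31 is discarded before paging sees the address. */
uint32_t linear_compat_mode(uint32_t base, uint32_t offset)
{
    uint64_t full = (uint64_t)base + offset;  /* may carry into bit 32 */
    return (uint32_t)full;                    /* keep the low 32 bits only */
}
```

With the ES example above, linear_compat_mode(0xFFFFFFF0, 0x55555565) gives 0x55555555, while the untruncated sum is 0x155555555.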
Cheers,
Brendan
Re: Errors about long mode
Posted: Thu Jan 03, 2013 2:11 pm
by rdos
Brendan wrote:
I'd assume that (in compatibility mode) the CPU converts the virtual address into a linear address, then masks off (or ignores) the highest 32-bits to form a 32-bit linear address, then converts the resulting 32-bit linear address into a physical address.
Note that this doesn't apply to FS and GS alone. For example, you could set ES base to 0xFFFFFFF0 so that "ES:0x55555565" refers to the linear address 0x0000000155555555; and I'd assume the effective linear address that is actually used is 0x55555555.
Confirmed to be true. I loaded the base with 0xFFFFF000 and the limit with 0xFFFFFFFFFF, and then accessed the offset (start of the long mode driver code selector + 1000h). This gives the same result as reading the first dword in the code selector.
Re: Errors about long mode
Posted: Thu Jan 03, 2013 2:17 pm
by Owen
Quite a lot of code requires accessing "negative" offsets from a selector (normally GS or FS; it's common, e.g., in Linux thread-local storage).
Re: Errors about long mode
Posted: Thu Jan 03, 2013 2:36 pm
by rdos
OK, here are the results (for AMD) in long mode:
1. Accessing the same location: This gives a page fault, indicating that the calculation is not truncated. (addresses directly above 4G are not valid in my setup)
2. Accessing offset 0: This also triggers a page fault, obviously at 0xFFFFF000
3. Load a flat selector to fs: This loads the linear address without translation (same result as in compatibility mode)
4. Load a null selector to fs: This doesn't protection-fault, and doesn't use the linear address directly either. I'm not sure where the data comes from, but most likely it is the data from the last valid load of FS (which would be in the scheduler)
I think this answers some of the questions:
1. The linear address loaded into FS by selector loads in long mode or compatibility mode is used when referencing through FS in long mode
2. There is no check for null-selector in long mode, but rather it seems the last loaded base is used
3. Calculations don't wrap around as they do in compatibility mode
While I haven't tested this, it seems likely that if the base is loaded through the MSR, the base would be used in compatibility mode, but the result would then be truncated to 32 bits.
This has some implications for FS and GS usage in a mixed-bitness model:
1. If FS is saved and restored (for instance in an ISR or exception handler), the base loaded with wrmsr is no longer used in long mode unless FS is 0.
2. If an ISR loads FS, and even if it also saves / restores FS, the base written with wrmsr is lost. Either the base is the base of FS as loaded in long mode (FS not 0), or it is the base loaded in the ISR.
For my setup this means that I won't be able to use FS as a TLS selector in long mode (because FS is used in IRQs), but I would be able to use GS provided I forbid IRQs from using GS (GS is not saved in the entry/exit prolog). It would also be best to load GS with the null selector, as this setup can handle save / restore of GS (but not loading non-null selectors). In addition, the scheduler must use the MSR to write the base when it switches to a long mode task, and the base must also be reloaded just before sysret. To be really paranoid, I might even check whether GS is loaded with 0 in the exit code for IRQs and, if so, write the MSR as well, since this won't hurt anything even if the IRQ returns to compatibility mode.
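The implications above can be illustrated with a toy model in C. This is only a sketch: struct seg_reg, the gdt_base table and all function names are hypothetical, and the null-selector rule (base left unchanged) encodes what was observed on this one AMD CPU, not documented behaviour.

```c
#include <stdint.h>

/* Toy model of FS or GS: visible selector plus hidden base. */
struct seg_reg {
    uint16_t selector;
    uint64_t base;      /* hidden base used for FS:/GS: accesses in long mode */
};

/* Hypothetical descriptor bases, indexed by selector >> 3 (entry 0 = null). */
#define GDT_ENTRIES 4
static const uint32_t gdt_base[GDT_ENTRIES] = {
    0, 0x00000000u, 0x00200000u, 0x00300000u
};

/* mov fs/gs, selector: a non-null selector overwrites the hidden base with
   the 32-bit descriptor base. A null selector leaves the base untouched
   (ASSUMPTION taken from the AMD test results in this thread). */
void load_selector(struct seg_reg *seg, uint16_t selector)
{
    seg->selector = selector;
    if ((selector & 0xFFFCu) == 0)
        return;
    seg->base = gdt_base[(selector >> 3) % GDT_ENTRIES];
}

/* wrmsr to FS.base / GS.base: overwrites the hidden base, selector unchanged. */
void write_base_msr(struct seg_reg *seg, uint64_t base)
{
    seg->base = base;
}

/* ISR that saves the register, loads its own selector, then restores it. */
void isr_save_load_restore(struct seg_reg *seg, uint16_t isr_selector)
{
    uint16_t saved = seg->selector;     /* push fs */
    load_selector(seg, isr_selector);   /* mov fs, isr_selector */
    load_selector(seg, saved);          /* pop fs */
}
```

In this model, a register holding the null selector with a base written by write_base_msr survives a plain save / restore (load_selector with 0 changes nothing), but after isr_save_load_restore the base is whatever the ISR's selector loaded. That is why the scheduler and the IRQ exit path would have to rewrite the MSR.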
Re: Errors about long mode
Posted: Thu Jan 03, 2013 6:23 pm
by Brendan
Hi,
rdos wrote: OK, here are the results (for AMD) in long mode:
Cool
The next step would be to determine which parts of the results are architectural (and therefore guaranteed to work the same in all CPUs made in the past and in the future by all manufacturers) and which results are merely accidental (and therefore guaranteed to comply with "Murphy's Law").
Based on pure "gut instinct", I'd assume that:
- it's safe to rely on linear addresses being truncated to 32-bit in compatibility mode
- it's safe to rely on DS, ES and SS loads causing the base address to be set (and ignored despite being set when in long mode)
- it's safe to rely on FS and GS loads causing the base address to be set; and that saving and loading FS or GS will cause any base address set via MSR or via SWAPGS to be overwritten
- the ability to load NULL into FS or GS is a bug that may not exist on Intel or Via CPUs and may be fixed in future AMD CPUs
Cheers,
Brendan
Re: Errors about long mode
Posted: Fri Jan 04, 2013 1:01 am
by rdos
Brendan wrote: the ability to load NULL into FS or GS is a bug that may not exist on Intel or Via CPUs and may be fixed in future AMD CPUs
The ability to load NULL into FS and GS is sure to exist on all CPUs. The bug would be the ability to use a selector even if it is not valid (NULL) in long mode. OTOH, it seems like exceptions would clear SS in long mode, which means that the CPU obviously can use SS in long mode even if it is not valid. Another uncertain aspect is which base is used in long mode when FS or GS is NULL, and no MSR load has been done after loading with NULL. I would play it safe and assume the worst, IOW that loading FS or GS with null might update the base, and possibly also load a flat selector into whatever selector I'd use for TLS in case referencing with NULL selector might not work on all CPUs.