Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Thu Aug 11, 2022 12:53 am
by rdos
xeyes wrote: Segmentation to protect the kernel against (other parts of) itself? As even user space 64bit code bypasses segmentation?
It's mainly used to protect drivers from each other. User space applications run in 32-bit flat mode. Although, I could theoretically create a new 32-bit executable format that can use segmentation. I once supported NE (16-bit Windows format), but I've not used it for a long time, so it is probably broken. However, this was a great format for discovering buffer errors in the application.
xeyes wrote:
rdos wrote: The other argument was that FS drivers would benefit from using long mode, but I decided to use non-mapped physical addresses in the disc API instead, which avoids the problem of large disc caches consuming a lot of kernel memory.
My FS stack uses 32b byte offset all the way down to disk level. So it is now limited not only to 4GB partitions, but also to partitions that are fully contained within the first 4GB of the disk. Enabling 'fast 64b mul/div' seems like a good way to support bigger disks.
My current (stable) FS does this too, but I've extended everything to 64-bits in the new FS that is based on server processes and that doesn't map disc buffers in linear kernel memory. Rather, the server processes will map metadata only to their local userspace memory area. I will also memory map file data in user space to improve file-IO speed. The current implementation sends a buffer through a syscall, which is very inefficient for small reads.

In essence, this concept shows that it is possible to support large files & FS caches in 32-bit mode by using better design methods.
xeyes wrote:
rdos wrote:So, I see absolutely no reason why I would want to move to long mode at the moment...
Agreed, strongly :lol:

btw. I've seen the rdos name many times in various configure.host files. Congratulations on getting into all of them! How did you make this happen? Does it have anything to do with the commercial background of the OS?
A bit of luck and a lot of determination. The bad thing is that I decided to abandon GCC and switched to OpenWatcom where I got commit access so I could fix the library in a much easier way than sending patches to the mailing lists.

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Thu Aug 11, 2022 1:45 am
by rdos
nullplan wrote:
xeyes wrote:Of course not switching is faster, but is there a way to enable IDT events to be dispatched to a compat mode CS?
There is not. Intel SDM, Volume 3A, page 6-17 ("interrupt handling in 64-bit mode"):
The target code segment referenced by the interrupt gate must be a 64-bit code segment (CS.L = 1, CS.D = 0). If the target is not a 64-bit code segment, a general-protection exception (#GP) is generated with the IDT vector number as the error code.
In long mode, all IDT events must be vectored into 64-bit mode. Now, you can transition to 32-bit mode very quickly, but the first instructions must be in 64-bit mode, and the interrupt frame will be a 64-bit one (so it will include SS and RSP even if no CPL change happened, and all fields will be eight bytes).
It's possible to do a far call to a 32-bit handler as the first instruction in the IRQ handler, and then end it with ret. I did this by creating long mode stubs that called 32-bit registered handlers. This way, the same code would handle IRQs regardless of whether the processor runs in long mode or protected mode.

However, this is for interrupts only. Exceptions in 64-bit mode should be native. They could use the same code to signal errors though.
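
As an illustration of the 64-bit interrupt frame described in the quote above (not code from anyone in this thread), the layout for a vector without an error code can be written as a C struct. The struct name is made up for this sketch. The CPU pushes SS, RSP, RFLAGS, CS and RIP in that order, so RIP ends up at the lowest address, and every slot is eight bytes even when no CPL change happened.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical name. Frame pushed by the CPU on an interrupt in long mode
 * for a vector without an error code, lowest address (top of stack) first. */
struct long_mode_irq_frame {
    uint64_t rip;     /* return address */
    uint64_t cs;      /* selector in the low 16 bits, rest is padding */
    uint64_t rflags;
    uint64_t rsp;     /* pushed even without a CPL change */
    uint64_t ss;      /* pushed even without a CPL change */
};

/* Five slots of eight bytes each. */
static_assert(sizeof(struct long_mode_irq_frame) == 40, "unexpected frame size");
static_assert(offsetof(struct long_mode_irq_frame, rsp) == 24, "rsp is the 4th slot");
```

A stub that hands control to a 32-bit handler has to account for these 40 bytes on the way back, since IRETQ expects exactly this layout.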

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Mon Aug 15, 2022 1:22 am
by xeyes
Octocontrabass wrote:
xeyes wrote:The kernel can, and indeed sometimes does, run on 32bit CPUs.
You can have your bootloader select the appropriate kernel binary according to the CPU capabilities instead of trying to stuff everything into a single binary.
:lol: Just need to find the time to work on a custom smart boot loader and another kernel together with everything else?

I do plan to split it if the kernel actually becomes "stuffed" one day, but this thin long mode shim is like 0.01% of the codebase and can't stand alone by itself anyways.
Octocontrabass wrote: Perhaps you should try the x32 ABI. You get all the 64-bit instructions you want while staying in the 32-bit address space you've been using.
This is quite interesting, maybe x32 is better than 64bit for user space apps as the kernel doesn't support address space beyond 4GB. I wonder does it suffer from the issue of "since nobody uses it, nobody maintains this version" though?
Octocontrabass wrote: Opcodes 0x9A and 0xEA - far CALL and far JMP with the destination encoded as an immediate value - don't have a MOD field, so that can't be it. AMD could have used REX.W to extend them in 64-bit mode, since they already use the 0x66 prefix to select between 16:16 and 16:32 in other modes. Instead, AMD decided those two opcodes would be invalid in 64-bit mode.
You are right, MOD has nothing to do with immediate value. I guess if this was implemented we'd also have things like 64b immediate value for ADD?
Octocontrabass wrote: But the Intel manual also doesn't define any opcodes that use those encodings. That's probably some kind of editing mistake.
Might point to something that they worked on/planned for and abandoned.

Similar to this story:
Before the Pentium went into production, PAE was removed. Even so, documented references to this feature still appear in the various Pentium manuals, and from other sources.
(from http://www.rcollins.org/ddj/Jul96/)
Octocontrabass wrote: Intel has to test every feature they add to the CPU. Why add a feature that will increase the amount of testing if hardly anyone is going to use it?
Also known as, "cut some corners in places where we can't easily see."

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Mon Aug 15, 2022 1:44 am
by xeyes
rdos wrote:
xeyes wrote: Segmentation to protect the kernel against (other parts of) itself? As even user space 64bit code bypasses segmentation?
It's mainly used to protect drivers from each other. User space applications run in 32-bit flat mode. Although, I could theoretically create a new 32-bit executable format that can use segmentation. I once supported NE (16-bit Windows format), but I've not used it for a long time, so it is probably broken. However, this was a great format for discovering buffer errors in the application.
I saw your other posts on this and yes, it makes sense to have some sort of protection among the different parts (esp. drivers) of the kernel for better stability.

rdos wrote: My current (stable) FS does this too, but I've extended everything to 64-bits in the new FS that is based on server processes... I will also memory map file data in user space to improve file-IO speed. The current implementation sends a buffer through a syscall, which is very inefficient for small reads.

In essence, this concept shows that it is possible to support large files & FS caches in 32-bit mode by using better design methods.
Yes, as long as address windowing is used, there's no need for a huge addr space at all in order to support bigger file systems. An extension to the concept of memory segments in a way :)

Maybe the '4GB disk limit' issue is more common than I thought among 32b kernels. I felt like it was a design mistake on my side though, as a disk layer that natively operates on 512B sectors could have supported 2TB instead.
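
To make the two limits concrete, here is a small sketch of the arithmetic. The function names and the fixed 512-byte sector size are assumptions for the example, not taken from either kernel.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9u                    /* assumed 512-byte sectors */
#define SECTOR_SIZE  (1u << SECTOR_SHIFT)

/* Bytes reachable with a 32-bit byte offset: 2^32 = 4 GiB. */
static uint64_t limit_byte_offset32(void)
{
    return (uint64_t)1 << 32;
}

/* Bytes reachable with a 32-bit sector number: 2^32 * 512 = 2 TiB. */
static uint64_t limit_sector_index32(void)
{
    return ((uint64_t)1 << 32) << SECTOR_SHIFT;
}

/* Split a byte offset into a 32-bit sector number plus the offset inside
 * that sector. Only shifts and masks are needed, no 64-bit division. */
static uint32_t byte_to_sector(uint64_t byte_off, uint32_t *within)
{
    assert(byte_off < limit_sector_index32());
    *within = (uint32_t)(byte_off & (SECTOR_SIZE - 1));
    return (uint32_t)(byte_off >> SECTOR_SHIFT);
}
```

With this split, an offset at the 5 GiB mark no longer overflows: it is simply sector 10485760.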

mmap support is something I need to work on for sure. I did a bit of GUI exploration recently and sending pixels through pipes is indeed very slow.
rdos wrote:
A bit of luck and a lot of determination. The bad thing is that I decided to abandon GCC and switched to OpenWatcom where I got commit access so I could fix the library in a much easier way than sending patches to the mailing lists.
Determination as in writing lots of patches or lots of emails to persuade various people?

GCC seems somewhat monolithic and might actually be quite hard to add much non-flat segment support. I'm sure you might have considered this already but LLVM might be more modular?

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Mon Aug 15, 2022 1:48 am
by xeyes
nullplan wrote:
xeyes wrote:Of course not switching is faster, but is there a way to enable IDT events to be dispatched to a compat mode CS?
There is not. Intel SDM, Volume 3A, page 6-17 ("interrupt handling in 64-bit mode"):
The target code segment referenced by the interrupt gate must be a 64-bit code segment (CS.L = 1, CS.D = 0). If the target is not a 64-bit code segment, a general-protection exception (#GP) is generated with the IDT vector number as the error code.
In long mode, all IDT events must be vectored into 64-bit mode. Now, you can transition to 32-bit mode very quickly, but the first instructions must be in 64-bit mode, and the interrupt frame will be a 64-bit one (so it will include SS and RSP even if no CPL change happened, and all fields will be eight bytes).
That question was somewhat rhetorical.

But yes, I do want the shim to "transition to 32-bit mode very quickly" and this thread is an attempt at finding out how to do so.

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Mon Aug 15, 2022 2:30 am
by rdos
xeyes wrote:
rdos wrote: My current (stable) FS does this too, but I've extended everything to 64-bits in the new FS that is based on server processes... I will also memory map file data in user space to improve file-IO speed. The current implementation sends a buffer through a syscall, which is very inefficient for small reads.

In essence, this concept shows that it is possible to support large files & FS caches in 32-bit mode by using better design methods.
Yes, as long as address windowing is used, there's no need for a huge addr space at all in order to support bigger file systems. An extension to the concept of memory segments in a way :)

Maybe the '4GB disk limit' issue is more common than I thought among 32b kernels. I felt like it was a design mistake on my side though, as a disk layer that natively operates on 512B sectors could have supported 2TB instead.
The disc limit is another issue. Initially, I tried to factor the number of sectors on a disc by using two numbers of similar magnitude. This reflected how the disc cache is organized with an upper level consisting of an array of pointers to the actual sector cache. In this configuration using 16-bits for both numbers, I could at most support 4G sectors (2 TB). However, I've changed this so the upper level is now 32-bits while the lower is still 16-bits, and so I can support 48-bit sector addresses. I'm not sure if partitions have a 4G sector limit too, or if I've fixed this. If not, it should be fixed.
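
A rough sketch of the 32 + 16 bit split described above. The struct and function names are invented for the example, and the real cache presumably uses the upper part to index its array of pointers rather than storing it in a struct like this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical key type: the upper part selects an entry in the top-level
 * pointer array, the lower part selects a sector within that entry. */
struct sector_key {
    uint32_t upper;   /* bits 47..16 of the sector address */
    uint16_t lower;   /* bits 15..0 of the sector address */
};

static struct sector_key split_sector(uint64_t lba)
{
    assert(lba < ((uint64_t)1 << 48));   /* 48-bit sector addresses only */
    struct sector_key k = { (uint32_t)(lba >> 16), (uint16_t)(lba & 0xFFFFu) };
    return k;
}

static uint64_t join_sector(struct sector_key k)
{
    return ((uint64_t)k.upper << 16) | k.lower;
}
```

The single shift in each direction is the only place a 64-bit value appears, so the lookup itself can stay in 32-bit and 16-bit arithmetic.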
xeyes wrote: Determination as in writing lots of patches or lots of emails to persuade various people?
Both, I think.
xeyes wrote: GCC seems somewhat monolithic and might actually be quite hard to add much non-flat segment support. I'm sure you might have considered this already but LLVM might be more modular?
I only tried to integrate flat mode applications into GCC. To add segmentation seems to be a more or less impossible task. That's also why I picked OpenWatcom instead. It could support both flat applications and segmented drivers. Another problem I had with GCC and their strange assembly syntax is that I cannot share syscall definitions between GCC and OW, which is quite problematic. I can't even share the syscall indexes.

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Mon Aug 15, 2022 11:24 am
by Octocontrabass
xeyes wrote: :lol: Just need to find the time to work on a custom smart boot loader and another kernel together with everything else?
Who said anything about another kernel? It would be the same kernel, just compiled to two separate binaries. And if you don't want a smart bootloader, use a smart installer instead, and install the correct kernel binary according to the CPU architecture.
xeyes wrote:This is quite interesting, maybe x32 is better than 64bit for user space apps as the kernel doesn't support address space beyond 4GB. I wonder does it suffer from the issue of "since nobody uses it, nobody maintains this version" though?
I can't speak for how well it's maintained, but someone must be using it; otherwise Linux would have dropped support for it by now.
xeyes wrote:You are right, MOD has nothing to do with immediate value. I guess if this was implemented we'd also have things like 64b immediate value for ADD?
Probably not. The existing design was chosen to fit the most common needs, and it turns out 64-bit immediate values aren't needed very often.
xeyes wrote:Might point to something that they worked on/planned for and abandoned.
Maybe! They would fit pretty nicely alongside some of the other extensions Intel has made to the x64 architecture.
xeyes wrote:Also known as, "cut some corners in places where we can't easily see."
No, cutting corners is how you get things like unreal mode. (Although according to rumors, unreal mode exists because Intel ran out of space in the i386 microcode to enforce proper real mode segment descriptors...)

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Mon Aug 15, 2022 1:13 pm
by rdos
xeyes wrote: Similar to this story:
Before the Pentium went into production, PAE was removed. Even so, documented references to this feature still appear in the various Pentium manuals, and from other sources.
(from http://www.rcollins.org/ddj/Jul96/)
Octocontrabass wrote: Intel has to test every feature they add to the CPU. Why add a feature that will increase the amount of testing if hardly anyone is going to use it?
Also known as, "cut some corners in places where we can't easily see."
PAE is a great feature, and helps in sharing page tables between protected mode & long mode. In fact, PAE can be thought of as part of the long mode address space (the first 4G), and it can be shared between long mode & compat mode drivers.

In the documentation, it is claimed the physical address is 36-bits, but I'm pretty sure this is not the case. At least not in threadripper. Instead, they are full 64-bit physical addresses and so you can address the same amount of physical memory with PAE as with long mode.

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Mon Aug 15, 2022 1:36 pm
by Octocontrabass
rdos wrote:In the documentation, it is claimed the physical address is 36-bits, but I'm pretty sure this is not the case.
Which documentation? Both the Intel and AMD manuals clearly state that PAE supports the full physical address space.
rdos wrote:Instead, they are full 64-bit physical addresses
Physical addresses in x64 can never be more than 52 bits, and current CPUs are limited to around 48. You can use CPUID to check the actual number of physical address bits supported by your CPU.
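
A minimal sketch of that CPUID check, assuming GCC or Clang (it relies on their `<cpuid.h>` helper `__get_cpuid`). The width is reported in bits 7:0 of EAX for leaf 0x80000008. The two function names are made up for the example, and on a non-x86 build the query simply returns 0.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>    /* GCC/Clang helper for the CPUID instruction */
#endif

/* CPUID leaf 0x80000008: EAX bits 7:0 hold the physical address width. */
static unsigned phys_bits_from_eax(uint32_t eax)
{
    return eax & 0xFFu;
}

/* Returns the number of physical address bits, or 0 if the leaf is not
 * available (or the code is not built for x86). */
static unsigned query_phys_addr_bits(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    /* __get_cpuid returns 0 when the requested leaf is unsupported. */
    if (__get_cpuid(0x80000008u, &eax, &ebx, &ecx, &edx))
        return phys_bits_from_eax(eax);
#endif
    return 0;
}
```

The same value applies to PAE paging and long mode paging alike, since it is a property of the CPU rather than of the paging mode.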

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Mon Aug 15, 2022 3:27 pm
by rdos
Octocontrabass wrote:
rdos wrote:In the documentation, it is claimed the physical address is 36-bits, but I'm pretty sure this is not the case.
Which documentation? Both the Intel and AMD manuals clearly state that PAE supports the full physical address space.
The link above. However, at that time 36-bits probably was enough as the Pentium didn't support more physical memory.
Octocontrabass wrote:
rdos wrote:Instead, they are full 64-bit physical addresses
Physical addresses in x64 can never be more than 52 bits, and current CPUs are limited to around 48. You can use CPUID to check the actual number of physical address bits supported by your CPU.
That's not necessary. If the motherboard supports the installed memory and reports it through BIOS / EFI, then the CPU must support it too.

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Mon Aug 15, 2022 3:42 pm
by Octocontrabass
rdos wrote:The link above. However, at that time 36-bits probably was enough as the Pentium didn't support more physical memory.
That documentation is at least 25 years old. You shouldn't be surprised that things have changed since then!

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Sat Aug 20, 2022 8:29 pm
by xeyes
rdos wrote:
The disc limit is another issue. Initially, I tried to factor the number of sectors on a disc by using two numbers of similar magnitude. This reflected how the disc cache is organized with an upper level consisting of an array of pointers to the actual sector cache. In this configuration using 16-bits for both numbers, I could at most support 4G sectors (2 TB). However, I've changed this so the upper level is now 32-bits while the lower is still 16-bits, and so I can support 48-bit sector addresses. I'm not sure if partitions have a 4G sector limit too, or if I've fixed this. If not, it should be fixed.
That's a good design, 48b covers the whole LBA48 range and no 64b math necessary.

When I said '32b offset all the way' I truly meant it: my block cache also uses byte offsets despite operating on aligned blocks, so a bunch of LSBs are always 0 and wasted.
rdos wrote:
xeyes wrote: Determination as in writing lots of patches or lots of emails to persuade various people?
Both, I think.
xeyes wrote: GCC seems somewhat monolithic and might actually be quite hard to add much non-flat segment support. I'm sure you might have considered this already but LLVM might be more modular?
I only tried to integrate flat mode applications into GCC. To add segmentation seems to be a more or less impossible task. That's also why I picked OpenWatcom instead. It could support both flat applications and segmented drivers. Another problem I had with GCC and their strange assembly syntax is that I cannot share syscall definitions between GCC and OW, which is quite problematic. I can't even share the syscall indexes.
That seems strange, I was under the impression that binutils and GCC are so closely coupled that they'd have the same policies unless you try to make huge changes like segment awareness.

But having native support inside OW that works for your special protections in the kernel is nice, more so if you like their syntax.

Aren't syscall indexes just small integers that would work with any toolchain? Or do you also have some special designs there?

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Sat Aug 20, 2022 8:38 pm
by xeyes
Octocontrabass wrote:
xeyes wrote: :lol: Just need to find the time to work on a custom smart boot loader and another kernel together with everything else?
Who said anything about another kernel? It would be the same kernel, just compiled to two separate binaries. And if you don't want a smart bootloader, use a smart installer instead, and install the correct kernel binary according to the CPU architecture.
That is what I technically have right now :lol: The shim is a compile time option not switched on/off at run time.

But I know that there is quite a bit of stuff that needs to change to become a "true 64b" kernel #-o
Octocontrabass wrote:
xeyes wrote:This is quite interesting, maybe x32 is better than 64bit for user space apps as the kernel doesn't support address space beyond 4GB. I wonder does it suffer from the issue of "since nobody uses it, nobody maintains this version" though?
I can't speak for how well it's maintained, but someone must be using it; otherwise Linux would have dropped support for it by now.
Really? I think some distro people are trying to move away from even 'multilib' now, x32 support might be similar to "We support sparc and itanium".
Octocontrabass wrote:
xeyes wrote:You are right, MOD has nothing to do with immediate value. I guess if this was implemented we'd also have things like 64b immediate value for ADD?
Probably not. The existing design was chosen to fit the most common needs, and it turns out 64-bit immediate values aren't needed very often.
My guess is that if they were to add it, the change would be an addition that enables 64b imm everywhere? Doesn't seem to make sense to make just ljmp special?
Octocontrabass wrote: according to rumors, unreal mode exists because Intel ran out of space in the i386 microcode to enforce proper real mode segment descriptors...
How true is this rumor?

The version I heard: unreal mode is needed for SMM.

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Sat Aug 20, 2022 11:13 pm
by Octocontrabass
xeyes wrote:Really? I think some distro people are trying to move away from even 'multilib' now, x32 support might be similar to "We support sparc and itanium".
There was a whole discussion about it a few years ago.
xeyes wrote:My guess is that if they were to add it, the change would be an addition that enables 64b imm everywhere? Doesn't seem to make sense to make just ljmp special?
Having 64-bit immediates everywhere would need to be an actual addition to the instruction set, because the sign-extended 32-bit immediates already use the 64-bit operand size.

The reason MOV with a 64-bit operand size allows both a sign-extended 32-bit immediate and a 64-bit immediate is because there were two ways to encode identical MOV instructions with 16-bit and 32-bit operand sizes. AMD split the two redundant MOV instructions into two separate MOV instructions. Most other instructions do not have convenient redundant encodings the way MOV did. (And MOV only had redundant encodings when the destination was a register!)
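
The two MOV forms can be seen side by side in a small encoder sketch. The function name is invented for this example and it only handles RAX as the destination. The byte values are the documented encodings: REX.W + C7 /0 takes a sign-extended 32-bit immediate, and REX.W + B8+r takes a full 64-bit immediate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode "mov rax, imm" into out (at least 10 bytes) and return the length.
 * Picks the short form when the value survives sign extension from 32 bits. */
static size_t encode_mov_rax(uint8_t *out, uint64_t imm)
{
    int64_t s = (int64_t)imm;
    size_t n = 0, imm_len;

    out[n++] = 0x48;                      /* REX.W: 64-bit operand size */
    if (s >= INT32_MIN && s <= INT32_MAX) {
        out[n++] = 0xC7;                  /* MOV r/m64, imm32 (sign-extended) */
        out[n++] = 0xC0;                  /* ModRM: mod=11, /0, rm=RAX */
        imm_len = 4;
    } else {
        out[n++] = 0xB8;                  /* MOV RAX, imm64 (B8 + register 0) */
        imm_len = 8;
    }
    for (size_t i = 0; i < imm_len; i++)  /* immediate is little-endian */
        out[n++] = (uint8_t)(imm >> (8 * i));
    return n;
}
```

A value like 0x80000000 needs the long form because sign-extending it from 32 bits would produce 0xFFFFFFFF80000000 instead.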
xeyes wrote:How true is this rumor?

The version I heard: unreal mode is needed for SMM.
If it were intended for SMM, Intel would have made it only work in SMM the way they did with the RSM instruction. Plus, the timeline doesn't add up: unreal mode was possible on the earliest commercially-available 386, but CPUs with SMM support weren't released until about 5 years later.

Michal Necasek has collected a lot of good information about unreal mode.

Re: fast (short?) switch between compat mode and 64 bit mode

Posted: Mon Aug 22, 2022 1:23 am
by rdos
xeyes wrote: Aren't syscall indexes just small integers that would work with any toolchain? Or do you also have some special designs there?
I code my syscalls as invalid instructions that have a 32-bit integer to indicate the syscall number. The instruction is then typically patched by the executable loader to a call gate.

When the processor runs in long mode, the loader will instead patch in a syscall instruction with the number placed on the user mode stack, since long mode doesn't support call gates. In the kernel, the syscall entry point needs to find the number and dispatch the correct server procedure with a far call.

When syscalls are performed in kernel or a driver, the syscall (or driver call, which is similar) will instead be patched to a direct far call.

Because GDT selectors are a scarce resource, call gates are allocated & setup on reference from user mode only.
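
To illustrate the patching idea for the protected mode case only, here is a rough sketch. Everything about the placeholder is an assumption made for the example: the 7-byte site, the use of UD2 as the invalid instruction, the callback type and the demo allocator are all hypothetical and not the actual RDOS encoding. The only fixed facts used are that a 32-bit far CALL with an immediate pointer is opcode 0x9A followed by a 4-byte offset and a 2-byte selector, and that the offset is ignored when the selector refers to a call gate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placeholder layout (7 bytes, same size as CALL ptr16:32):
 *   0F 0B        UD2, an invalid-opcode instruction
 *   nn nn nn nn  32-bit syscall number, little-endian
 *   90           padding */
static int is_placeholder(const uint8_t *site)
{
    return site[0] == 0x0F && site[1] == 0x0B;
}

static uint32_t placeholder_number(const uint8_t *site)
{
    return (uint32_t)site[2] | ((uint32_t)site[3] << 8) |
           ((uint32_t)site[4] << 16) | ((uint32_t)site[5] << 24);
}

/* Hypothetical callback: returns the selector of the call gate for a
 * syscall number, allocating and setting it up on first reference. */
typedef uint16_t (*gate_alloc_fn)(uint32_t syscall_number);

/* Stand-in allocator for the example: one 8-byte descriptor per number. */
static uint16_t demo_alloc_gate(uint32_t syscall_number)
{
    return (uint16_t)(0x100u + syscall_number * 8u);
}

/* Rewrite one call site in place. Returns 1 if patched, 0 if the bytes
 * are not a placeholder (for example because they were already patched). */
static int patch_call_site(uint8_t *site, gate_alloc_fn alloc_gate)
{
    if (!is_placeholder(site))
        return 0;
    uint16_t sel = alloc_gate(placeholder_number(site));
    site[0] = 0x9A;                       /* CALL ptr16:32 */
    site[1] = site[2] = site[3] = site[4] = 0;  /* offset, ignored for a gate */
    site[5] = (uint8_t)(sel & 0xFF);      /* selector, little-endian */
    site[6] = (uint8_t)(sel >> 8);
    return 1;
}
```

Because the gate is only requested when a site is actually patched, selectors are spent only on syscalls that some program references.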

The problem in relation to GCC is that parameters must be passed in registers, and so I need to define function prototypes that load the correct registers and then do the syscall. These are quite different between GCC and OW.

This design is why I also can support 16-bit applications. 16-bit applications will pass 16-bit registers, and these are extended to 32-bit by the server. This is done by the server registering both a 16-bit and a 32-bit entry point (or a bimodal, in case this extension is not needed). DOS applications are supported by aliasing a selector, and long mode by the use of paging. Thus, I don't need any additional translation layers and I will not be tied-up to a specific stack content.