Adding 64-bit support to RDOS

rdos
Member
Posts: 3306
Joined: Wed Oct 01, 2008 1:55 pm

Adding 64-bit support to RDOS

Post by rdos »

I think I'll change to a different thread now that PAE paging is done.

So, the first few steps have already been taken:

1. Physical memory allocation uses 64 bits
2. A new bitmap-based physical memory interface is implemented that can handle large amounts of physical memory while staying within the 4G linear address space
3. PAE paging can be used instead of 32-bit paging.
4. A method to make syscalls from applications based on the SYSENTER interface rather than call-gates is implemented

The next step has to do with the SYSENTER interface. Even if support for this interface is added, it also requires device-driver support, and currently only the APIC module supports SYSENTER. In order for device-drivers to work with the SYSENTER interface, they are not allowed to reference the stack as a 16-bit stack, and some device-drivers currently do that to save things on the stack. The SYSENTER interface uses a flat kernel stack in order to not have to reload SS.

After that is done, the fun begins. I need to choose a proper assembler that supports 64-bit and can handle my include-files without trouble (I suspect NASM fails here). Then I need to write default exception handlers that dump register contents to the screen and halt the system. The crash debugger needs to become 64-bit aware so it can be invoked from long mode and display the register contents in the proper formats. It must be able to handle some cores being in long mode and some in protected mode.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Adding 64-bit support to RDOS

Post by Brendan »

Hi,
rdos wrote:4. A method to make syscalls from applications based on the SYSENTER interface rather than call-gates is implemented
For 64-bit code, SYSENTER won't work on AMD CPUs and you have to use SYSCALL. Also note that if a 64-bit process uses SYSCALL (or SYSENTER on an Intel CPU) the application's stack is likely to be above 4 GiB.
rdos wrote:I need to choose a proper assembler that supports 64-bit, and that can handle my include-files without trouble (I suspect NASM fails here).
NASM has supported 64-bit code for a long time now (since version 2.0 was released in 2007 I think). YASM is compatible with NASM (same syntax, preprocessor, etc) and has supported 64-bit for a little longer than NASM.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Re: Adding 64-bit support to RDOS

Post by rdos »

Brendan wrote:Hi,
rdos wrote:4. A method to make syscalls from applications based on the SYSENTER interface rather than call-gates is implemented
For 64-bit code, SYSENTER won't work on AMD CPUs and you have to use SYSCALL. Also note that if a 64-bit process uses SYSCALL (or SYSENTER on an Intel CPU) the application's stack is likely to be above 4 GiB.
Yes, I know, but that should not be a problem. 64-bit code would use SYSCALL, and then end up in a long mode handler. The long mode handler will call the same protected mode procedure as the SYSENTER handler (a retf out of long mode), which is why I need to port all APIs to SYSENTER. The position of the call stack above 4G makes no difference, as syscalls don't use call frames but are register-based. Pointers, however, must be converted so they reside below 4G, which would be done with paging. I need to add pointer translations to the syscall definitions as well, and might just as well do this at the same time as I add support for SYSENTER.

Maybe it would be better to do these modifications later, and first see what the SYSCALL procedure would look like, and make sure it works properly?
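
For reference, a minimal sketch of what enabling SYSCALL for a long mode handler might look like. The MSR numbers are the architectural ones; the selector values and the `syscall_entry` label are assumptions made up for illustration, not RDOS names, and the handler is only a "do nothing" placeholder that does not switch stacks.

```nasm
    bits 64

IA32_EFER   equ 0C0000080h          ; bit 0 = SCE (SYSCALL enable)
IA32_STAR   equ 0C0000081h          ; selector bases for SYSCALL/SYSRET
IA32_LSTAR  equ 0C0000082h          ; 64-bit SYSCALL entry point
IA32_FMASK  equ 0C0000084h          ; RFLAGS bits cleared on SYSCALL

; Hypothetical selectors (assumptions, not from RDOS):
kernel_code64_sel equ 0x08          ; SYSCALL loads CS from this, SS from this+8
user_base_sel     equ 0x18          ; SYSRET to 64-bit code uses CS = this+16, SS = this+8

enable_syscall:
    mov ecx,IA32_EFER
    rdmsr
    or eax,1                        ; set EFER.SCE
    wrmsr

    mov ecx,IA32_STAR
    xor eax,eax                     ; low dword: legacy 32-bit entry point, unused here
    mov edx,(user_base_sel << 16) | kernel_code64_sel
    wrmsr

    mov ecx,IA32_LSTAR
    lea rax,[rel syscall_entry]
    mov rdx,rax
    shr rdx,32                      ; wrmsr takes the value in edx:eax
    wrmsr

    mov ecx,IA32_FMASK
    mov eax,200h                    ; clear IF on entry
    xor edx,edx
    wrmsr
    ret

syscall_entry:                      ; rcx = user RIP, r11 = user RFLAGS, rsp still the user stack
    o64 sysret                      ; placeholder: return straight to 64-bit user code
```

A real handler would have to switch to a kernel stack before doing anything else, since SYSCALL leaves RSP untouched.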
bluemoon
Member
Posts: 1761
Joined: Wed Dec 01, 2010 3:41 am
Location: Hong Kong

Re: Adding 64-bit support to RDOS

Post by bluemoon »

SYSCALL works very similarly to SYSENTER, with the same hidden trouble of kernel stack handling for nested interrupts, etc., that we talked about a few months ago.

Re: Adding 64-bit support to RDOS

Post by Brendan »

Hi,
rdos wrote:Maybe it would be better to do these modifications later, and first see what the SYSCALL procedure would look like, and make sure it works properly?
I'd start by writing a dummy 64-bit executable (e.g. an almost empty piece of 64-bit code that does nothing more than "jmp $"). You've probably already got some sort of kernel code to start an executable. I'd extend that "start an executable" code so that it detects if the executable is 64-bit or not; and if the executable is 64-bit it would construct a 64-bit virtual address space for it.

Next I'd implement an (almost empty) "64-bit kernel stub" and map that into all 64-bit virtual address spaces when they're created.

After that I'd provide some way for the 32-bit kernel to switch to long mode and pass control to the "64-bit kernel stub". The "64-bit kernel stub" (running at CPL=0) would pass control to the 64-bit executable (running at CPL=3).

The next piece would be adding a 64-bit IDT and "dummy" interrupt handlers; so that the 64-bit IRQ handlers switch back to protected mode and cause the normal IRQ handlers to be executed and then switch back to long mode. For some of these I'd be tempted to do native 64-bit interrupt handlers (e.g. the page fault handler because 32-bit code isn't going to handle a 64-bit CR2 or long mode paging tables; whatever interrupt you use for the "multi-CPU TLB shootdown" IPI; etc).

I wouldn't worry about supporting SYSCALL (for 64-bit applications) until after all of the above is done. Once SYSCALL is working with a "do nothing" pretend kernel function; I'd start adding support for "switch to protected mode and call the legacy/32-bit kernel API function and then switch back to long mode" (which would be sort of similar to the interrupt handling). Of course there will be kernel API functions that don't make any sense as 32-bit code. For example, half the virtual memory management. For all of these I'd handle the kernel API function with native 64-bit code in the "64-bit kernel stub".

Once all of that is working; I'd start optimising it by shifting more code from the legacy/32-bit kernel into native 64-bit code in the "64-bit kernel stub"; to avoid the overhead of constantly switching between long mode and protected mode (and completely screwing up all TLB entries in both directions). This would eventually include all interrupt handlers, all kernel API functions, etc.

Once enough has been shifted to 64-bit to allow 64-bit applications to run without any silly switching between long mode and protected mode, I'd port drivers to 64-bit. Then I'd create a completely separate/different "stripped down" version of the OS that only supports 64-bit applications and 64-bit drivers (e.g. the "64-bit stub" with no legacy/protected mode kernel at all).

Finally, I'd start adding support for 32-bit "flat" applications to the "64-bit stripped down version of the OS"; and remove the (now completely unnecessary) support for 64-bit applications from the legacy/protected mode kernel.

The end result would be a completely rewritten OS that is "good" (that supports 64-bit applications and drivers, and 32-bit "flat" applications and 32-bit "flat" drivers); and a completely separate OS that is the same as what you have now.

However; someone once said that the shortest path between 2 points is a straight line; and I have a feeling that there might be a much faster/easier way to get to the same "2 completely different versions of the OS" end result. 8)


Cheers,

Brendan

Re: Adding 64-bit support to RDOS

Post by rdos »

Brendan wrote:Hi,
rdos wrote:Maybe it would be better to do these modifications later, and first see what the SYSCALL procedure would look like, and make sure it works properly?
I'd start by writing a dummy 64-bit executable (e.g. an almost empty piece of 64-bit code that does nothing more than "jmp $"). You've probably already got some sort of kernel code to start an executable. I'd extend that "start an executable" code so that it detects if the executable is 64-bit or not; and if the executable is 64-bit it would construct a 64-bit virtual address space for it.

Next I'd implement an (almost empty) "64-bit kernel stub" and map that into all 64-bit virtual address spaces when they're created.

After that I'd provide some way for the 32-bit kernel to switch to long mode and pass control to the "64-bit kernel stub". The "64-bit kernel stub" (running at CPL=0) would pass control to the 64-bit executable (running at CPL=3).

The next piece would be adding a 64-bit IDT and "dummy" interrupt handlers; so that the 64-bit IRQ handlers switch back to protected mode and cause the normal IRQ handlers to be executed and then switch back to long mode. For some of these I'd be tempted to do native 64-bit interrupt handlers (e.g. the page fault handler because 32-bit code isn't going to handle a 64-bit CR2 or long mode paging tables; whatever interrupt you use for the "multi-CPU TLB shootdown" IPI; etc).
That seems pretty backwards from my point of view. You cannot do anything with your empty 64-bit executable without support for 64-bit, since we start with a 32-bit OS. The 32-bit OS cannot create the 64-bit environment (in fact the 64-bit loader must reside in a long mode driver).

Thus, the first step must be to be able to run ordinary 32-bit applications in long mode using IA32e. This in turn requires functional 64-bit exception handlers and IRQ handlers.

I think I'll start by testing if NASM can generate an ordinary 32-bit device-driver. This device-driver can then start a kernel thread, which could make the switch to long mode, set up the default exception handlers and stuff, tear down the environment and return.

Re: Adding 64-bit support to RDOS

Post by rdos »

Some troubles with NASM, but I finally got my 32-bit driver using NASM binary output to work. First, it was tricky to get NASM to start with offset 0 after the header (this could be achieved with org 0xFFFFFFEE). Second, it appears that NASM cannot handle constants defined with the assignment (=) operator, so I had to generate new header-files with gate-numbers for NASM. However, after that, it works pretty well.
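
For reference, a minimal sketch of the NASM forms that can stand in for `=`-style assignments. The gate names and numbers are invented for illustration; they are not the real RDOS gate numbers.

```nasm
; NASM does not accept "name = value"; equ and %assign cover the two uses.
; All names and numbers below are hypothetical.
fixed_gate_nr     equ 10            ; fixed constant, cannot be redefined later

%assign gate_nr   20                ; preprocessor variable, may be reassigned
example_gate_a_nr equ gate_nr       ; expands to "equ 20"
%assign gate_nr   gate_nr+1
example_gate_b_nr equ gate_nr       ; expands to "equ 21"
```

`%assign` is evaluated by the preprocessor, so it can only use values the preprocessor already knows; a symbol defined with `equ` cannot be used on its right-hand side.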

Next, I'll try to switch to long mode and back to protected mode without generating triple faults.

There is an additional complication involved. In order to switch between protected mode and long mode (and the reverse), I need to go through a stage with paging disabled. That means the switch must be made in a unity-mapped (identity-mapped) section of code, preferably at the bottom of physical memory. I think reserving 16 pages at the lower end of physical memory during boot-up, and then copying the NASM-based device-driver there unity-mapped, could achieve the goal.

I'll also need to create a new kernel process, not a kernel thread, so I can manipulate the paging environment without affecting other kernel-mode threads.

Edit: I can now turn off paging and enable long mode, but the processor will triple fault when paging is turned on for long mode. I changed CR3 by adding another level to the page translation scheme from PAE-paging, but this doesn't seem to work.

Test code: (when the inner enable / disable paging is removed, the code works)

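    mov ax,flat_sel
    mov es,ax
    mov edi,12000h       ; new top-level table for IA32e at 12000h
    mov eax,cr3          ; entry 0 points at the current PAE page table ptr
    or ax,3              ; present + read/write
    stosd
    xor eax,eax
    mov ecx,1023         ; zero the rest of the 4 KiB page
    rep stosd
;
    mov ebp,cr3          ; save the PAE CR3
    cli
    mov eax,cr0
    and eax,7FFFFFFFh    ; clear CR0.PG (paging off)
    mov cr0,eax
;
    mov ecx,IA32_EFER
    rdmsr
    or eax,0x100         ; set EFER.LME
    wrmsr
;
    mov eax,12000h
    mov cr3,eax          ; load the IA32e CR3
;
    mov eax,cr0
    or eax,80000000h     ; set CR0.PG (paging on, IA32e mode active)
    mov cr0,eax
;
    mov edx,12345h       ; recognizable register values (presumably markers)
    mov ecx,98765h
;
    mov eax,cr0
    and eax,7FFFFFFFh    ; paging off again
    mov cr0,eax
;
    mov ecx,IA32_EFER
    rdmsr
    and eax,0xFFFFFEFF   ; clear EFER.LME
    wrmsr
;
    mov cr3,ebp          ; restore the PAE CR3
;
    mov eax,cr0
    or eax,80000000h     ; paging on, back to PAE paging
    mov cr0,eax
    sti
    int 3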

Re: Adding 64-bit support to RDOS

Post by rdos »

Now I know why the code above fails. It is not the switch to long mode that fails (it works perfectly well); instead, the problem is reentering protected mode with PAE paging. The definitions of the page table ptr entries differ somewhat between IA32e and PAE. In particular, bit 5 is the accessed bit in IA32e but is reserved and must be 0 in PAE, which means that loading CR3 will trigger a protection fault, as the lowest page table ptr is always accessed. This is not so interesting for practical purposes, as there will be no switches to / from PAE using the same CR3, but in the test there is a need to clear the accessed bits in the page table ptr entries before reenabling PAE paging.
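
A minimal sketch of that clean-up step, assuming ds is a flat selector and ebx holds an address where the four PAE page table ptr entries can be reached at that point. The routine name and the register convention are assumptions for illustration, not part of the test code above.

```nasm
    bits 32

; Clear the accessed bit (bit 5) in the four PAE page table ptr entries
; so that they are valid again when PAE paging is re-enabled.
; in: ds = flat selector, ebx = address of the 32-byte table (assumed)
clear_accessed_bits:
    mov ecx,4                       ; PAE has four 8-byte entries
clear_next:
    and dword [ebx],0FFFFFFDFh      ; bit 5 lives in the low dword of each entry
    add ebx,8
    loop clear_next
    ret
```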
Griwes
Member
Posts: 374
Joined: Sat Jul 30, 2011 10:07 am
Libera.chat IRC: Griwes
Location: Wrocław/Racibórz, Poland

Re: Adding 64-bit support to RDOS

Post by Griwes »

That's why you generally determine paging mode at boot and stick to it later... oh, I forgot RDOS is not a "general" OS, sorry.
<klange> This is a horror story about what happens when you need a hammer and all you have is the skulls of the damned.
<drake1> as long as the lock is read and modified by atomic operations

Re: Adding 64-bit support to RDOS

Post by rdos »

There are some oddities in the non-paged mode that I cannot understand (at least on my dual core AMD). It returns strange things when reading unity-mapped memory. Because of this I changed the code so it uses two page table ptrs (the IA32e one is created by copying the PAE version and setting the lowest 3 bits). After that change, everything works very well, and I've now entered and left IA32e mode without any faults.

New code:

;
    mov edx,2000h
    xor ebx,ebx
    mov eax,cr3
    or al,67h
    OsGate set_page_entry      ; map PAE CR3 to linear address 2000h
;    
    mov ax,flat_sel
    mov ds,ax
    mov es,ax
;
    mov esi,2000h
    mov edi,11000h
    mov ecx,400h
    rep movsd                         ; copy page table ptr to the IA32e version at linear and physical address 11000h
;
    mov edi,11000h
    mov al,7
    stosb                               ; patch to rd/wr and user mode
;    
    add edi,7    
    stosb
;    
    add edi,7    
    stosb
;    
    add edi,7    
    stosb
;
    mov edi,12000h                 ; create IA32e CR3 block
    mov eax,11007h
    stosd
    xor eax,eax
    mov ecx,1023
    rep stosd    
;    
    mov edi,cr3
;
    cli
    mov eax,cr0
    and eax,7FFFFFFFh
    mov cr0,eax
;
    mov ecx,IA32_EFER
    rdmsr
    or eax,0x100
    wrmsr
;
    mov eax,12000h
    mov cr3,eax  
;
    mov eax,cr0
    or eax,80000000h
    mov cr0,eax

    jmp longmode_code_sel:flush1
flush1:        
    mov eax,2000h
    mov ebx,[eax]
    mov ebp,[eax+8]
;    
    mov eax,cr0
    and eax,7FFFFFFFh
    mov cr0,eax
;
    mov ecx,IA32_EFER
    rdmsr
    and eax,0xFFFFFEFF   
    wrmsr
;
    mov cr3,edi
;
    mov eax,cr0
    or eax,80000000h
    mov cr0,eax
    sti
    int 3

Re: Adding 64-bit support to RDOS

Post by rdos »

Some further thoughts.

I think I'll add a new mode to CreateProcess, which currently supports protected mode or V86 mode. The new mode would be long mode, and it will create a new CR3 for long mode, and set some flag (long mode flag) in the thread-control block that indicates the process should run in IA32e mode rather than in protected mode. This flag would be inherited by threads created in the process.

The scheduler will then need to be updated so it can switch mode when it reloads CR3 while switching between processes. It will XOR the long mode flags of the current thread and the new thread, and if the result is 0 it can just reload CR3. Otherwise, it will call the long-mode driver and make it switch either from long mode to protected mode (reloading CR3 at the same time) or the reverse. After that is done, it can just do the ordinary thing.
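
Roughly, the scheduler check could be sketched like this. The thread-control block layout, the flag bit and the two driver entry points are hypothetical names chosen for illustration; nothing here is taken from the actual RDOS scheduler.

```nasm
    bits 32

; Hypothetical layout and helpers (assumptions, not RDOS definitions):
TCB_FLAGS       equ 0               ; offset of a flags byte in the thread-control block
TCB_LONG_MODE   equ 1               ; bit 0 set = thread runs in IA32e mode

extern switch_to_long_mode          ; long-mode driver: enter IA32e, load CR3 from eax
extern switch_to_protected_mode     ; long-mode driver: leave IA32e, load CR3 from eax

; in: esi = current thread-control block, edi = new thread-control block,
;     eax = CR3 value of the new process, ds = flat selector
reload_address_space:
    mov dl,[esi+TCB_FLAGS]
    xor dl,[edi+TCB_FLAGS]          ; bits that differ between old and new thread
    test dl,TCB_LONG_MODE
    jnz mode_differs
    mov cr3,eax                     ; same mode: a plain CR3 reload is enough
    ret
mode_differs:
    test byte [edi+TCB_FLAGS],TCB_LONG_MODE
    jnz enter_long
    jmp switch_to_protected_mode    ; new thread wants protected mode (tail call)
enter_long:
    jmp switch_to_long_mode         ; new thread wants long mode (tail call)
```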

Re: Adding 64-bit support to RDOS

Post by rdos »

Some more progress. Now I can write "11" at the top of the screen from 64-bit mode.


    bits 32

    jmp long_kernel_code_sel:test64

    bits 64

test64:
    mov rbx,0xB8000
    mov eax,0x7310731
    mov [rbx],eax

stopl:
    jmp stopl        

Unfortunately, I'm no longer able to return to compatibility mode, as I cannot find a way to do that. However, I think it is better to implement the exception handlers first and let them write the register contents; then I'll know why the code fails to return to compatibility mode.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Adding 64-bit support to RDOS

Post by Owen »

rdos wrote:Unfortunately, I'm no longer able to return to compatibility mode, as I cannot find a way to do that.
Long mode's compatibility submode is entered by loading a 32-bit or 16-bit code segment (CS.L=0). When this is done the basic segmentation behavior (i.e. excluding system descriptors) behaves per protected mode.

Re: Adding 64-bit support to RDOS

Post by rdos »

Owen wrote:
rdos wrote:Unfortunately, I'm no longer able to return to compatibility mode, as I cannot find a way to do that.
Long mode's compatibility submode is entered by loading a 32-bit or 16-bit code segment (CS.L=0). When this is done the basic segmentation behavior (i.e. excluding system descriptors) behaves per protected mode.
After some thought on this, I think part of the problem was that the stack is invalid. The thread had a segmented SS:ESP, but long mode skips the SS part and uses only the offset, which probably caused page faults when I tried retf. I also tried jmp far [rbx], but none of the memory layouts I used gave anything other than triple faults. It's bad design that jmp seg:offset is not supported.

Re: Adding 64-bit support to RDOS

Post by Owen »

rdos wrote:It's bad design that jmp seg:offset is not supported.
JMP FAR indirect is supported, but note that it behaves differently between AMD and Intel:
  • On AMD, a REX.W prefix is ignored; there is no jmp far seg16:off64 (it is interpreted as seg16:off32)
  • On Intel, a REX.W prefix is honored; there is a jmp far seg16:off64
(I experimentally validated the differences between AMD and Intel's documentation 2 years ago)

If memory serves, JMP FAR direct is unsupported.

The solution, of course, is to make sure you switch to a flat SS before entering long mode. If your compatibility/legacy mode stack is non-flat, then it will of course be necessary to adjust it when entering (and preferably reload SS with the null selector, as this makes various bookkeeping tasks simpler).
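
With a valid flat stack in place, a far return is one way to get from 64-bit mode back to compatibility mode without relying on JMP FAR. A minimal sketch in NASM syntax; the selector value and the labels are assumptions for illustration only.

```nasm
    bits 64

compat_code_sel equ 0x08            ; hypothetical selector of a 32-bit code segment (CS.L=0)

leave_64bit:
    ; rsp must already point into a valid, mapped flat stack
    ; (the SS base is ignored in 64-bit mode)
    push compat_code_sel            ; pushed as a 64-bit value
    lea rax,[rel compat_entry]      ; target must lie below 4G
    push rax
    retfq                           ; 64-bit far return: pops RIP, then CS

    bits 32
compat_entry:                       ; now executing 32-bit code in compatibility mode
    jmp compat_entry                ; placeholder: real code would reload SS and data selectors here
```

The 64-bit operand size matters: a plain `retf` assembled in 64-bit mode pops 4-byte values, which does not match what the two 8-byte pushes put on the stack.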