
64-bit higher-half [SOLVED]

Posted: Tue Apr 24, 2012 4:45 pm
by bluemoon
I'm a bit confused about the 64-bit addressing model. In my startup code I set up an identity-mapped PML4>PDPTE>PDE>PTE hierarchy and reload CR3.
This works fine. I double-checked by removing an entry in the PTE, and it crashes as expected, so paging is in effect.

Then I tried to make it "higher half", but it crashes when I jump to the high address.

My code:

Code: Select all

cpu x86-64
bits 64

global bootstrap
extern kprintf, kmain

KERNEL_PMA      equ (0x00100000)
KERNEL_VMA      equ (0xFFFFFFFF80000000 + KERNEL_PMA)
%define PMA(x)  ((x) - KERNEL_VMA + KERNEL_PMA)


section .bss
; ----------------------------------------------
align 4096
k_PML4    resb    4096
k_PDPTE   resb    4096
k_PDE     resb    4096
k_PTE0    resb    4096
k_PTE1    resb    4096


section .data
; ----------------------------------------------
msg db 'HighHalf = %X', 0


section .text
; ----------------------------------------------
bootstrap:

    ; Setup page tables
    ; ----------------------------------------------
    mov     rdi, PMA(k_PML4)
    mov     qword [rdi], PMA(k_PDPTE) +1
    mov     qword [rdi+4096],   PMA(k_PDE) +1
    mov     qword [rdi+8192],   PMA(k_PTE0) +1
    ; mov     qword [rdi+8192+8], PMA(k_PTE1) +1
    ; higher half
    mov     qword [rdi+((KERNEL_VMA>>39)&0x1FF) *8], PMA(k_PDPTE) +1
    mov     qword [rdi+4096+((KERNEL_VMA>>30)&0x1FF) *8], PMA(k_PDE) +1

    mov     rdi, PMA(k_PTE0)
    mov     rcx, 512*2
    mov     rax, 3
.1:
    mov     [rdi], rax
    add     rax, 0x1000
    add     rdi, 8
    loop    .1

    mov     rdi, PMA(k_PML4)
    mov     cr3, rdi
    mov     rcx, qword .HigherHalf

; if i put hlt here, it halt with no problem, so the new page structure covered low addresses.
; hlt

    jmp     rcx ; crash here

.HigherHalf:
    mov     edi, 0xB8000
    mov     rax, 0x1F311F321F331F34
    mov     ecx, 500
    rep     stosq
    hlt
The related disassembly:

Code: Select all

Filling page structures:

ffffffff80100000 <.text>:
ffffffff80100000:	48 bf 00 40 10 00 00 	movabs $0x104000,%rdi
ffffffff80100007:	00 00 00
ffffffff8010000a:	48 c7 07 01 50 10 00 	movq   $0x105001,(%rdi)
ffffffff80100011:	48 c7 87 00 10 00 00 	movq   $0x106001,0x1000(%rdi)
ffffffff80100018:	01 60 10 00 
ffffffff8010001c:	48 c7 87 00 20 00 00 	movq   $0x107001,0x2000(%rdi)
ffffffff80100023:	01 70 10 00 
ffffffff80100027:	48 c7 87 f8 0f 00 00 	movq   $0x105001,0xff8(%rdi)
ffffffff8010002e:	01 50 10 00 
ffffffff80100032:	48 c7 87 f0 1f 00 00 	movq   $0x106001,0x1ff0(%rdi)

The jmp:
ffffffff80100077:	48 b9 83 00 10 80 ff 	movabs $0xffffffff80100083,%rcx
ffffffff8010007e:	ff ff ff 
ffffffff80100081:	ff e1                	jmpq   *%rcx

The higher Half code:
ffffffff80100083:	bf 00 80 0b 00       	mov    $0xb8000,%edi
ffffffff80100088:	48 b8 34 1f 33 1f 32 	movabs $0x1f311f321f331f34,%rax
ffffffff8010008f:	1f 31 1f 
ffffffff80100092:	b9 f4 01 00 00       	mov    $0x1f4,%ecx
ffffffff80100097:	f3 48 ab             	rep stos %rax,%es:(%rdi)
ffffffff8010009a:	f4                   	hlt    
And if it matters, I compile the other .cpp files with -mcmodel=kernel.
The above code is assembled with nasm -f elf64
and linked together with x86_64-elf-ld -nostdinc -nostdlib -nodefaultlibs -Tkernel.ld ...


EDIT: SOLVED

Re: 64-bit higher-half

Posted: Tue Apr 24, 2012 6:11 pm
by Rudster816
First off, you're assuming that k_PML4, k_PDPTE, etc. will be consecutive in the BSS section. This is a very dangerous assumption, because if the linker decides to move stuff around to optimize the space, you'll run into huge problems. Secondly, you're not giving the variables the correct names, which has made it very difficult for me to read (more on this later). I think the source of your problem is that you're using the same PML4\PDPTE for both the lower\upper halves.

The 4 KB PML4 (whose address is stored in CR3) holds 512 PML4Es, not PDPTEs. Each PML4E points to a table of 512 PDPTEs, each PDPTE to a table of 512 PDEs, and each PDE to a table of 512 PTEs.
CR3: Maps all 256TB
PML4E: Maps 512GB
PDPTE: Maps 1GB
PDE: Maps 2MB
PTE: Maps 4KB

Your names imply you're storing 512 PDPTEs in the table CR3 points to, but it works because PML4Es and PDPTEs have similar semantics. If you want to map 0xFFFFFFFF80000000 to 0x0000...0, you'll need to use two PML4Es (at index 0 and 511 for the lower and upper halves respectively). Then point PDPTE[0] under PML4E[0] and PDPTE[510] under PML4E[511] at the same page directory, which identity maps however much space you desire.
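
To double-check which entry a given address uses at each level, here is a minimal Python sketch. It is not from the thread: the function name `split_va` is made up for illustration, and it assumes standard 4-level paging with 4 KiB pages.

```python
def split_va(va):
    """Split a 64-bit virtual address into its 4-level paging indices.

    Assumes x86-64 4-level paging with 4 KiB pages: each table index
    is 9 bits wide and the page offset is the low 12 bits.
    """
    return {
        "pml4": (va >> 39) & 0x1FF,    # index into the PML4
        "pdpt": (va >> 30) & 0x1FF,    # index into the PDPT
        "pd": (va >> 21) & 0x1FF,      # index into the page directory
        "pt": (va >> 12) & 0x1FF,      # index into the page table
        "offset": va & 0xFFF,          # byte offset within the 4 KiB page
    }


# bluemoon's KERNEL_VMA from the first post
print(split_va(0xFFFFFFFF80100000))
```

For bluemoon's KERNEL_VMA this gives PML4 index 511 and PDPT index 510, i.e. byte offsets 0xFF8 and 0xFF0 within their tables. That agrees with the 0xff8 and 0x1ff0 displacements in the disassembly above (the second one is relative to the PML4, which sits one page below the PDPT).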



My bootloader sets up higher half addressing for my kernel by 1:1 mapping 0GB->4GB Physical to the top 4GB Virtual 0xFFFFFFFF00000000->0xFFFFFFFFFFFFFFFF, which IMO is way more elegant (and easier to set up) than Linux's 2GB scheme. Going from Virtual to Physical is as easy as chopping off the top 32 bits, which can be accomplished in assembly by doing

Code: Select all

mov edi, edi
which zeroes out the top half of RDI.

Here's the real mode assembly routine I use to setup paging in my bootloader. Note that it uses 2MB pages, not 4KB.

Code: Select all

InitPageTable:						; Initial 64 bit paging setup. Identity maps first 4GB of address space
	pushad							; It also maps the 0xFFFFFFFF00000000 -> 0xFFFFFFFFFFFFFFFF region to
	push es							; the first 4GB of physical address space
	mov ax, 0x5000
	mov es, ax
	mov cx, 0x4000
	xor di, di
	xor eax, eax
	rep stosd						; Zero out the 64kb region from 0x50000 -> 0x5FFFF inclusive
	
	mov DWORD [es:0x0], 0x51007		; PML4E for first 512GB of address space.
	mov DWORD [es:0xFF8], 0x52007	; PML4E for the last 512GB
	
	mov DWORD [es:0x1000], 0x53007	; Set 4 PDPTE entries for the first 4GB of address space
	mov DWORD [es:0x1008], 0x54007
	mov DWORD [es:0x1010], 0x55007
	mov DWORD [es:0x1018], 0x56007
	
	mov DWORD [es:0x2FE0], 0x53007
	mov DWORD [es:0x2FE8], 0x54007
	mov DWORD [es:0x2FF0], 0x55007
	mov DWORD [es:0x2FF8], 0x56007
	
	mov ebp, 0x3000
	mov ecx, 2048
	mov edx, 0x87					; The flags
	.Add2MBEntry:
		mov [es:ebp], edx
		add ebp, 8
		add edx, 0x200000			; 2MB
		dec ecx
		jnz .Add2MBEntry
	
	pop es
	popad
	ret

Re: 64-bit higher-half

Posted: Tue Apr 24, 2012 11:27 pm
by sounds
@bluemoon: Rudster816 is right on, double-check the things he suggested and you'll probably find the problem.

@Rudster816: Did you mean going from Virtual to Physical?
Rudster816 wrote:My bootloader sets up higher half addressing for my kernel my 1:1 mapping 0GB->4GB Physical to the top 4GB Virtual 0xFFFFFFFF00000000->0xFFFFFFFFFFFFFFFF, which IMO is way more elegant (and easier to setup) than Linux's 2GB scheme. Going from Physical to Virtual is as easy as chopping off the top 32 bits
If I was going from physical 0xB8000 (a useful address) to virtual 0xFFFFFFFF000B8000, I would need to turn on the top 32 bits.

If I was going from virtual to physical, I could just chop them off.

Still that's a nice opcode to remember. mov edi, edi is a clever way to zero out the top 32 bits.
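
The two directions can be written out as a small Python sketch. This is only an illustration of the arithmetic under Rudster816's scheme (first 4 GiB of physical memory mapped at the top 4 GiB of the address space); the names `HIGH_BASE`, `virt_to_phys` and `phys_to_virt` are made up here.

```python
HIGH_BASE = 0xFFFFFFFF00000000  # start of the top 4 GiB of the 64-bit address space


def virt_to_phys(va):
    """Chop off the top 32 bits, which is what `mov edi, edi` does to RDI."""
    return va & 0xFFFFFFFF


def phys_to_virt(pa):
    """Turn on the top 32 bits; only meaningful for the first 4 GiB of physical memory."""
    if pa >> 32:
        raise ValueError("physical address is outside the mapped 4 GiB window")
    return pa | HIGH_BASE


print(hex(phys_to_virt(0xB8000)))
print(hex(virt_to_phys(0xFFFFFFFF000B8000)))
```

Physical 0xB8000 becomes virtual 0xFFFFFFFF000B8000, and masking the top 32 bits takes it back again.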

Re: 64-bit higher-half

Posted: Wed Apr 25, 2012 4:01 am
by iansjack
Is that mov documented? Wouldn't

Code: Select all

movzx %edi, %rdi
be a safer choice?

Re: 64-bit higher-half

Posted: Wed Apr 25, 2012 5:27 am
by Rudster816
iansjack wrote:Is that mov documented? Wouldn't

Code: Select all

movzx %edi, %rdi
be a safer choice?
Yes, it's architecturally defined behavior (all x64 chips must do it). When you write to a 32-bit register, the top half of the corresponding 64-bit register is zeroed out. You can see GCC exploit this by sending it a fragment like this

Code: Select all

unsigned long foo = 0x0123456789ABCDEF;
foo = foo & 0xDEAD;
You should see and Exx, 0xDEAD rather than and Rxx, 0xDEAD to avoid a REX prefix.
sounds wrote: @Rudster816: Did you mean going from Virtual to Physical?
Yes I did.

Re: 64-bit higher-half

Posted: Wed Apr 25, 2012 5:35 am
by Gigasoft
iansjack wrote:Is that mov documented?
That's certainly a tough question. Isn't there anything we can think of that would allow us to get to the bottom of this mystery once and for all? Perhaps there's some piece of evidence that has yet to turn up? Come on, think hard! I believe that you can do it!

Oh, I just remembered something... Way back in computer engineering school, I think the teacher mentioned something important. If I remember correctly, it was called a Man-Youll, or something... I don't know if it's even remotely relevant, but it might be worth looking into?

Re: 64-bit higher-half

Posted: Wed Apr 25, 2012 5:44 am
by Gigasoft
Rudster816 wrote:First off, you're assuming that k_PML4, k_PDPTE, etc will be consecutive in the BSS section. This is a very dangerous assumption, because if the linker decides to move stuff around to optimize the space, you'll run in to huge problems.
It doesn't. Linkers don't decide to move stuff around. These are just labels within a section, and the linker doesn't even know about them.

However, the "align 4096" directive is useless, as it just aligns data within the section. It should be "section .bss align=4096".

Re: 64-bit higher-half

Posted: Wed Apr 25, 2012 6:36 am
by Combuster
iansjack wrote:Is that mov documented? Wouldn't

Code: Select all

movzx %edi, %rdi
be a safer choice?
Not using movzx is the better choice. It is documented:
AMD Manual wrote:The high 32 bits of doubleword operands are zero-extended to 64 bits, but the high bits of word and byte operands are not modified by operations in 64-bit mode
Your movzx r64,r32 opcode should fail assembly because it doesn't even exist. Only a movsxd r64, rm32 exists for the explicit sign extended version.

They intentionally did that to prevent a source of stalls. If all 64 bits are written by both 32-bit and 64-bit operations, then neither the processor nor the assembly code needs to mask out bits when converting between ints and pointers:

Code: Select all

mov rdi, [pointer]   ; 64-bit pointer
; xor rcx, rcx       ; not necessary, save 3 bytes.
mov ecx, [index]     ; 32-bit uint index. Has no dependency on existing rcx[32:63], and therefore no partial register stall.
mov eax, [rdi + rcx] ; return pointer[index];
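
The rule quoted from the AMD manual can be modelled in a few lines of Python. This is only a sketch of the register-write semantics in 64-bit mode, with made-up helper names (`write32`, `write16`), not a description of how any real tool works.

```python
MASK64 = (1 << 64) - 1


def write32(old_reg, value):
    """Model a 32-bit write (e.g. `mov edi, ...`): the result is zero-extended,
    so the old upper 32 bits of the 64-bit register are discarded."""
    return value & 0xFFFFFFFF


def write16(old_reg, value):
    """Model a 16-bit write (e.g. `mov di, ...`): only bits 0..15 change,
    bits 16..63 of the old register value are preserved."""
    return (old_reg & MASK64 & ~0xFFFF) | (value & 0xFFFF)


rdi = 0xFFFFFFFF000B8000
# `mov edi, edi` writes the low 32 bits of RDI back into EDI
print(hex(write32(rdi, rdi & 0xFFFFFFFF)))
# `mov di, di` leaves the register unchanged
print(hex(write16(rdi, rdi & 0xFFFF)))
```

In this model `mov edi, edi` turns 0xFFFFFFFF000B8000 into 0xB8000, while the 16-bit equivalent changes nothing.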

Re: 64-bit higher-half

Posted: Wed Apr 25, 2012 9:32 am
by iansjack
Gigasoft wrote: Oh, I just remembered something... Way back in computer engineering school, I think the teacher mentioned something important. If I remember correctly, it was called a Man-Youll, or something...
Thanks for that fascinating information, which I obviously hadn't thought of. I'm not sure what this "Man-Youll" of yours is, so instead I checked in the Intel Programmer's Reference. I couldn't find that behaviour documented, but it is 4,128 pages so I guess that I missed something.
Combuster wrote:Your movzx r64,r32 opcode should fail assembly because it doesn't even exist.
My bad. I was getting confused with the movsx instruction. It's an interesting difference between the way things work with 64-bit registers and 32-bit ones;

Code: Select all

mov %di, %di
certainly doesn't zero the top half of %edi.

Thank you Combuster for your polite response and your patient explanation.

Re: 64-bit higher-half

Posted: Wed Apr 25, 2012 11:07 am
by JAAman
iansjack wrote:
Gigasoft wrote: Oh, I just remembered something... Way back in computer engineering school, I think the teacher mentioned something important. If I remember correctly, it was called a Man-Youll, or something...
Thanks for that fascinating information, which I obviously hadn't thought of. I'm not sure what this "Man-Youll" of yours is, so instead I checked in the Intel Programmer's Reference. I couldn't find that behaviour documented, but it is 4,128 pages so I guess that I missed something.
Check (amongst other places) Intel 1:3.4.1.1 (General-Purpose Registers in 64-Bit Mode).
It's also mentioned in Volume 2 under the MOV instruction, and in some other places.

Re: 64-bit higher-half

Posted: Wed Apr 25, 2012 11:25 am
by bluemoon
Thanks for all the advice, I should clean up the code once it works.

But here are some more weird things:

1. It crashes even when I do an r64 absolute jump to a low address (I double-checked that the address is at the 1 MiB mark, for the clear-screen function).
ie. mov rcx, 0x001000XX; jmp rcx

2. I can write to a high address with no problem.
ie mov rdi, foo <~ rdi = 0xffffffff8000XXXX
mov [rdi], rcx
It does not crash.

3. I repeated (2) with the higher-half mapping removed from the PML4E, and it crashes as expected.


So, am I missing some setting for the 64-bit environment? Do I need to do anything special to jump high, or does the indirect jump not work the way I imagined?

PS. I should spend some time to make gdb work on mac for x86_64, I'm getting "packet too long" with my current configuration :(

Re: 64-bit higher-half

Posted: Wed Apr 25, 2012 11:27 am
by iansjack
JAAman wrote: check (amongst other places) Intel 1:3.4.1.1 (General-Purpose Registers in 64-Bit Mode)
its also mentioned in volume#2 under the MOV instruction, and some other places
Thanks for that. I obviously need new reading glasses as I still can't find it in the documentation of the MOV instruction in Vol 2, but I see it in the reference in Vol. 1.

I'll try to memorize the manuals before posting again! :oops:

Re: 64-bit higher-half (SOLVED)

Posted: Wed Apr 25, 2012 11:37 am
by bluemoon
OMG, how could I have missed this! Luckily I smelled the bug from the 3 findings in my last post. :P

The whole mystery is that 0xffffffff80100000 (i.e. the kernel address) does not correspond exactly to the 1 MiB load address.
There is ELF header stuff there.

A quick workaround is to align the text section and use kernel base = 0xffffffff80100000 + 4096, until I have had enough sleep and am in the mood to fix that :roll:

Anyway, it works now. Thanks all.

Re: 64-bit higher-half [SOLVED]

Posted: Wed Apr 25, 2012 12:44 pm
by bluemoon
I would like to share the code here:

Code: Select all

cpu x86-64
bits 64


global bootstrap
extern kprintf, kmain



KERNEL_PMA      equ (0x00100000)
ZERO_VMA        equ (0xFFFFFFFF80000000)
KERNEL_VMA      equ (ZERO_VMA + KERNEL_PMA)
%define PMA(x)  ((x) - ZERO_VMA)


section .bss
; ----------------------------------------------
align 4096
k_PML4E    resb    4096
k_PDPTE    resb    4096
k_PDE      resb    4096     ; cover 0~4MiB


section .data
; ----------------------------------------------


section .text
; ----------------------------------------------
bootstrap:

    ; Setup page tables
    ; ----------------------------------------------
    mov     rdi, PMA(k_PML4E)
    mov     rsi, rdi

    mov     qword [rdi], PMA(k_PDPTE) +3                                    ; First 512 GiB
    mov     qword [rdi + ((KERNEL_VMA>>39)&511)*8], PMA(k_PDPTE) +3         ; Kernel/Last 512 GiB

    mov     rdi, PMA(k_PDPTE)
    mov     qword [rdi], PMA(k_PDE) +3                                      ; 0~1 GiB
    mov     qword [rdi + ((KERNEL_VMA>>30)&511)*8], PMA(k_PDE) +3           ; Kernel's 1 GiB

    mov     rdi, PMA(k_PDE)
    mov     qword [rdi   ], 0x83                                            ; 0~2 MiB
    mov     qword [rdi +8], 0x83 + 0x200000                                 ; 2~4 MiB

    mov     cr3, rsi
    mov     rcx, .HigherHalf
    jmp     rcx

.HigherHalf:
    mov     rdi, k_PML4E
    mov     qword [rdi], 0                                                  ; Remove First 512 GiB
    mov     rdi, k_PDPTE
    mov     qword [rdi], 0                                                  ; Remove 0~1 GiB
    mov     cr3, rsi

    mov     rdi, ZERO_VMA+0xB8000
    mov     rax, 0x1F311F321F331F34
    mov     ecx, 500
    rep     stosq
    hlt
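
As a way to sanity-check a layout like this without booting it, the table setup can be mirrored in a small Python model and walked in software. This is only a sketch: the physical table addresses `PML4`, `PDPT` and `PD` are hypothetical placeholders, the helpers `build_tables` and `translate` are made up for illustration, and the walker only understands the 2 MiB pages used above.

```python
KERNEL_PMA = 0x00100000
ZERO_VMA = 0xFFFFFFFF80000000
KERNEL_VMA = ZERO_VMA + KERNEL_PMA

PRESENT = 0x01                    # P bit
PAGE_SIZE_BIT = 0x80              # PS bit: the PDE maps a 2 MiB page
ADDR_MASK = 0x000FFFFFFFFFF000    # physical address bits of an entry

# Hypothetical physical addresses of the three tables (assumption, not from the post)
PML4, PDPT, PD = 0x104000, 0x105000, 0x106000


def build_tables():
    """Mirror the stores in the assembly: sparse dict of entry address -> entry."""
    mem = {}
    mem[PML4] = PDPT + 3                                  # first 512 GiB
    mem[PML4 + ((KERNEL_VMA >> 39) & 511) * 8] = PDPT + 3  # last 512 GiB
    mem[PDPT] = PD + 3                                    # 0~1 GiB
    mem[PDPT + ((KERNEL_VMA >> 30) & 511) * 8] = PD + 3    # kernel's 1 GiB
    mem[PD] = 0x83                                        # 0~2 MiB
    mem[PD + 8] = 0x83 + 0x200000                         # 2~4 MiB
    return mem


def translate(mem, cr3, va):
    """Walk PML4 -> PDPT -> PD; return the physical address, or None on a fault."""
    table = cr3
    for shift in (39, 30):
        entry = mem.get(table + ((va >> shift) & 511) * 8, 0)
        if not entry & PRESENT:
            return None
        table = entry & ADDR_MASK
    pde = mem.get(table + ((va >> 21) & 511) * 8, 0)
    if not pde & PRESENT or not pde & PAGE_SIZE_BIT:
        return None  # not present, or a 4 KiB page table this sketch does not walk
    return (pde & ADDR_MASK & ~0x1FFFFF) + (va & 0x1FFFFF)


mem = build_tables()
print(hex(translate(mem, PML4, KERNEL_VMA)))
print(hex(translate(mem, PML4, ZERO_VMA + 0xB8000)))

# Equivalent of the two "Remove ..." stores after the jump
mem[PML4] = 0
mem[PDPT] = 0
print(translate(mem, PML4, 0xB8000))
```

In this model KERNEL_VMA resolves to 0x100000 and ZERO_VMA+0xB8000 to 0xB8000, and after the two entries are cleared the low address 0xB8000 no longer translates. Because both PML4Es share one PDPT, PDPTE[0] is also reachable through PML4E[511] (at 0xFFFFFF8000000000), which is presumably why the code clears it as well.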