64-bit memory models


Post by NickJohnson »

Hi,

I know the fundamental difference between the -mcmodel=small and -mcmodel=large 64-bit memory models that GCC supports: the small model uses 32-bit relative calls for functions, while the large model uses absolute 64-bit pointers (which adds a lot of overhead for PIC).

However, there is a distinct -mcmodel=kernel option that seems to do the same thing as the small memory model, except that it works for the highest 2GB of virtual memory instead of the lowest. This seems strange, since both should be using relative jumps, so it shouldn't matter where in the address space they are linked, as long as everything is contained within a 2GB window. If I use PIC, can I load a -mcmodel=small or -mcmodel=kernel binary anywhere in the address space? Or am I missing something?

Also, I've heard that PIC on x86-64 is basically free because of IP-relative addressing. Is that correct?
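
To make the 2GB window concrete: a near call or jump on x86-64 encodes a signed 32-bit displacement measured from the address of the next instruction, so only the distance between the two matters, not where the pair sits in the address space. A minimal sketch of that check in C follows; the helper name `reachable_rel32` is made up for illustration, and it relies on GCC's wrapping conversion from unsigned to signed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: can a rel32 call/jmp whose *next* instruction sits at
 * next_ip reach target? The displacement has to fit in a signed 32-bit
 * field, i.e. lie in [-2GB, +2GB). */
static bool reachable_rel32(uint64_t next_ip, uint64_t target)
{
    /* Unsigned subtraction wraps; the cast reinterprets it as a signed
     * distance (implementation-defined in ISO C, modulo 2^64 on GCC). */
    int64_t disp = (int64_t)(target - next_ip);
    return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

Under this check a kernel linked at -2GB can reach its own functions, and so can a program in the low 2GB, but neither can reach the other with a relative branch.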

Post by bluemoon »

Take a look at the code generation (here I'm guessing):

Code:

-mcmodel=large:
  mov rdx, offset64
  add rdi, [rdx]
  mov rax, position64
  jmp rax

-mcmodel=small:
  add rdi, [offset32]    ; 32-bit absolute address
  jmp position32         ; 32-bit relative jump

-mcmodel=kernel:
  works like small, but relies on 32-bit addresses being sign-extended

-fPIC (compare DEFAULT REL in nasm):
  on x86_64, generated code uses RIP-relative addressing

NickJohnson wrote:If I use PIC, can I load a -mcmodel=small or -mcmodel=kernel binary anywhere in the address space?
Things may work, but once a relative offset exceeds 2GB things break (or you can't link at all).

I didn't make my kernel PIC, but IIRC -fPIC does not mix well with some -mcmodel switches. For drivers I just compile with -fPIC and no -mcmodel:

Code:

CFLAGS   =-ffreestanding -masm=intel -std=c99 -O2 -fPIC -mno-red-zone \
          -mno-mmx -mno-sse -mno-sse2 -mno-sse3 -mno-3dnow -c -D_$(ARCH) $(INCLUDE) -Werror $(CWARN)

Post by NickJohnson »

Does that mean that I can load a mcmodel=small binary anywhere as long as it has no external references, and it is smaller than 2GB?

Post by bluemoon »

NickJohnson wrote:Does that mean that I can load a mcmodel=small binary anywhere as long as it has no external references, and it is smaller than 2GB?
I'm not sure, but may I ask a few questions so that I (and other people on the forum) can give some advice on how this is normally done:

1. Which parts do you want to be PIC? (e.g. the kernel startup stages, the whole kernel, drivers, applications, etc.)
2. What is the address layout of your design? This may place some constraints.
e.g. kernel @ -2GB, page structure @ -123G, drivers @ -1T, application 0-1234G, etc.

Post by Brendan »

Hi,
NickJohnson wrote:Does that mean that I can load a mcmodel=small binary anywhere as long as it has no external references, and it is smaller than 2GB?
Calls/returns and jumps are not the only cases where the (absolute or relative) address of code matters - you're forgetting function pointers.

For "-mcmodel=small", does the compiler ever generate something like "mov eax, address_of_function" (note: zero extended into RAX)?

Similarly; for "-mcmodel=kernel", does the compiler ever generate something like "movsx rax, dword address_of_function"?


Cheers,

Brendan
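
The two cases in the questions above differ only in how a 32-bit value is widened to 64 bits. As a rough sketch in C (all helper names here are invented for illustration, and the signed casts assume GCC's wrapping conversions), an address survives a 32-bit encoding only if widening its low half gives the same address back:

```c
#include <stdbool.h>
#include <stdint.h>

/* What "mov eax, imm32" leaves in RAX: the upper 32 bits are cleared. */
static uint64_t zero_extend32(uint32_t imm)
{
    return (uint64_t)imm;
}

/* What a sign-extended 32-bit immediate or displacement produces. */
static uint64_t sign_extend32(uint32_t imm)
{
    return (uint64_t)(int64_t)(int32_t)imm;
}

/* Does the address round-trip through a zero-extended 32-bit field? */
static bool survives_zero_ext(uint64_t addr)
{
    return zero_extend32((uint32_t)addr) == addr;
}

/* Does the address round-trip through a sign-extended 32-bit field? */
static bool survives_sign_ext(uint64_t addr)
{
    return sign_extend32((uint32_t)addr) == addr;
}
```

Addresses in the low 2GB pass both checks, addresses in the top 2GB pass only the sign-extension check, and something around -1TB passes neither, which lines up with where the small and kernel models expect code to be linked.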

Post by Cognition »

NickJohnson wrote:Does that mean that I can load a mcmodel=small binary anywhere as long as it has no external references, and it is smaller than 2GB?
You can put it anywhere if you use -fPIC on an x86-64 processor. Your code will just default to using RIP-relative addressing for its data as well, which might entail some amount of overhead. I've been compiling my kernel this way for ages though, and it's definitely possible.

Post by Owen »

-mcmodel=small will do the following for static data references:

Code:

    mov $symbol, register

-fPIC will do the following:

Code:

    lea symbol(%rip), register

-mcmodel=small is slightly more efficient in some cases, because it leaves the compiler free to "abuse" lea for more complex pointer arithmetic and such, and allows base addresses to be embedded in instructions where otherwise they would have to be precomputed.
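
A small C file makes the two forms easy to compare by compiling it with `gcc -O2 -S` under each set of flags. This is only a sketch: the names `counter` and `counter_address` are invented, and the instructions quoted in the comment are what x86-64 GCC typically emits, which can vary with compiler version and distribution defaults (many toolchains enable PIE by default, so the non-PIC case may need `-fno-pie`).

```c
/* File-local data, so no GOT is involved in either mode. */
static int counter;

int *counter_address(void)
{
    /*
     * Typical code for this return statement (assumption, check with -S):
     *   -mcmodel=small -fno-pie :  movl $counter, %eax       (absolute imm32)
     *   -fPIC                   :  leaq counter(%rip), %rax  (RIP-relative)
     */
    return &counter;
}
```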

Post by NickJohnson »

Brendan wrote:Calls/returns and jumps are not the only cases where the (absolute or relative) address of code matters - you're forgetting function pointers.
Ah, of course. It's loading absolute locations of data and code using 32-bit immediates, instead of 64-bit immediates or 32-bit offsets from RIP.

Thanks. I'll probably just compile certain things as PIC instead; the overhead is probably negligible.
bluemoon wrote:2. What is the address layout of your design? This may place some constraints.
e.g. kernel @ -2GB, page structure @ -123G, drivers @ -1T, application 0-1234G, etc.
I have a bit of an odd layout. The kernel is split into two halves: one manages threading/paging/interrupts and lives at -2GB (with a Linux-style identity-mapped region for its data, including paging structures); the other does everything else, runs in ring 1, and probably lives somewhere around -1TB. Userspace is the lower half, as usual. The first half is going to be compiled with -mcmodel=kernel, and the rest of the system is probably going to be PIC.

Post by bluemoon »

In that case it's similar to my kernel:

Code:

// Reserved: -0.5GB ~ TOP
// ----------------------------------------------

// MMU Page Allocator (-1GB ~ -0.5GB)
// for 256GB memory there is 67108864 x 4K pages, requires 512 MiB
// ----------------------------------------------
#define KADDR_MMU_STACK         (0xFFFFFFFFC0000000)

// Kernel Zone (-2GB ~ -1GB)
// ----------------------------------------------
#define KADDR_ZERO_VMA          (0xFFFFFFFF80000000)
#define KADDR_KERNEL_VMA        (KADDR_ZERO_VMA + KADDR_KERNEL_PMA)
#define KADDR_BOOTDATA          (KADDR_ZERO_VMA + 0x0600)
#define KADDR_PMA(x)            (((uintptr_t)(x)) - KADDR_ZERO_VMA)

// Global Resource (-3GB ~ -2GB)
// Frame buffer, DMA Buffers, MMIO
// ----------------------------------------------
#define KADDR_GLOBAL_RESOURCE   (0xFFFFFFFF40000000)

// Kernel Modules (-4GB ~ -3GB)
// ----------------------------------------------
#define KADDR_DRIVER            (0xFFFFFFFF00000000)

// MMU recursive mapping (-512GB ~ -256GB)
// ----------------------------------------------
#define MMU_RECURSIVE_SLOT      (510UL)

// User-space address layout (0MB ~ 256GB)
// ----------------------------------------------

The kernel (and critical data) lives at -2GB, where kernel code can access it easily, and there are zones in other regions (drivers, global memory, etc.) which are either loaded from -fPIC modules or just allocated memory.
Note that the kernel can still directly access any address with a 64-bit pointer (i.e. void* p = (void*)0xFFFFFFFF00badbad;).
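
The arithmetic behind that header can be checked with a few lines of C. This sketch copies `KADDR_ZERO_VMA` and `KADDR_PMA` from the listing; `KADDR_VMA`, `page_count` and `page_stack_bytes` are hypothetical additions, and the 8-byte-per-page entry size is an assumption chosen because it reproduces the 512 MiB figure in the comment. It assumes a 64-bit target where `uintptr_t` is 64 bits wide.

```c
#include <stdint.h>

#define KADDR_ZERO_VMA  (0xFFFFFFFF80000000ULL)
#define KADDR_PMA(x)    (((uintptr_t)(x)) - KADDR_ZERO_VMA)

/* Hypothetical inverse of KADDR_PMA; not part of the original header. */
#define KADDR_VMA(p)    (((uintptr_t)(p)) + KADDR_ZERO_VMA)

enum { PAGE_SHIFT = 12 };                  /* 4 KiB pages */

/* Number of 4 KiB pages in ram_bytes of physical memory. */
static uint64_t page_count(uint64_t ram_bytes)
{
    return ram_bytes >> PAGE_SHIFT;
}

/* Size of a page stack holding one 8-byte entry per page (assumed size). */
static uint64_t page_stack_bytes(uint64_t ram_bytes)
{
    return page_count(ram_bytes) * sizeof(uint64_t);
}
```

With 256 GiB of RAM this gives 67108864 pages and a 512 MiB stack, which fits in the -1GB to -0.5GB zone, and `KADDR_ZERO_VMA` read as a signed value is exactly -2GB.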

Post by NickJohnson »

Hi again,

One more issue I've been having with PIC on x86_64. I'd like to be able to compile the kernel with PIC in such a way that it has no GOT to fix up (which should be technically possible, as the kernel makes no external references, so the x86_64 can use IP-relative addressing instead) but GCC is generating a GOT for every C file, and then is using it to reference data if that data is not declared static. Is there a way to force GCC to use link-time relocations instead of GOT lookups to access data, when using PIC on x86_64?

Edit: Basically what I'm asking is, why is a GOT still necessary for a PIC binary that doesn't reference data outside of itself? It seems like the IP-relative addressing would make it possible not to use one, but GCC still emits and uses one for internal linkage (but only for global symbols). Is there a way to turn that off? I've been searching all day and can't find a solution to this.

Post by Cognition »

I'd verify that the linker is actually emitting them in your executable file. It's been a while since I poked around with this, but iirc the linker should clean most of it up if you're doing a static linkage at the end anyways. I do recall running into some problems with function pointers as GCC will shove these into the GOT automatically in a PIC linkage model.

Post by NickJohnson »

It does seem to be using the GOT in the final executable. There is no dynamic section or relocations, but the instruction used to get the address of a global variable (to be passed to a print function) is a mov rdx, [rip+offset_to_GOT_entry] instead of a lea rdx, [rip+offset_to_data] . I thought it might fix them up as well, but I can't see a way that the linker could modify an already-generated mov instruction to a lea instruction, so I'm not even sure that's technically possible. I definitely need to get GCC to emit PC32 instead of GOTPCREL relocations when generating the object files.

Post by Owen »

There's probably a flag to configure this, but here you're running into the Unix Way of linking, in which everything goes in the GOT in case some external library wants to override it...

Try -fPIE?

Re: 64-bit memory models [SOLVED]

Post by NickJohnson »

Yeah, I tried -fPIE as well. No combination of obvious-looking code generation flags documented by GCC seems to make a difference.

Edit: Figured it out. As you said, the compiler was putting entries in the GOT in case they needed to be used/overridden by other objects; setting -fvisibility=hidden makes GCC realize that you don't want to do any dynamic linking afterward, and fixes the problem.

In short:

Code:

gcc -fPIC -fvisibility=hidden

gives you PIC binaries with only IP-relative data addressing and no GOT. The GCC docs claim there should be no issues with extern.
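
The same effect can be requested for a single symbol with GCC's visibility attribute, which may be handy when only part of a build should be treated this way. A minimal sketch, with `boot_flags` and `boot_flags_address` as invented names and the instruction forms in the comment being typical rather than guaranteed output:

```c
/* Per-symbol counterpart of -fvisibility=hidden for one definition. */
__attribute__((visibility("hidden"))) int boot_flags;

int *boot_flags_address(void)
{
    /*
     * Under -fPIC a hidden symbol cannot be overridden by another module,
     * so GCC can typically emit
     *     leaq boot_flags(%rip), %rax
     * instead of loading the address from a GOT slot with
     *     movq boot_flags@GOTPCREL(%rip), %rax
     */
    return &boot_flags;
}
```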

Post by Owen »

That should still use the GOT for any symbols not defined in the same translation unit...
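
One way to cover that case, as far as the GCC documentation describes it, is `#pragma GCC visibility`: the `-fvisibility` flag does not change plain `extern` declarations, but declarations inside a `push(hidden)` region are treated as hidden. The sketch below uses an invented symbol `tick_count`; in a real kernel the pragma would more likely wrap a shared header, and the definition would live in a different source file.

```c
/* Declarations between push and pop are treated as hidden, including
 * extern ones, which -fvisibility=hidden alone leaves at default. */
#pragma GCC visibility push(hidden)
extern int tick_count;      /* hypothetical symbol from another file */
#pragma GCC visibility pop

int *tick_count_address(void)
{
    /* With the declaration hidden, -fPIC code can address tick_count
     * RIP-relatively rather than through the GOT (check with gcc -S). */
    return &tick_count;
}

/* Defined here only so that the sketch links on its own. */
int tick_count;
```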