64-bit memory models
- NickJohnson
- Member
- Posts: 1249
- Joined: Tue Mar 24, 2009 8:11 pm
- Location: Sunnyvale, California
64-bit memory models
Hi,
I know the fundamental difference between -mcmodel=large and -mcmodel=small 64-bit memory models that GCC can emit, which is that the small memory model does 32-bit relative calls for functions and the large memory model uses absolute 64-bit pointers (which has a lot of overhead for PIC.)
However, there is a distinct -mcmodel=kernel option that seems to do the same thing as the small memory model, except that it works for the highest 2GB of virtual memory instead of the lowest. This seems strange, since both should be using relative jumps, so it shouldn't matter where in the address space they are linked, as long as everything is contained within a 2GB window. If I use PIC, can I load a -mcmodel=small or -mcmodel=kernel binary anywhere in the address space? Or am I missing something?
Also, I've heard that PIC on x86-64 is basically free because of IP-relative addressing. Is that correct?
Re: 64-bit memory models
Take a look at the code generation (here I'm guessing):
Code:
; -mcmodel=large: full 64-bit absolute addresses
mov rdx, offset64
add rdi, [rdx]
mov rax, position64
jmp rax

; -mcmodel=small: 32-bit absolute addresses
add rdi, [offset32]
jmp [position32]

; -mcmodel=kernel (or DEFAULT REL in nasm)
; works like small, but relies on the 32-bit address being sign-extended

; -fPIC
; on x86_64, the generated code uses RIP-relative addressing
NickJohnson wrote: If I use PIC, can I load a -mcmodel=small or -mcmodel=kernel binary anywhere in the address space?
Things may work, but once the relative offset exceeds 2GB things get bad (or you can't link them at all).
I didn't make my kernel PIC, but IIRC -fPIC does not mix well with some -mcmodel switches. For drivers I just build with -fPIC and no -mcmodel:
Code:
CFLAGS =-ffreestanding -masm=intel -std=c99 -O2 -fPIC -mno-red-zone \
-mno-mmx -mno-sse -mno-sse2 -mno-sse3 -mno-3dnow -c -D_$(ARCH) $(INCLUDE) -Werror $(CWARN)
- NickJohnson
Re: 64-bit memory models
Does that mean that I can load a mcmodel=small binary anywhere as long as it has no external references, and it is smaller than 2GB?
Re: 64-bit memory models
NickJohnson wrote: Does that mean that I can load a mcmodel=small binary anywhere as long as it has no external references, and it is smaller than 2GB?
I'm not sure, but may I ask a few questions so that I (and other people on the forum) can give some advice on how that is normally done:
1. Which part do you want to be PIC? (e.g. the kernel (startup stages), the whole kernel, drivers, applications, etc.)
2. What is the address layout of your design? This may place some constraints.
e.g. kernel @ -2GB, page structure @ -123G, drivers @ -1T, application 0-1234G, etc.
Re: 64-bit memory models
Hi,
NickJohnson wrote: Does that mean that I can load a mcmodel=small binary anywhere as long as it has no external references, and it is smaller than 2GB?
Calls/returns and jumps are not the only cases where the (absolute or relative) address of code matters - you're forgetting function pointers.
For "-mcmodel=small", does the compiler ever generate something like "mov eax, address_of_function" (note: zero extended into RAX)?
Similarly; for "-mcmodel=kernel", does the compiler ever generate something like "movsx rax, dword address_of_function"?
Cheers,
Brendan
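Both questions come down to how a 32-bit immediate turns into a 64-bit address. A minimal sketch in C, assuming the usual x86-64 rules (a 32-bit mov clears the upper half of the register, movsx copies bit 31 upward). The helper names are made up for illustration and are not anything GCC provides:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(sizeof(uintptr_t) == 8, "sketch assumes a 64-bit target");

/* What "mov eax, imm32" leaves in RAX: the upper 32 bits are cleared. */
static uint64_t zero_extend32(uint32_t imm)
{
    return (uint64_t)imm;
}

/* What "movsx rax, dword ..." leaves in RAX: bit 31 is copied into
 * bits 32..63. The uint32_t -> int32_t conversion is implementation-defined
 * in C; GCC and Clang wrap modulo 2^32, which is what this relies on. */
static uint64_t sign_extend32(uint32_t imm)
{
    return (uint64_t)(int64_t)(int32_t)imm;
}

/* Hypothetical check: can this address be rebuilt from its low 32 bits
 * by zero-extension? */
static bool reachable_zero_extended(uint64_t addr)
{
    return zero_extend32((uint32_t)addr) == addr;
}

/* Hypothetical check: can this address be rebuilt from its low 32 bits
 * by sign-extension? */
static bool reachable_sign_extended(uint64_t addr)
{
    return sign_extend32((uint32_t)addr) == addr;
}
```

Addresses below 2GB pass both checks, addresses in the top 2GB pass only the sign-extended one, and something around -1TB passes neither. If the compiler does emit such immediates, that would explain why the two models are tied to opposite ends of the address space.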
Re: 64-bit memory models
NickJohnson wrote: Does that mean that I can load a mcmodel=small binary anywhere as long as it has no external references, and it is smaller than 2GB?
You can put it anywhere if you use -fPIC on an x86-64 processor. Your code will just default to using RIP-relative addressing for its data as well, which might entail some amount of overhead. I've been compiling my kernel this way for ages though and it's definitely possible.
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: 64-bit memory models
-mcmodel=small will do the following for static data references:
Code:
mov $symbol, register
-fPIC will do the following:
Code:
lea symbol(%rip), register
-mcmodel=small is slightly more efficient in some cases, because it leaves the compiler free to "abuse" lea for more complex pointer arithmetic and such, and allows base addresses to be embedded in instructions when they would otherwise have to be precomputed.
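A small file along these lines can be used to compare the two forms. The names counter, table, counter_address and table_slot are illustrative only. Compiling it with something like gcc -O2 -S -mcmodel=small -fno-pie and again with gcc -O2 -S -fPIC should show the difference, though the exact instructions depend on the GCC version and the distribution's defaults:

```c
#include <assert.h>
#include <stddef.h>

static_assert(sizeof(void *) == 8, "sketch assumes an x86-64 (LP64) target");

/* Illustrative static data; the names are made up for this sketch. */
static int counter = 42;
static int table[16];

/* Returning the address forces the compiler to materialise it in a
 * register: an absolute 32-bit immediate without PIC, a RIP-relative
 * lea with -fPIC. */
int *counter_address(void)
{
    return &counter;
}

/* Base plus scaled index. Without PIC the base of table can be folded
 * into a single lea as a 32-bit displacement; with -fPIC the base usually
 * has to be computed RIP-relative first and the index added afterwards. */
int *table_slot(size_t i)
{
    return &table[i];
}
```

The second function is the kind of case where the non-PIC small model can save an instruction, because RIP-relative addressing cannot be combined with an index register.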
- NickJohnson
Re: 64-bit memory models
Brendan wrote: Calls/returns and jumps are not the only cases where the (absolute or relative) address of code matters - you're forgetting function pointers.
Ah, of course. It's loading absolute locations of data and code using 32-bit immediates, instead of 64-bit immediates or 32-bit offsets from RIP.
Thanks. I'll probably just compile certain things as PIC instead; the overhead is probably negligible.
bluemoon wrote: 2. What is the address layout of your design? This may place some constraints.
e.g. kernel @ -2GB, page structure @ -123G, drivers @ -1T, application 0-1234G, etc.
I have a bit of an odd layout: the kernel is split into two halves. One manages threading/paging/interrupts and lives at -2GB (with a Linux-style identity-mapped region for its data, including paging structures); the other does everything else, runs in ring 1, and probably lives somewhere around -1TB. Userspace is the lower half, as usual. The first half is going to be compiled with -mcmodel=kernel, and the rest of the system is probably going to be PIC.
Re: 64-bit memory models
In that case it's similar to my kernel:
The kernel (and critical data) lives at -2GB, where kernel code can reach it easily,
and there are zones in other regions (drivers, global memory, etc.) which are loaded from -fPIC modules or are just allocated memory.
Note that the kernel can still directly access any address through a 64-bit pointer (e.g. void* p = (void*)0xFFFFFFFF00badbad;)
Code:
// Reserved: -0.5GB ~ TOP
// ----------------------------------------------
// MMU Page Allocator (-1GB ~ -0.5GB)
// for 256GB memory there is 67108864 x 4K pages, requires 512 MiB
// ----------------------------------------------
#define KADDR_MMU_STACK (0xFFFFFFFFC0000000)
// Kernel Zone (-2GB ~ -1GB)
// ----------------------------------------------
#define KADDR_ZERO_VMA (0xFFFFFFFF80000000)
#define KADDR_KERNEL_VMA (KADDR_ZERO_VMA + KADDR_KERNEL_PMA)
#define KADDR_BOOTDATA (KADDR_ZERO_VMA + 0x0600)
#define KADDR_PMA(x) (((uintptr_t)(x)) - KADDR_ZERO_VMA)
// Global Resource (-3GB ~ -2GB)
// Frame buffer, DMA Buffers, MMIO
// ----------------------------------------------
#define KADDR_GLOBAL_RESOURCE (0xFFFFFFFF40000000)
// Kernel Modules (-4GB ~ -3GB)
// ----------------------------------------------
#define KADDR_DRIVER (0xFFFFFFFF00000000)
// MMU recursive mapping (-512GB ~ -256GB)
// ----------------------------------------------
#define MMU_RECURSIVE_SLOT (510UL)
// User-space address layout (0MB ~ 256GB)
// ----------------------------------------------
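The arithmetic behind this layout can be checked at compile time. The sketch below copies the constants from the listing above; KADDR_VMA, GiB and kernel_model_reachable are hypothetical additions for illustration, and the 8-byte page stack entry is an assumption inferred from the 512 MiB figure in the comment:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the layout above. */
#define KADDR_MMU_STACK       (0xFFFFFFFFC0000000)
#define KADDR_ZERO_VMA        (0xFFFFFFFF80000000)
#define KADDR_GLOBAL_RESOURCE (0xFFFFFFFF40000000)
#define KADDR_DRIVER          (0xFFFFFFFF00000000)
#define KADDR_PMA(x)          (((uintptr_t)(x)) - KADDR_ZERO_VMA)

/* Hypothetical inverse of KADDR_PMA: physical to kernel-zone virtual. */
#define KADDR_VMA(pma)        (((uintptr_t)(pma)) + KADDR_ZERO_VMA)

#define GiB (1ULL << 30)

/* Each base, expressed as a distance below the top of the address space. */
static_assert((uint64_t)0 - KADDR_MMU_STACK       == 1 * GiB, "MMU stack at -1GB");
static_assert((uint64_t)0 - KADDR_ZERO_VMA        == 2 * GiB, "kernel zone at -2GB");
static_assert((uint64_t)0 - KADDR_GLOBAL_RESOURCE == 3 * GiB, "global resources at -3GB");
static_assert((uint64_t)0 - KADDR_DRIVER          == 4 * GiB, "kernel modules at -4GB");

/* 256GB of RAM in 4K pages, assuming one 8-byte entry per page. */
static_assert(256 * GiB / 4096 == 67108864, "page count for 256GB");
static_assert(67108864ULL * sizeof(uint64_t) == GiB / 2, "page stack needs 512 MiB");

/* True if a 32-bit sign-extended address can name vaddr, i.e. the
 * address lies in the top 2GB that -mcmodel=kernel assumes. */
static inline bool kernel_model_reachable(uintptr_t vaddr)
{
    return vaddr >= KADDR_ZERO_VMA;
}
```

Under these checks the driver and global-resource zones fall outside the top 2GB, which fits with them being reached through -fPIC modules or plain 64-bit pointers rather than kernel-model code.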
- NickJohnson
Re: 64-bit memory models
Hi again,
One more issue I've been having with PIC on x86_64. I'd like to be able to compile the kernel with PIC in such a way that it has no GOT to fix up (which should be technically possible, as the kernel makes no external references, so the x86_64 can use IP-relative addressing instead) but GCC is generating a GOT for every C file, and then is using it to reference data if that data is not declared static. Is there a way to force GCC to use link-time relocations instead of GOT lookups to access data, when using PIC on x86_64?
Edit: Basically what I'm asking is, why is a GOT still necessary for a PIC binary that doesn't reference data outside of itself? It seems like the IP-relative addressing would make it possible not to use one, but GCC still emits and uses one for internal linkage (but only for global symbols). Is there a way to turn that off? I've been searching all day and can't find a solution to this.
Re: 64-bit memory models
I'd verify that the linker is actually emitting them in your executable file. It's been a while since I poked around with this, but iirc the linker should clean most of it up if you're doing a static linkage at the end anyways. I do recall running into some problems with function pointers as GCC will shove these into the GOT automatically in a PIC linkage model.
- NickJohnson
Re: 64-bit memory models
It does seem to be using the GOT in the final executable. There is no dynamic section or relocations, but the instruction used to get the address of a global variable (to be passed to a print function) is a mov rdx, [rip+offset_to_GOT_entry] instead of a lea rdx, [rip+offset_to_data] . I thought it might fix them up as well, but I can't see a way that the linker could modify an already-generated mov instruction to a lea instruction, so I'm not even sure that's technically possible. I definitely need to get GCC to emit PC32 instead of GOTPCREL relocations when generating the object files.
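A reduced test case for this behaviour might look like the following; the variable and function names are invented for the sketch. Compiling it with gcc -O2 -fPIC -c and inspecting the object with objdump -dr should show which relocation each access gets:

```c
#include <assert.h>

static_assert(sizeof(void *) == 8, "sketch is only meaningful on x86-64");

/* Default visibility: under -fPIC the compiler has to assume another
 * module could override this symbol, so its address is typically loaded
 * from the GOT with a mov (a GOTPCREL-style relocation). */
char global_banner[] = "kernel";

/* Internal linkage: nothing outside this file can override it, so a
 * RIP-relative lea with a PC32 relocation is enough. */
static char local_banner[] = "kernel";

const char *global_banner_address(void)
{
    return global_banner;
}

const char *local_banner_address(void)
{
    return local_banner;
}
```

Newer toolchains may emit a relaxable variant of the GOT relocation (R_X86_64_REX_GOTPCRELX) that the linker is allowed to rewrite from mov into lea when the symbol turns out to be local, so results can differ between binutils versions.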
- Owen
Re: 64-bit memory models
There's probably a flag to configure this, but here you're running into the Unix Way of linking, in which everything goes in the GOT in case some external library wants to override it...
Try -fPIE?
- NickJohnson
Re: 64-bit memory models [SOLVED]
Yeah, I tried -fPIE as well. No combination of obvious-looking code generation flags documented by GCC seems to make a difference.
Edit: Figured it out. As you said, the compiler was putting entries in the GOT in case they needed to be used/overridden by other objects; setting -fvisibility=hidden makes GCC realize that you don't want to do any dynamic linking afterward, and fixes the problem.
In short:
Code:
gcc -fPIC -fvisibility=hidden
gives you PIC binaries with only IP-relative data addressing and no GOT. The GCC docs claim there should be no issues with extern.
- Owen
Re: 64-bit memory models
That should still use the GOT for any symbols not defined in the same translation unit...
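This matches the GCC manual's note that -fvisibility does not change the visibility of extern declarations. One way around it is to mark the declarations themselves as hidden, either with #pragma GCC visibility or with the visibility attribute. A minimal sketch, assuming a shared kernel header; all symbol names are hypothetical, and the definitions sit in the same file only so that the example links on its own:

```c
#include <assert.h>

static_assert(sizeof(void *) == 8, "sketch targets x86-64");

/* In a real kernel this part would live in a header included by every
 * translation unit, so the compiler sees each extern symbol as hidden. */
#pragma GCC visibility push(hidden)
extern int boot_flags;              /* hypothetical kernel global */
extern int kernel_log_level(void);  /* hypothetical kernel function */
#pragma GCC visibility pop

/* Same effect for a single declaration, using the attribute form. */
extern int tick_count __attribute__((visibility("hidden")));

/* Definitions; in a real kernel these would be in other .c files. */
int boot_flags = 3;
int tick_count = 0;

int kernel_log_level(void)
{
    return boot_flags & 1;
}

/* With the declarations hidden, accesses like these no longer need to
 * allow for the symbols being overridden by another module. */
int next_tick(void)
{
    return ++tick_count + kernel_log_level();
}
```

Whether the remaining references end up RIP-relative still depends on the compiler version, so checking the object files with objdump -dr is worthwhile.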