Shared library position

Tomako · Post by **Tomako** » Thu Jul 31, 2008 5:45 am

Hello,

I'm trying to implement ELF shared library support, and shared library support in general.

As I am not a native English speaker, it was very difficult for me to understand the ELF specifications (for example, it took me a few days just to understand that sections and segments are not the same thing).

Let me introduce my problem:
Let's say program A requires lib1, which requires lib2.

Program A is loaded at address 0x80000000, lib1 at address 0x90000000 and lib2 at address 0xA0000000.
So everything is mapped into the same virtual address space and we start relocation:
Let's say that a function call in lib1 at address 0x9000002c requires a function found in lib2 at address 0xA0000100.
So we simply write "0xA0000100" at address 0x9000002c.

Now let's start program B, which also requires lib1 (and thus lib2).
Program B is loaded at address 0x90000000, and thus lib1 at 0xA0000000 and lib2 at 0xB0000000.
The problem is: as lib1 shares the same physical memory in the virtual address spaces of A and B, its offset 0x2c cannot point to both 0xA0000100 and 0xB0000100 at the same time.
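The conflict described above can be reproduced with a small model. This is only an illustrative sketch in Python, not loader code: the dictionary standing in for lib1's single physical page and the `relocate` helper are hypothetical names, and the addresses are the ones used in this post.

```python
# Hypothetical model: one physical copy of lib1's code, shared by both processes.
LIB2_FUNC_OFFSET = 0x100   # offset of the wanted function inside lib2
CALL_SITE_OFFSET = 0x2C    # offset inside lib1 that holds the absolute address

# Where lib2 is mapped in each process (addresses taken from the post).
lib2_base = {"A": 0xA0000000, "B": 0xB0000000}

# The single shared page of lib1: offset -> 32-bit value stored there.
shared_lib1_page = {}


def relocate(process):
    """Patch the call site with the absolute address valid for `process`."""
    shared_lib1_page[CALL_SITE_OFFSET] = lib2_base[process] + LIB2_FUNC_OFFSET


relocate("A")
value_seen_by_a_before = shared_lib1_page[CALL_SITE_OFFSET]

relocate("B")  # overwrites the one shared slot
value_seen_by_a_after = shared_lib1_page[CALL_SITE_OFFSET]

print(hex(value_seen_by_a_before))
print(hex(value_seen_by_a_after))
```

After the second call, process A sees the address that is only valid in process B, which is exactly the problem: a single shared slot cannot hold two different absolute addresses.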

I'm not sure I was clear, but this problem must be quite simple and common, even though I didn't find any solution either in the wiki or in this forum.

Of course you could answer "you just have to load program B at 0x80000000" or "load lib1 at 0xB0000000 and lib2 at 0xA0000000", but that may not always be possible when there are multiple dependencies.
I have also been reading the PE specification, but it seems to work the same way as ELF.

Solar · Post by **Solar** » Thu Jul 31, 2008 6:24 am

In a virtual memory system, executables are always loaded to the same (virtual) address (e.g. 0x00000000). Shared libraries are loaded to a different (virtual, read-only) address window (e.g. 0xc0000000 upwards), having the same address in all (virtual) address spaces.

Thus, program B cannot be loaded at (virtual) 0x90000000, because that is already occupied by lib1. It would be loaded at 0x80000000, exactly like program A.

(Disclaimer: Never got around to library support myself, so one of the more experienced guys here might tell you I'm an idiot and completely wrong. ;-) )

jal · Post by **jal** » Thu Jul 31, 2008 7:16 am

Solar wrote:(Disclaimer: Never got around to library support myself, so one of the more experienced guys here might tell you I'm an idiot and completely wrong. ;-) )

I remember this being a big problem with loading DLLs in Windows. IIRC the (virtual) start address is in the header, and two DLLs can have the same (or overlapping) addresses. The only way around it is purely position-independent code (so no direct address references, ever, but always relative to a pointer). I'm not sure whether many systems support this, though; at least on x86 you'd be short of registers if you had to reserve one for this (BP is already taken up by stack frame referencing, for example).

JAL

Tomako · Post by **Tomako** » Thu Jul 31, 2008 8:21 am

Thank you for your answer.
I reread some articles, and knowing that a library always has the same virtual address in different processes was helpful.

I have been reading this: http://msdn.microsoft.com/en-us/magazine/cc301808.aspx
If I am right, this is how it works on Windows:
- every .exe or .dll has a preferred load address (determined during compilation)
- relocation can be done directly in the file by the Windows installer, which uses the preferred load addresses to do it
- if two different DLLs are to be loaded at the same location, the loader goes through the executable and modifies the relocated addresses
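The last step (fixing up addresses when a module cannot get its preferred address) boils down to adding one delta to every recorded absolute address. A minimal sketch in Python, assuming a flat list of fixup offsets; the real PE base relocation table is laid out differently, and `rebase`, `PREFERRED` and the toy image are made-up names for illustration:

```python
import struct


def rebase(image, preferred_base, actual_base, fixup_offsets):
    """Return a copy of `image` with each 32-bit absolute address adjusted."""
    delta = actual_base - preferred_base
    patched = bytearray(image)
    for off in fixup_offsets:
        (addr,) = struct.unpack_from("<I", patched, off)
        struct.pack_into("<I", patched, off, (addr + delta) & 0xFFFFFFFF)
    return bytes(patched)


# Toy image: 8 bytes, with one absolute address stored at offset 4.
PREFERRED = 0x10000000
image = struct.pack("<II", 0x90909090, PREFERRED + 0x2000)

# Loaded at the preferred address: delta is 0, nothing changes.
same = rebase(image, PREFERRED, PREFERRED, [4])

# Loaded elsewhere: the stored address moves by the same delta as the module.
moved = rebase(image, PREFERRED, 0x20000000, [4])
```

When the delta is zero the image is untouched, so its pages can stay shared with every other process that also got the preferred address.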

I have just one more question: do .data and .bss also share the same physical pages, just like .text?

Jeko · Post by **Jeko** » Thu Jul 31, 2008 1:00 pm

Tomako wrote: Thank you for your answer.
I reread some articles, and knowing that a library always has the same virtual address in different processes was helpful.

I have been reading this: http://msdn.microsoft.com/en-us/magazine/cc301808.aspx
If I am right, this is how it works on Windows:
- every .exe or .dll has a preferred load address (determined during compilation)
- relocation can be done directly in the file by the Windows installer, which uses the preferred load addresses to do it
- if two different DLLs are to be loaded at the same location, the loader goes through the executable and modifies the relocated addresses

I have just one more question: do .data and .bss also share the same physical pages, just like .text?

Read http://en.wikipedia.org/wiki/Portable_Executable

I'm not English, but this is really easy to understand (in fact I translated it for the Italian Wikipedia: http://it.wikipedia.org/wiki/Portable_Executable)

kscguru · Post by **kscguru** » Wed Aug 06, 2008 11:39 pm

Tomako wrote: (for example it took me a few days even to understand that sections and segments were not the same thing)

Don't feel bad. I am a native English speaker, spend a LOT of time (professionally) with compilers and linkers, and I didn't realize there was a difference between segments and sections until I started writing my own linker scripts. It's a very subtle point.

But to answer the question... there are three approaches to applying relocations to shared libraries in general use.

1) The classic Unix way. The shared library is mapped into each process, but set up as copy-on-write. So when the runtime loader goes through to apply relocations, the kernel makes a copy of that page for the specific process. This wastes quite a bit of memory, but is very simple to implement.

2) The Windows way. Every shared library is linked with a predefined load address. If the shared library is loaded into a process and the predefined load address is empty, that process shares the original copy. If the shared library has an address conflict, fall back to the above copy-on-write approach and apply relocations. This works well for Windows because almost all processes end up loading the same dozen system libraries (which Microsoft ensures do not overlap), so Windows recoups much of the cost of the above approach.

3) The modern Unix way, PIC = Position-Independent Code. The library is compiled so that all references are relative to the base of the shared library instead of absolute, which means the code itself needs no relocations at all! (Any absolute addresses live in a small per-process data table, the global offset table.) But the cost of doing this is that all code in the shared library needs to keep the base address of the shared library in a register, and on 32-bit x86 this takes most code from 6 registers to 5, so it incurs a real cost. (And the library has to be compiled this way.) The advantage is that all users of the library can share a single copy of the code.
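To make the third approach concrete, here is a rough Python model of the idea, not of any real instruction encoding: the shared code names its target only by a slot number, and each process owns a small writable table (playing the role of a global offset table) that the loader fills in. `SHARED_LIB1_CODE`, `Process` and `resolve_call` are hypothetical names; the addresses are the ones from the original question.

```python
# Shared, read-only "code": refers to the target by slot index, never by address.
SHARED_LIB1_CODE = (("call_via_table", 0),)  # hypothetical instruction encoding


class Process:
    def __init__(self, lib2_base):
        # Per-process, writable table filled in by the loader (GOT-like).
        self.table = [lib2_base + 0x100]

    def resolve_call(self, code):
        """Look up the absolute target of the first instruction in `code`."""
        op, slot = code[0]
        assert op == "call_via_table"
        return self.table[slot]


# Same code object in both processes, different private tables.
proc_a = Process(lib2_base=0xA0000000)
proc_b = Process(lib2_base=0xB0000000)

print(hex(proc_a.resolve_call(SHARED_LIB1_CODE)))
print(hex(proc_b.resolve_call(SHARED_LIB1_CODE)))
```

The code tuple is never modified, so it could be backed by the same physical pages in every process; only the small table differs per process.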

Your last question is about some of the sections:
.text (executable code) - may or may not be shared, see above
.rodata - might as well be shared
.data, .bss - separate copy for each process, no sharing at all. Probably set up as copy-on-write.
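Copy-on-write, as mentioned for .data and .bss (and for approach 1 above), can be pictured with a toy model. This is a simplified Python sketch, not kernel code: `CowMapping` is a hypothetical class, and a real kernel does this per page in the page-fault handler rather than in explicit `read`/`write` calls.

```python
PAGE_SIZE = 4096


class CowMapping:
    """Toy per-process view of a shared mapping with copy-on-write pages."""

    def __init__(self, backing_pages):
        self.backing = backing_pages  # shared, never modified
        self.private = {}             # page index -> private copy

    def read(self, page, offset):
        src = self.private.get(page, self.backing[page])
        return src[offset]

    def write(self, page, offset, value):
        if page not in self.private:
            # First write to this page: take a private copy.
            self.private[page] = bytearray(self.backing[page])
        self.private[page][offset] = value


backing = [bytes(PAGE_SIZE), bytes(PAGE_SIZE)]
a = CowMapping(backing)
b = CowMapping(backing)

a.write(1, 0, 0x7F)  # process A modifies page 1 only
```

Process B still reads the original contents of page 1, and page 0 remains shared by both, which is why untouched pages cost no extra memory.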

Solar · Post by **Solar** » Thu Aug 07, 2008 12:31 am

kscguru wrote:But the cost of doing this is that all code in the shared library needs to keep the base address of the shared library in a register, and on 32-bit x86, this takes most code from 6 registers to 5, so incurs real cost.

Did anybody ever run a benchmark on PIC vs. normal code, or seen such benchmark results?

The "public face" of an x86 still has only half a dozen "general purpose" registers (ha!) like the 386 had. But internally those CPUs have become a completely different beast. I wouldn't at all be surprised if the microcode optimizations done in today's CPUs do away with the performance penalty of PIC code completely. Hence my interest in seeing benchmarks.

jal · Post by **jal** » Thu Aug 07, 2008 3:39 am

Solar wrote: The "public face" of an x86 still has only half a dozen "general purpose" registers (ha!) like the 386 had. But internally those CPUs have become a completely different beast. I wouldn't at all be surprised if the microcode optimizations done in today's CPUs do away with the performance penalty of PIC code completely. Hence my interest in seeing benchmarks.

The problem still remains you lose a GPR, and since almost any GPR is used either in some specific x86 instruction (eax, ecx, edx, esi, edi) or has traditional other uses (ebp), it needs regular pushing and popping to keep intact. ebx is the most likely candidate, but not having it at all may become cumbersome...

JAL

Solar · Post by **Solar** » Thu Aug 07, 2008 6:10 am

jal wrote:The problem still remains you lose a GPR, and since almost any GPR is used either in some specific x86 instruction (eax, ecx, edx, esi, edi) or has traditional other uses (ebp), it needs regular pushing and popping to keep intact.

The architecture of the x86 CPU line is no longer equivalent to what its CISC instruction set might tell you. That's called superscalar architecture; the Pentium was the first of that breed. The Cyrix 6x86 brought register renaming, the PII micro-op translation. I'd be interested in performance numbers, really.
