to relocate or not to relocate ?

Pype.Clicker · Post by **Pype.Clicker** » Sun Aug 31, 2003 1:25 pm

Hi dudes ... i'm a few lines of code away from starting the development of user-level support (i mean, real support for programs), and i'd like to probe the overall mood first ...

Here's the problem: you have user programs and shared libraries that will be loaded at run-time (or load-time, it doesn't really matter here). Think of them as DLLs or .so files ... To be useful, most such shared pieces of code need to access 'static data' (even if that data is just strings, for instance), but as you know, there's nothing like eip-relative data addressing on Intel (IA-32) ... so you can't really make your code independent of the place you'll load it at.

And here are the options:

a) gcc-like position-independent code: this is what gcc emits for ELF targets when you pass -fPIC. It generates PIC by reserving a general-purpose register (ebx) to hold a "pointer to the current object file's static data", which is set up by special auto-generated stub functions:
```
   export myfunction
myfunction:
   ; the bracketed address below is the only thing the loader has to patch
   mov ebx, [<address of .o's static data will be written here at load time>]
   call _internal_myfunction
```
btw, reserving a register means that you'll need to swap data between registers and main memory more often, which reduces the overall code efficiency. But as only the page that stores the "trampolines" needs to be relocated (for the EBX setup), you can more efficiently reuse physical pages for the different instances of your shared library across the system.
b) just relocated code: when the dynamic library is loaded at offset X (while it was compiled for offset Y), every offset to a variable is adjusted (the loader adds X-Y). The relocation is more efficient if performed lazily, i.e. when a code page faults, you load it from the file *and then* you relocate it, which saves you from loading the whole code and then having to swap out the relocated copy.
However, if a program A loads L at offset Xa and a program B loads L at offset Xb, with Xa!=Xb, you can't reuse pages (so the "sharing" is nothing but a myth).
c) shared relocations: each library would come with a 'preferred load address', which is the address it has been compiled for. The loader would need additional intelligence to create a mapping of libraries that maximizes reusability (for instance, we see that process A has loaded L@X and M@Y, so we'll load M at offset Y too, even if this means leaving a hole of sizeof(L) in front of it, so that we can share the .text pages of A's instance of M).

Which one do you think is best? Does anyone know whether Windows has some dark magic to implement relocation in an efficient way? (i think Win32 DLLs are relocation-based because of what i read about the COFF file format, but i may be wrong)

Slasher · Post by **Slasher** » Sun Aug 31, 2003 4:17 pm

Why not pre-process the file (i.e. the .o COFF or ELF file) and save it in your own format, with a small header of offsets that point to the parts of the file that need the load base address added to them?
i.e. if you load the file at x, then all values pointed to by the entries in the header need to have x added to them. I like this because it's simple and easy to implement. All we would have to do is learn how to parse COFF, ELF or any other format to identify the relocatable items, whose offsets we then store in the header.
Any comments?
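
A minimal sketch of the header-of-offsets idea, assuming a made-up in-memory format: `struct module_header` and `relocate_image` are hypothetical names, not taken from any existing loader, and every relocatable item is assumed to be a 32-bit absolute address compiled for base 0.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical preprocessed module header: it lists the byte offset of
 * every 32-bit word in the image that holds an address and therefore
 * needs the load base added to it. */
struct module_header {
    uint32_t        reloc_count;  /* number of entries in relocs[] */
    const uint32_t *relocs;       /* byte offsets into the image */
};

/* Add `base` to every word named in the header.
 * Returns 0 on success, -1 if an offset falls outside the image. */
int relocate_image(uint8_t *image, size_t image_size,
                   const struct module_header *hdr, uint32_t base)
{
    for (uint32_t i = 0; i < hdr->reloc_count; i++) {
        uint32_t off = hdr->relocs[i];
        uint32_t word;

        if (off > image_size || image_size - off < sizeof word)
            return -1;

        /* memcpy because the patched word may not be 4-byte aligned */
        memcpy(&word, image + off, sizeof word);
        word += base;
        memcpy(image + off, &word, sizeof word);
    }
    return 0;
}
```

A loader would call `relocate_image(image, size, &hdr, x)` once after copying the file to address x. Words not listed in the header are left untouched, which is why the preprocessing step has to find every relocatable item.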

Pype.Clicker · Post by **Pype.Clicker** » Mon Sep 01, 2003 1:08 am

that's already what i do with kernel modules and indeed it works pretty efficiently ... however, it has a major drawback (that i think i explained in my above post) when you consider loading something like GuiSupport.dynLib ... as you cannot assume that all the processes will have the same memory organization (i.e. process A could require GuiSupport, ThreadSupport, NetworkSupport while process B could require Network, Gui, Compress and CompressedGraphicFormats ...), you may have GuiSupport present N times in memory on distinct physical frames because N processes couldn't agree on the location it should go ...

sounds clearer ?

Solar · Post by **Solar** » Mon Sep 01, 2003 2:53 am

Hmm... I like neither of the options; if forced, I'd vote for the gcc type PIC. However, no-one is forcing us, right?

I'd spend some more thinking on option b) - would it be possible to tweak the memory management in a way that a library is always mapped to the same address, regardless of the process?

Hm. Not really nice, since you'd have to reserve quite some part of the address space for libraries, then...

Interesting problem here, Pype. Are those three options really exhaustive? No other, more "perfect" way of doing it?

Pype.Clicker · Post by **Pype.Clicker** » Mon Sep 01, 2003 4:00 am

Well, of course i'm not sure whether they're exhaustive. I've been thinking about using x86 segments to help (for instance, if each module had a dedicated data segment, you'd need no relocation, but this breaks the FLAT model assumed by GCC) -- and hoping that the cost of reloading the segment descriptor would be compensated by having one more general-purpose register available to the running code ...

i may also consider things like a 'library group' (for instance, two KDE applications are likely to use the same libraries, while a KDE and a gnome application will have significantly different "memory map signatures" ...)

This map signature, of course, is only usable when you can tell at compile time (or at load time) which dynamic libraries will be needed, and cannot be applied to things like plug-ins or Dynamic Loading of Objects on Demand (a major upcoming concept for Clicker, allowing a service provider X to "push" the class for local management and wrapping of a service or a resource).

This problem gets me so confused that i think it's time to poll opinions, to filter out my stupid ideas and keep just the OneGoodThing (tm) before i start the implementation.

Solar · Post by **Solar** » Mon Sep 01, 2003 5:31 am

Pype.Clicker wrote: ...but this breaks the FLAT model assumed by GCC

That's something I wanted to look into myself in the near future... so you're saying gcc cannot support a segmented memory model? Do you have any sources on gcc's ability / disability in this regard?

Tim · Post by **Tim** » Mon Sep 01, 2003 5:51 am

Solar wrote: That's something I wanted to look into myself in the near future... so you're saying gcc cannot support a segmented memory model? Do you have any sources on gcc's ability / disability in this regard?

It's just the way gcc, as and ld were designed. There's a section in one of the manuals which states that gcc/as/ld can only ever be ported to machines with 32-bit pointers and a flat memory model.

The biggest problem is that, without far pointers, C on x86 requires that SS address the same memory as DS.

Consider this example:

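```
#include <string.h>

int main(void)
{
    static char variable_in_data_segment[20];  /* static storage, reached through DS */
    char variable_on_stack[20];                /* automatic storage, reached through SS */
    strcpy(variable_in_data_segment, "Hello");
    strcpy(variable_on_stack, "World");
    return 0;
}
```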

The strcpy function needs to work the same in both cases. With a flat memory model, this is easy, since the program's .data section and the thread's stack are just two addresses within the big flat address space. But in a segmented model, where things on the stack must be accessed through SS and things in the data segment must be accessed through DS, it breaks down. You either need two versions of strcpy, one which goes through SS and one which goes through DS, or you need to pass the segment as well as the offset.

The general solution is to make all pointers far. But this means that applications themselves need to be aware of the segment part of each pointer, which reduces the number of pointer-fiddling tricks you can do. And it also introduces either a segment override, or a segment register load, for every pointer access. Needless to say, too, that all pointers are 16 bits bigger -- bigger than any integer type in C89, which breaks most Unix-style programs which assume 32-bit pointers and a flat address space (which is where we came in). gcc/as/ld avoid the problem by ignoring it*.
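
To make the "all pointers far" idea concrete, here is a small simulation in portable C. It is only a sketch under stated assumptions: the two "segments" are plain arrays standing in for what DS and SS would address, and `struct far_ptr`, `seg_mem` and `far_strcpy` are hypothetical names, not anything gcc or Open Watcom provides.

```c
#include <stdint.h>

#define SEG_SIZE 64   /* arbitrary segment limit for the simulation */

/* Two disjoint memories, standing in for the data and stack segments. */
enum seg { SEG_DS = 0, SEG_SS = 1, SEG_COUNT };
static char seg_mem[SEG_COUNT][SEG_SIZE];

/* A far pointer carries the segment alongside the offset:
 * 16 + 32 = 48 significant bits instead of 32. */
struct far_ptr {
    uint16_t seg;
    uint32_t off;
};

/* A single strcpy that works whichever segment each operand lives in,
 * because every access names its segment explicitly.
 * Returns 0 on success, -1 on a bad segment or a segment-limit overrun. */
int far_strcpy(struct far_ptr dst, struct far_ptr src)
{
    if (dst.seg >= SEG_COUNT || src.seg >= SEG_COUNT)
        return -1;

    for (;;) {
        if (dst.off >= SEG_SIZE || src.off >= SEG_SIZE)
            return -1;                    /* would run past the limit */
        char c = seg_mem[src.seg][src.off++];
        seg_mem[dst.seg][dst.off++] = c;
        if (c == '\0')
            return 0;
    }
}
```

The segment lookup on every byte is the software analogue of the segment override or segment register load mentioned above, and the caller can no longer treat a pointer as a single 32-bit integer.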

However, if you do want to experiment with segmented memory models with 32-bit C code, you could use the Open Watcom compiler, which is the only compiler I've ever seen which lets you use far pointers in 32-bit protected mode.

--
* The underlying Unix philosophy: It's better to be small and broken than complex and correct.

Pype.Clicker · Post by **Pype.Clicker** » Mon Sep 01, 2003 5:52 am

well, this comes from reverse-engineering (studying the generated code) and from looking through the documentation (the info pages, mainly): there's nothing like a "far pointer", which means a reference must be a single offset, unless you do nothing with it but pass it to an assembly-written function that knows how to change a segment.

GCC never emits segment overrides ...

Maybe it would be easier to work around this with C++ operator overloading and templates (something like a far<...>), but i'm not sure of that at all ...

Slasher · Post by **Slasher** » Mon Sep 01, 2003 6:17 am

Can't we work with OS-maintained tables? If an app loads a dynamic lib, the OS first checks the table (which works like a compiler's symbol table) for the library's name and loaded-address pair. If none is found, the OS records the lib's name and loaded address in the table for future reference. So the next app that requests the library will get the address where it was loaded from the OS. This eliminates multiple copies of the lib in memory.
Of course, there will be problems: the new app might have something else mapped there. In that case, we could go with pointers to the libs: the apps get the pointers, which means that the OS could update the address they point to while the apps just keep using the same pointers. Comments?
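
The table lookup could be sketched roughly as follows. This is an illustration only: `lib_table` and `lib_lookup_or_register` are hypothetical names, the table is a fixed-size array with no locking, and the conflict case (something else already mapped at the recorded address) is deliberately left to the caller.

```c
#include <stdint.h>
#include <string.h>

#define MAX_LIBS 32

/* Hypothetical system-wide registry: one entry per library that is
 * already mapped somewhere, with the address it was relocated for. */
struct lib_entry {
    char     name[32];
    uint32_t base;
};

static struct lib_entry lib_table[MAX_LIBS];
static int lib_count;

/* Return the address a process should try first for `name`: the recorded
 * address if the library is already loaded somewhere (so its relocated
 * pages can be shared), otherwise `wanted`, which is then recorded.
 * Returns 0 if the table is full or the name is too long. */
uint32_t lib_lookup_or_register(const char *name, uint32_t wanted)
{
    for (int i = 0; i < lib_count; i++)
        if (strcmp(lib_table[i].name, name) == 0)
            return lib_table[i].base;

    if (lib_count == MAX_LIBS || strlen(name) >= sizeof lib_table[0].name)
        return 0;

    strcpy(lib_table[lib_count].name, name);
    lib_table[lib_count].base = wanted;
    lib_count++;
    return wanted;
}
```

The first process to ask for a library fixes its address for everyone after it. A loader using this would still have to check that the returned range is free in the new address space, and fall back to a private relocated copy when it is not.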

Solar · Post by **Solar** » Mon Sep 01, 2003 6:26 am

Pfff... there go some of my nicer ideas regarding stack safety, shared memory, and shared libraries...

Slasher · Post by **Slasher** » Mon Sep 01, 2003 6:40 am

I don't understand what you mean, Solar. Is it a good idea or a bad one? And please state why.

Pype.Clicker · Post by **Pype.Clicker** » Mon Sep 01, 2003 6:51 am

@code slasher: i think solar's response is aimed at my previous post on GCC's inability to handle segmented memory models.

@solar: there have gone some of mine too, but i don't lose confidence. If required, i'll work on a C/C++/java derivative that would be configurable enough to work in segmented environments. I believe that in object-oriented languages, it can be done in a much nicer way than in procedural programming (while it actually was a pain to deal with :-p )

tom1000000 · Post by **tom1000000** » Tue Sep 02, 2003 5:52 am

Hi,

There's no eip relative addressing on x86????

Yes there's no direct way to do it.

Couldn't you do:

```
   call nextInstruction
nextInstruction:
   pop ebx ; Address of nextInstruction is in ebx
```

Pype.Clicker · Post by **Pype.Clicker** » Tue Sep 02, 2003 6:48 am

indeed, tom. And that's how gcc produces PIC code (except that it keeps the result in %ebx as its pointer to the global offset table (iirc) rather than re-computing it at each instruction).

Needless to say, it is far slower to re-compute it every time, and anyway this is something the *compiler* does, and thus cannot really be changed ...

```
  call .next ;; will have a memory reference to access the stack
.next:
  pop ebx  ;; also access memory (stack)
  mov [ebx+<position-relative offset>],value ;; can't be paired with POP as it requires the value of EBX to be computed :(
```

a "relative addressing mode" would be something like

```
   mov [eip+<position-relative offset>],value
```

Pype.Clicker · Post by **Pype.Clicker** » Tue Sep 02, 2003 7:05 am

Code Slasher wrote: In this case, we could go with pointers to the libs, so the apps get the pointers, which means that the os could update the address pointed to by the pointers while the apps just use the same pointers. comments?

This sounds dangerous as the OS would have no way to prevent the application from copying the "lib pointer" somewhere (think about the lib returning a pointer to an internal static buffer -- this is crappy but done very often :-/ )

OSDev.org
