C++, ld, and Solaris weirdness

Colonel Kernel
Member
Posts: 1437
Joined: Tue Oct 17, 2006 6:06 pm
Location: Vancouver, BC, Canada

C++, ld, and Solaris weirdness

Post by Colonel Kernel »

I'm wondering if anyone can help me with a linker-related problem I'm having on Solaris 8...

There's a really old shared library being built with an equally old makefile... Once upon a time it only contained C code, but somewhat recently some C++ stuff has been added to it, including a few static objects.

Through a painful process of diagnosis, I have figured out that the ld switch '-B symbolic' is for some reason preventing the constructors of static C++ objects in the .so from being called. I have no idea why. The problem is, if I remove -B symbolic, some other things break somewhat mysteriously. One of my colleagues hypothesizes that a global variable declared in a static lib that is linked into the .so is conflicting with (being merged with...?) the same variable in another copy of the static lib that is linked into the executable that loads the .so. If so, -B symbolic may have been added to the makefile years ago specifically to address this problem.

If that was confusing, here is some ASCII art:

mylib.a defines global variable 'foo'.

a.out -> mylib.a -> foo <-------------------+
 |                                          |
 +----> mysharedlib.so -> mylib.a -> foo ---+

I don't understand enough details to verify his hypothesis... I've read the ld man pages ad nauseam... Can anyone give me a simpler explanation (or a link to one... no pun intended)? :)
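For anyone trying to reproduce this, here is a minimal sketch of a probe that could be compiled into the .so to check whether its static constructors ran. The names `CtorProbe` and `mysharedlib_ctors_ran` are hypothetical and not part of the real library. The idea is that a plain zero-initialized flag needs no constructor, so it stays 0 if the constructors are skipped.

```cpp
// Plain flag with static (zero) initialization: it is valid even if the
// C++ static constructors never run. 'volatile' keeps an optimizer from
// folding the constructor's store into a compile-time initializer.
static volatile int g_ctor_ran = 0;

// Hypothetical probe object; its constructor is the thing under test.
struct CtorProbe {
    CtorProbe() { g_ctor_ran = 1; }
};

// Static object whose constructor should run when the library is loaded.
static CtorProbe g_probe;

// C-linkage accessor so a caller can look it up by an unmangled name.
extern "C" int mysharedlib_ctors_ran() {
    return g_ctor_ran;
}
```

If the accessor returns 0 after the library has been loaded, the constructor for `g_probe` was never called.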
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
AR

Re:C++, ld, and Solaris weirdness

Post by AR »

I'm not sure how to fix it, but I have been studying the ELF ABI and Dynamic Linking recently.

The symbolic flag shouldn't stop the static constructors from working. The constructors are meant to be called from .init and the destructors from .fini (try objdump-ing the library to make sure they exist). -Bsymbolic causes references to global symbols from within the library to be bound to the definitions inside the library before the executable and other libraries are searched. (The default is to bind to the function/variable in the executable first, then search the libraries, including the library itself.)

You'll probably need to make sure there aren't two classes with the same name, since the dynamic linker may bind against the wrong one. Something to note about ELF is that it has a single "executable namespace", as opposed to Windows PE, which has an individual namespace for each library. This basically means that any variable/class/function with the same name as something else in any library, static or not, linked to the executable will conflict.
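One way to keep a name out of that shared namespace, sketched here under the assumption that the variable is only used from a single translation unit (which may not hold for the real mylib.a), is to give it internal linkage. The accessor name `mylib_bump_foo` is hypothetical.

```cpp
// Internal linkage: 'foo' is not exported as a global symbol, so the
// run-time linker has nothing to bind to another library's 'foo'.
namespace {
int foo = 0;
}

// Only this accessor is exported; it always touches this file's 'foo'.
extern "C" int mylib_bump_foo() {
    return ++foo;
}
```

Each copy of the static lib then keeps its own private `foo`, whether or not -B symbolic is used.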

Re:C++, ld, and Solaris weirdness

Post by Colonel Kernel »

Thanks for the info.

Do you know whether or not Solaris even uses ELF? I haven't been able to figure that out. In case it's relevant -- the problem is being observed on a SPARC machine... I haven't tried to repro it on Solaris x86 yet.

Also, do you know of a Sun equivalent for objdump? We're using Forte, not GCC.

One more thing... What you've described about the dynamic linker and "executable namespaces" makes sense, but does it also apply if the .so is being loaded via dlopen()? In the example I showed above, assume that a.out is loading mysharedlib.so using dlopen(). Does this problem with conflicting symbols still apply in this case? You can assume that a.out will not be trying to access foo via dlsym()...

Re:C++, ld, and Solaris weirdness

Post by Colonel Kernel »

Ok, Solaris does use ELF... the relevant command is elfdump.

My question about dlopen() still stands though...

Re:C++, ld, and Solaris weirdness

Post by Colonel Kernel »

Ok, I did an experiment, and dlopen() behaves the same way as "implicit" run-time linking. Bleah! Not the behaviour I would have expected...
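The post does not show the experiment itself. A rough sketch of how such a check could look is below, assuming the library exports a hypothetical helper `mysharedlib_foo_address()` that returns the address of the `foo` its own code refers to. The function name `library_shares_foo` and the helper are illustrative only.

```cpp
#include <dlfcn.h>
#include <cstdio>

// The executable's own copy of the global from mylib.a.
int foo = 0;

// Returns 1 if the library's references to 'foo' resolve to the
// executable's copy, 0 if the library uses its own copy, and -1 if the
// library or the helper could not be loaded.
int library_shares_foo(const char* path) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        std::fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }
    // Hypothetical helper assumed to be exported by the library:
    //   extern "C" void* mysharedlib_foo_address() { return &foo; }
    void* sym = dlsym(handle, "mysharedlib_foo_address");
    if (!sym) {
        std::fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }
    typedef void* (*AddressFn)();
    AddressFn get_address = reinterpret_cast<AddressFn>(sym);
    int shared = (get_address() == static_cast<void*>(&foo)) ? 1 : 0;
    dlclose(handle);
    return shared;
}
```

A result of 1 would match the behaviour described above, where the dlopen()ed library ends up using the executable's `foo`. Whether the executable's `foo` is visible to the run-time linker at all depends on how a.out was linked (with a GNU toolchain, for example, this typically needs -rdynamic), and some systems need -ldl to link this.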