C++ exception from shared object causes abort()

max
Member
Member
Posts: 616
Joined: Mon Mar 05, 2012 11:23 am
Libera.chat IRC: maxdev
Location: Germany

Post by max »

Throwing and catching an exception within the executable works fine. But when my shared library throws an exception (a simple "throw 25" in the shared library and "catch(int e)" in my executable), the abort() function gets called.

I suspect this has something to do with the GLOB_DAT/COPY relocations of _ZTIi, the mangled name of the typeinfo object for int. So I checked my objects. My executable has the following entries:

Code: Select all

.rel.dyn
0804edf8  00000b05 R_386_COPY        0804edf8   _ZTIi

.dynsym
    11: 0804edf8     8 OBJECT  WEAK   DEFAULT   18 _ZTIi

.symtab
   231: 0804edf8     8 OBJECT  WEAK   DEFAULT   18 _ZTIi
My shared library is loaded to 0x0808E000 and contains this:

Code: Select all

.rel.dyn
00016d68  0000b901 R_386_32          00016d7c   _ZTIi    => after load 0x080A4D68
00016d78  0000b901 R_386_32          00016d7c   _ZTIi    => after load 0x080A4D78
0001d58c  0000b906 R_386_GLOB_DAT    00016d7c   _ZTIi    => after load 0x080AB58C

.dynsym
   185: 00016d7c     8 OBJECT  WEAK   DEFAULT   10 _ZTIi => after load 0x080A4D7C

.symtab
   501: 00016d7c     8 OBJECT  WEAK   DEFAULT   10 _ZTIi => after load 0x080A4D7C
When relocating those, I do the following:

1. Apply the R_386_32 relocations in the shared object by writing the value 0x0804EDF8 (the address of _ZTIi in the executable) to the addresses 0x080A4D68 and 0x080A4D78
2. Apply the R_386_GLOB_DAT relocation in the shared object by writing the value 0x0804EDF8 (the address of _ZTIi in the executable) to the address 0x080AB58C (the load address plus the offset from the GLOB_DAT entry)
3. Apply the R_386_COPY relocation in the executable by copying 8 bytes from 0x080A4D7C (the address of _ZTIi in the shared library) to 0x0804EDF8 (the address of _ZTIi in the executable)
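The three steps above can be sketched roughly as follows. This is only an illustration, not the loader's actual code: the helper names `applyWordReloc` and `applyCopyReloc` and the `kR386_*` constants are made up for this example (the numeric values are the i386 psABI relocation type numbers). The sketch treats R_386_32 as S + A with the addend read from the relocated word, which gives the same result as a plain write whenever that word is zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Relocation type numbers from the i386 psABI (constant names are hypothetical).
enum RelType : std::uint32_t {
    kR386_32 = 1,
    kR386_COPY = 5,
    kR386_GLOB_DAT = 6
};

// Hypothetical helper: patches one 32-bit slot.
// `where` points at the relocated word, `symbolValue` is the resolved address S.
inline void applyWordReloc(RelType type, std::uint32_t* where, std::uint32_t symbolValue) {
    assert(type == kR386_32 || type == kR386_GLOB_DAT);
    if (type == kR386_32) {
        // S + A: i386 uses REL entries, so the addend A is the word already at `where`.
        *where += symbolValue;
    } else {
        // S: the slot is simply overwritten.
        *where = symbolValue;
    }
}

// Hypothetical helper for R_386_COPY: copies the symbol's initial contents
// (st_size bytes) from the library's definition into the executable's copy.
inline void applyCopyReloc(void* destInExecutable, const void* srcInLibrary, std::size_t size) {
    std::memcpy(destInExecutable, srcInLibrary, size);
}
```

With the numbers from this post, the two word relocations would receive 0x0804EDF8 as `symbolValue`, and the copy would move 8 bytes from the library's _ZTIi to the executable's.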

Is this way of performing the GLOB_DAT/COPY relocations correct? If it is, what else could cause the runtime to call abort() when throwing the exception?

Thanks in advance!!
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: C++ exception from shared object causes abort()

Post by Korona »

Do you support dl_iterate_phdr? IIRC that's how stack unwinding finds the correct DSO (but it might make sense to check libgcc how the exact mechanism works).
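For reference, here is a small sketch of the kind of query an unwinder could make through dl_iterate_phdr: given an address, find the loaded object whose PT_LOAD segment contains it. It assumes a libc that provides `<link.h>` and `dl_iterate_phdr` (glibc and musl do); `PcQuery`, `matchObject` and `findContainingObject` are names invented for this example, and libgcc's real lookup does more than this (it goes on to locate the unwind tables of the matching object).

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // dl_iterate_phdr is an extension, not ISO C/C++
#endif
#include <link.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical query record passed through the callback's data pointer.
struct PcQuery {
    std::uintptr_t address;   // address we want to attribute to an object
    bool found;               // set when some PT_LOAD segment contains it
    std::uintptr_t loadBase;  // dlpi_addr of the containing object
};

// Called once per loaded object; returning non-zero stops the iteration.
static int matchObject(dl_phdr_info* info, std::size_t, void* data) {
    PcQuery* query = static_cast<PcQuery*>(data);
    for (unsigned i = 0; i < info->dlpi_phnum; ++i) {
        const ElfW(Phdr)& ph = info->dlpi_phdr[i];
        if (ph.p_type != PT_LOAD) continue;
        std::uintptr_t start = info->dlpi_addr + ph.p_vaddr;
        if (query->address >= start && query->address - start < ph.p_memsz) {
            query->found = true;
            query->loadBase = info->dlpi_addr;
            return 1;
        }
    }
    return 0;
}

// Returns true and stores the load base if some object contains `address`.
bool findContainingObject(const void* address, std::uintptr_t* loadBase) {
    assert(loadBase != nullptr);
    PcQuery query = { reinterpret_cast<std::uintptr_t>(address), false, 0 };
    dl_iterate_phdr(matchObject, &query);
    if (query.found) *loadBase = query.loadBase;
    return query.found;
}
```

If a custom libc does not report every loaded object here, a lookup like this fails for addresses inside the missing object, which is one way a throw could end in abort().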
nullplan
Member
Member
Posts: 1796
Joined: Wed Aug 30, 2017 8:24 am

Re: C++ exception from shared object causes abort()

Post by nullplan »

So, I looked up how musl is doing it, and it is a bit confusing. But ultimately you are handling the relocations correctly. For some reason, when looking up the symbol for a REL_COPY relocation, you do not look in the executable. Therefore the target of the relocation is not found there but in the library. Otherwise the symbol is found in the executable. The difference between R_386_32 and R_386_GLOB_DAT is, the former has an addend value (whatever is written at the relocation address), whereas the latter does not. No idea why.

So, this is apparently not the problem you have. What else could it be?
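A toy model of the lookup rule described above, assuming the objects are searched in load order with the executable first. `LoadedObject` and `resolveSymbol` are hypothetical names for this sketch and are not taken from musl.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a loaded ELF object: name plus its defined symbols.
struct LoadedObject {
    std::string name;
    std::map<std::string, std::uint32_t> definitions;  // symbol -> address after load
};

// objects[0] is assumed to be the executable, followed by its libraries.
// For a COPY relocation the executable itself is skipped, so the definition
// found is the library's; for every other relocation the executable wins.
bool resolveSymbol(const std::vector<LoadedObject>& objects, const std::string& symbol,
                   bool forCopyReloc, std::uint32_t* address) {
    assert(address != nullptr);
    std::size_t first = forCopyReloc ? 1 : 0;
    for (std::size_t i = first; i < objects.size(); ++i) {
        std::map<std::string, std::uint32_t>::const_iterator it =
            objects[i].definitions.find(symbol);
        if (it != objects[i].definitions.end()) {
            *address = it->second;
            return true;
        }
    }
    return false;
}
```

With the addresses from the first post, a normal lookup of _ZTIi would yield 0x0804EDF8 (the executable's copy) and a COPY lookup would yield 0x080A4D7C (the library's definition), which matches the three steps max listed.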

Re: C++ exception from shared object causes abort()

Post by max »

Thanks for your help.
Korona wrote:Do you support dl_iterate_phdr? IIRC that's how stack unwinding finds the correct DSO (but it might make sense to check libgcc how the exact mechanism works).
I'll look into that. I didn't think anything was needed that isn't already there, because exceptions work for static executables and I assumed relocations were enough. But this indeed looks interesting and might be the cause.
nullplan wrote:So, I looked up how musl is doing it, and it is a bit confusing. But ultimately you are handling the relocations correctly. For some reason, when looking up the symbol for a REL_COPY relocation, you do not look in the executable. Therefore the target of the relocation is not found there but in the library. Otherwise the symbol is found in the executable. The difference between R_386_32 and R_386_GLOB_DAT is, the former has an addend value (whatever is written at the relocation address), whereas the latter does not. No idea why.

So, this is apparently not the problem you have. What else could it be?
Yes, the specification basically states that for COPY relocations the symbol occurs in both the executable and the shared object, and you have to copy the contents from the shared object to the executable. That's what I'm doing in my ELF loader. Looks and feels right to me.

Why I thought the issue could be related to that: if there is no catch for some type of exception, the runtime will abort() at some point. So I thought that maybe for some reason the thrown type does not match the catch type, and that's why I thought it could be relocation related.
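One way to test that hypothesis, sketched under the assumption that a catch-all handler is acceptable in the test program: put a `catch (...)` behind the `catch (int)`. If the catch-all fires, the exception was delivered but its type did not match. If abort() is still reached with neither handler running, the type comparison is probably not the culprit. `Outcome` and `probeIntCatch` are hypothetical names for this example.

```cpp
#include <cassert>

// Which handler ended up receiving the exception.
enum class Outcome { MatchedInt, MatchedCatchAll };

// Hypothetical probe: throws an int and reports which handler ran.
// In the real setup the throw would sit in the shared library.
Outcome probeIntCatch() {
    try {
        throw 25;
    } catch (int e) {
        assert(e == 25);  // the value thrown above
        return Outcome::MatchedInt;
    } catch (...) {
        // Reached only if the thrown type was not recognised as int.
        return Outcome::MatchedCatchAll;
    }
}
```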

I'll probably have to step through that with gdb to see where it goes wrong.

EDIT: Thought I'd leave this link here, it seems to include some interesting information on this: https://gcc.gnu.org/ml/gcc-help/2007-01/msg00118.html

Re: C++ exception from shared object causes abort()

Post by max »

I just looked at the GCC docs and it says this:
There are several situations in which an application should use the shared libgcc instead of the static version. The most common of these is when the application wishes to throw and catch exceptions across different shared libraries. In that case, each of the libraries as well as the application itself should use the shared libgcc.
I think the problem is related to the question I posted a few days ago about missing symbols.

I currently don't have a shared libgcc, so I'll build one and try it out with that.

Re: C++ exception from shared object causes abort()

Post by max »

Turns out, the reason this was not working is that I linked libgcc.a statically. This was also the reason I was missing some weak symbols: the shared libraries were trying to use symbols from libgcc.a, which were hidden in the executable object at runtime and could therefore not be linked.

When using "-shared-libgcc" and thus linking with libgcc_s.so.1, it can all be linked correctly and everything works as expected.
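For anyone who wants to reproduce this, a minimal sketch of the setup follows. It is shown as one listing, but the two halves are meant to live in separate files; the file names, the library name and the build lines in the comments are assumptions for illustration (`-shared-libgcc` is the GCC option mentioned above, and a cross toolchain for a custom OS will need its own driver name and paths).

```cpp
#include <cassert>

// --- would live in the shared library (hypothetical file thrower.cpp) ---
// Build sketch: g++ -fPIC -shared -shared-libgcc thrower.cpp -o libthrower.so
void throwFromLibrary() {
    throw 25;
}

// --- would live in the executable (hypothetical file main.cpp) ---
// Build sketch: g++ -shared-libgcc main.cpp -L. -lthrower -o main
int catchInExecutable() {
    try {
        throwFromLibrary();
    } catch (int e) {
        assert(e == 25);  // the library always throws 25
        return e;
    }
    return -1;  // not reached, since throwFromLibrary always throws
}
```

When both objects share libgcc_s.so.1, they also share a single copy of the unwinder, which is what the GCC documentation quoted earlier recommends for exceptions that cross shared library boundaries.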