
Getting Clang to call my global constructors

Posted: Sat Sep 25, 2021 11:56 am
by PgrAm
Hey all, I am currently trying to use more C++ in my OS, so I figured I would start by using it in the user-mode utilities. However, I can't seem to get Clang to generate calls to the global constructors. The article on the wiki seems to deal mostly with GCC, except to say that Clang doesn't require adding crtbegin/crtend to the command line, so after reading it I made a simple test program.

Code:

//test.cpp
class testClass {
	int x;
public:
	testClass()
	{
		x = 1;
	}
};

testClass t{};

int main(int argc, char** argv)
{
}
When compiled without optimization, I should definitely get a call to the constructor.

I added a simple crti, crtn and crt0:

Code:

; x86 crti.asm
section .init
global _init
_init:
	push ebp
	mov ebp, esp

section .fini
global _fini
_fini:
	push ebp
	mov ebp, esp

Code:

; x86 crtn.asm
section .init
	pop ebp
	ret

section .fini
	pop ebp
	ret

Code:

//crt0.c
extern int main(int argc, char** argv);
extern void _init();
extern void _fini();

void _start()
{
	_init();

	main(0, 0);

	_fini();
}
I compiled and linked everything like so:

Code:

clang -target i386-elf -c api/crt0.c -o build/api/crt0.o -ffreestanding
nasm -f elf api/crti.asm -o build/api/crti.o
nasm -f elf api/crtn.asm -o build/api/crtn.o

clang++ -target i386-elf -c test.cpp -o build/test.o

ld.lld build/api/crt0.o build/api/crti.o build/test.o build/api/crtn.o -o test.elf
However, when I look at the output, Clang never seems to fill in the stuff that is supposed to come from crtbegin/crtend:

Code:

Disassembly of section .init:

00400114 <_init>:
  400114:	55                   	push   ebp
  400115:	89 e5                	mov    ebp,esp
  400117:	5d                   	pop    ebp
  400118:	c3                   	ret    

Disassembly of section .fini:

00400119 <_fini>:
  400119:	55                   	push   ebp
  40011a:	89 e5                	mov    ebp,esp
  40011c:	5d                   	pop    ebp
  40011d:	c3                   	ret    

Disassembly of section .eh_frame:

00400120 <.eh_frame>:
  400120:	14 00                	adc    al,0x0
  400122:	00 00                	add    BYTE PTR [eax],al
  400124:	00 00                	add    BYTE PTR [eax],al
  400126:	00 00                	add    BYTE PTR [eax],al
  400128:	01 7a 52             	add    DWORD PTR [edx+0x52],edi
  40012b:	00 01                	add    BYTE PTR [ecx],al
  40012d:	7c 08                	jl     400137 <_fini+0x1e>
  40012f:	01 1b                	add    DWORD PTR [ebx],ebx
  400131:	0c 04                	or     al,0x4
  400133:	04 88                	add    al,0x88
  400135:	01 00                	add    DWORD PTR [eax],eax
  400137:	00 1c 00             	add    BYTE PTR [eax+eax*1],bl
  40013a:	00 00                	add    BYTE PTR [eax],al
  40013c:	1c 00                	sbb    al,0x0
  40013e:	00 00                	add    BYTE PTR [eax],al
  400140:	40                   	inc    eax
  400141:	10 00                	adc    BYTE PTR [eax],al
  400143:	00 17                	add    BYTE PTR [edi],dl
  400145:	00 00                	add    BYTE PTR [eax],al
  400147:	00 00                	add    BYTE PTR [eax],al
  400149:	41                   	inc    ecx
  40014a:	0e                   	push   cs
  40014b:	08 85 02 42 0d 05    	or     BYTE PTR [ebp+0x50d4202],al
  400151:	53                   	push   ebx
  400152:	0c 04                	or     al,0x4
  400154:	04 00                	add    al,0x0
  400156:	00 00                	add    BYTE PTR [eax],al
  400158:	1c 00                	sbb    al,0x0
  40015a:	00 00                	add    BYTE PTR [eax],al
  40015c:	3c 00                	cmp    al,0x0
  40015e:	00 00                	add    BYTE PTR [eax],al
  400160:	50                   	push   eax
  400161:	10 00                	adc    BYTE PTR [eax],al
  400163:	00 0a                	add    BYTE PTR [edx],cl
  400165:	00 00                	add    BYTE PTR [eax],al
  400167:	00 00                	add    BYTE PTR [eax],al
  400169:	41                   	inc    ecx
  40016a:	0e                   	push   cs
  40016b:	08 85 02 42 0d 05    	or     BYTE PTR [ebp+0x50d4202],al
  400171:	46                   	inc    esi
  400172:	0c 04                	or     al,0x4
  400174:	04 00                	add    al,0x0
  400176:	00 00                	add    BYTE PTR [eax],al
  400178:	00 00                	add    BYTE PTR [eax],al
	...

Disassembly of section .text:

00401180 <__cxx_global_var_init>:
  401180:	55                   	push   ebp
  401181:	89 e5                	mov    ebp,esp
  401183:	50                   	push   eax
  401184:	8d 05 fc 31 40 00    	lea    eax,ds:0x4031fc
  40118a:	89 04 24             	mov    DWORD PTR [esp],eax
  40118d:	e8 2e 00 00 00       	call   4011c0 <_ZN9testClassC2Ev>
  401192:	83 c4 04             	add    esp,0x4
  401195:	5d                   	pop    ebp
  401196:	c3                   	ret    
  401197:	66 0f 1f 84 00 00 00 	nop    WORD PTR [eax+eax*1+0x0]
  40119e:	00 00 

004011a0 <main>:
  4011a0:	55                   	push   ebp
  4011a1:	89 e5                	mov    ebp,esp
  4011a3:	8b 45 0c             	mov    eax,DWORD PTR [ebp+0xc]
  4011a6:	8b 45 08             	mov    eax,DWORD PTR [ebp+0x8]
  4011a9:	31 c0                	xor    eax,eax
  4011ab:	5d                   	pop    ebp
  4011ac:	c3                   	ret    
  4011ad:	0f 1f 00             	nop    DWORD PTR [eax]

004011b0 <_GLOBAL__sub_I_shell.cpp>:
  4011b0:	55                   	push   ebp
  4011b1:	89 e5                	mov    ebp,esp
  4011b3:	e8 c8 ff ff ff       	call   401180 <__cxx_global_var_init>
  4011b8:	5d                   	pop    ebp
  4011b9:	c3                   	ret    
  4011ba:	cc                   	int3   
  4011bb:	cc                   	int3   
  4011bc:	cc                   	int3   
  4011bd:	cc                   	int3   
  4011be:	cc                   	int3   
  4011bf:	cc                   	int3   

004011c0 <_ZN9testClassC2Ev>:
  4011c0:	55                   	push   ebp
  4011c1:	89 e5                	mov    ebp,esp
  4011c3:	8b 45 08             	mov    eax,DWORD PTR [ebp+0x8]
  4011c6:	8b 45 08             	mov    eax,DWORD PTR [ebp+0x8]
  4011c9:	c7 00 01 00 00 00    	mov    DWORD PTR [eax],0x1
  4011cf:	5d                   	pop    ebp
  4011d0:	c3                   	ret    
  4011d1:	cc                   	int3   
  4011d2:	cc                   	int3   
  4011d3:	cc                   	int3   
  4011d4:	cc                   	int3   
  4011d5:	cc                   	int3   
  4011d6:	cc                   	int3   
  4011d7:	cc                   	int3   
  4011d8:	cc                   	int3   
  4011d9:	cc                   	int3   
  4011da:	cc                   	int3   
  4011db:	cc                   	int3   
  4011dc:	cc                   	int3   
  4011dd:	cc                   	int3   
  4011de:	cc                   	int3   
  4011df:	cc                   	int3   

004011e0 <_start>:
  4011e0:	e8 2f ef ff ff       	call   400114 <_init>
  4011e5:	6a 00                	push   0x0
  4011e7:	6a 00                	push   0x0
  4011e9:	e8 b2 ff ff ff       	call   4011a0 <main>
  4011ee:	83 c4 08             	add    esp,0x8
  4011f1:	e9 23 ef ff ff       	jmp    400119 <_fini>

Disassembly of section .init_array:

004021f8 <.init_array>:
  4021f8:	b0 11                	mov    al,0x11
  4021fa:	40                   	inc    eax
	...

Disassembly of section .bss:

004031fc <t>:
  4031fc:	00 00                	add    BYTE PTR [eax],al
	...

Disassembly of section .comment:

00000000 <.comment>:
   0:	4c                   	dec    esp
   1:	69 6e 6b 65 72 3a 20 	imul   ebp,DWORD PTR [esi+0x6b],0x203a7265
   8:	4c                   	dec    esp
   9:	4c                   	dec    esp
   a:	44                   	inc    esp
   b:	20 31                	and    BYTE PTR [ecx],dh
   d:	32 2e                	xor    ch,BYTE PTR [esi]
   f:	30 2e                	xor    BYTE PTR [esi],ch
  11:	31 00                	xor    DWORD PTR [eax],eax
  13:	63 6c 61 6e          	arpl   WORD PTR [ecx+eiz*2+0x6e],bp
  17:	67 20 76 65          	and    BYTE PTR [bp+0x65],dh
  1b:	72 73                	jb     90 <_init-0x400084>
  1d:	69 6f 6e 20 31 32 2e 	imul   ebp,DWORD PTR [edi+0x6e],0x2e323120
  24:	30 2e                	xor    BYTE PTR [esi],ch
  26:	31 00                	xor    DWORD PTR [eax],eax
	...
It generates all the code to initialize everything, but never calls it from _init as I expected. So what's going on? Did I configure something wrong? Has anyone actually got global constructors working with Clang? Is Clang doing something magical so that my constructors will get called anyway?

Re: Getting Clang to call my global constructors

Posted: Sat Sep 25, 2021 12:35 pm
by nullplan
No, it's just that your _start() is wrong. Clang uses the .init_array (and probably .fini_array) mechanism for constructors, so you need to iterate over those in order to call your constructors. E.g. add the following to your crt0.c:

Code:

typedef void initfunc_t(void);
extern initfunc_t *__init_array_start[], *__init_array_end[];

static void handle_init_array(void)
{
  for (initfunc_t **p = __init_array_start; p != __init_array_end; p++)
    (*p)();
}
For .fini_array, you need the same thing, but the loop must run backwards. Anyway, you need to call handle_init_array() from _start() before calling main().
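The backwards loop for .fini_array could look roughly like the sketch below. The helper name run_array_backwards and the demo table are made up for illustration; the __fini_array_start/__fini_array_end symbols mentioned in the comment are assumed to be provided by the linker in the same way as their init counterparts.

```c
/* Sketch: walk a table of function pointers from the last entry to the first. */
typedef void initfunc_t(void);

/* Hypothetical helper: call every entry in [start, end) in reverse order. */
static void run_array_backwards(initfunc_t **start, initfunc_t **end)
{
  for (initfunc_t **p = end; p != start; )
    (*--p)();
}

/*
 * In crt0.c this would be used roughly as follows (assuming the linker
 * defines these symbols):
 *
 *   extern initfunc_t *__fini_array_start[], *__fini_array_end[];
 *   run_array_backwards(__fini_array_start, __fini_array_end);
 */

/* Stand-in table so the call order can be observed on a hosted build. */
static int trace[3];
static int trace_len;
static void fini_a(void) { trace[trace_len++] = 1; }
static void fini_b(void) { trace[trace_len++] = 2; }
static void fini_c(void) { trace[trace_len++] = 3; }
static initfunc_t *demo_table[] = { fini_a, fini_b, fini_c };
```

Running the helper over demo_table calls fini_c first and fini_a last, so destructors run in the opposite order to the constructors.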

Re: Getting Clang to call my global constructors

Posted: Sat Sep 25, 2021 1:59 pm
by PgrAm
Damn, is that really all it takes? If so, the information on the wiki is incredibly confusing and definitely needs an update. It implies that this mechanism is only used on ARM, and also that __cxa_atexit is only used on Itanium.

I much prefer this mechanism of calling the constructors/destructors directly rather than crossing your fingers and hoping GCC does the right thing.

Although, if I'm calling the constructors myself, then what's the purpose of this __cxx_global_var_init function that Clang has generated?

Re: Getting Clang to call my global constructors

Posted: Sat Sep 25, 2021 3:07 pm
by nullplan
PgrAm wrote: and also that __cxa_atexit is only used on Itanium.
No, it is part of the Itanium C++ ABI, but that one is now in use pretty much everywhere. Also, __cxa_atexit() is only really needed with dynamic libraries; otherwise it does the same thing as atexit(). Speaking of which, your code is currently not running atexit() handlers. Have fun adding those.
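As a starting point for those handlers, here is a minimal sketch of a fixed-size registry. All names (my_atexit, run_atexit_handlers, the demo handlers) are hypothetical; a real libc would export the registration function as atexit(). The reverse call order and the minimum of 32 slots are what ISO C requires of atexit().

```c
/* ISO C requires room for at least 32 registered handlers. */
#define ATEXIT_MAX 32

typedef void atexit_func_t(void);

static atexit_func_t *atexit_table[ATEXIT_MAX];
static int atexit_count;

/* Register a handler; returns 0 on success, nonzero if the table is full. */
int my_atexit(atexit_func_t *fn)
{
  if (atexit_count == ATEXIT_MAX)
    return -1;
  atexit_table[atexit_count++] = fn;
  return 0;
}

/* Run handlers in reverse order of registration. The count is decremented
 * before each call, so a handler may itself register another handler. */
void run_atexit_handlers(void)
{
  while (atexit_count > 0)
    atexit_table[--atexit_count]();
}

/* Demo handlers used only to observe the call order. */
static int exit_trace[2];
static int exit_trace_len;
static void handler_one(void) { exit_trace[exit_trace_len++] = 1; }
static void handler_two(void) { exit_trace[exit_trace_len++] = 2; }
```

In a crt0 like the one in this thread, run_atexit_handlers() would presumably be called after main() returns and before the .fini_array entries are run.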
PgrAm wrote: I much prefer this mechanism of calling the constructor/destructors directly rather than crossing your fingers and hoping GCC does the right thing
There is no real difference between the .init_array method and the _init method. It leads to the exact same thing and the exact same dependence on the compiler doing the right thing. No, the important difference is that the linker cannot screw up catastrophically in the case of .init_array, as it can with .init, where it can reorder input sections such that the _init symbol winds up behind the return instruction, leading to crashes on startup.
PgrAm wrote: Although If I'm calling the constructors myself then what's the purpose of this __cxx_global_var_init function that clang has generated?
It's the function called by the function that .init_array points to. .init_array contains one address: 0x4011b0. You can see it in the disassembly, although I don't know why that section was disassembled, as .init_array is data, not code. That address is _GLOBAL__sub_I_shell.cpp, which is a very short function that calls __cxx_global_var_init.
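To see where 0x4011b0 comes from, the bytes shown for .init_array can be read as a 32-bit little-endian value. The sketch below assumes that the "..." in the dump stands for zero bytes, so the entry is b0 11 40 00; read_le32 is a made-up helper name.

```c
#include <stdint.h>

/* Assemble a 32-bit little-endian value from four bytes. */
static uint32_t read_le32(const unsigned char *b)
{
  return (uint32_t)b[0]
       | ((uint32_t)b[1] << 8)
       | ((uint32_t)b[2] << 16)
       | ((uint32_t)b[3] << 24);
}

/* Bytes at 0x4021f8 in the dump; the final 00 is assumed to be hidden
 * behind the "..." that follows the disassembled bytes. */
static const unsigned char init_array_bytes[4] = { 0xb0, 0x11, 0x40, 0x00 };
```

read_le32(init_array_bytes) yields 0x004011b0, the address of _GLOBAL__sub_I_shell.cpp in the .text listing.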

Re: Getting Clang to call my global constructors

Posted: Sat Sep 25, 2021 3:09 pm
by PgrAm
Thanks, that was a very thorough explanation :D