Finding an UB
Posted: Thu Feb 14, 2019 1:50 pm
Hi,
Can you C language and undefined behaviour experts please help me and explain what the problem is with this code?
Code: Select all
void lang_init()
{
    char fn[]="/sys/lang/core.\0\0\0\0\0";
    char *s,*e,*a;
    int i=0,l,k;
    memcpy(fn+15, lang, 2); /* I would expect the lang string to be copied into the fn array after the dot. */
    kprintf("%x %x %s\n", fn, fn+15, fn);
When compiled with "-ansi -ffreestanding -fno-stack-protector -fno-builtin -Wall -Wextra -Wpedantic", I got no warnings or errors of any kind. Compilers used:
- gcc (GCC) 8.2.1 20180831
- clang version 7.0.0 (tags/RELEASE_700/final)
Compiled for x86_64 with gcc -O0 outputs: "FFFFFFFFFFFFFF70 FFFFFFFFFFFFFF7F /sys/lang/core.en" - difference 15 bytes.
Compiled for x86_64 with gcc -O2 outputs: "FFFFFFFFFFFFFF30 FFFFFFFFFFFFFF3F /sys/lang/core.en" - difference 15 bytes.
Compiled for x86_64 with Clang -O0 outputs: "FFFFFFFFFFFFFF70 FFFFFFFFFFFFFF7F /sys/lang/core.en" - difference 15 bytes.
Compiled for x86_64 with Clang -O2 outputs: "FFFFFFFFFFFFFF50 FFFFFFFFFFFFFF5F /sys/lang/core.en" - difference 15 bytes.
Compiled for AArch64 with gcc -O0 outputs: "FFFFFFFFFFFFFF78 FFFFFFFFFFFFFF87 /sys/lang/core.en" - difference 15 bytes.
Compiled for AArch64 with gcc -O2 outputs: "FFFFFFFFFFFFFF80 FFFFFFFFFFFFFF8F /sys/lang/core.en" - difference 15 bytes.
Compiled for AArch64 with Clang -O0 outputs: "FFFFFFFFFFFFFD7E FFFFFFFFFFFFFD8D /sys/lang/core.en" - difference 15 bytes.
Compiled for AArch64 with Clang -O2 outputs: "FFFFFFFFFFFFFF08 FFFFFFFFFFFFFF0F /sys/laen/core." - Ooooooops! difference is only 7!
Now if I change the storage class and place the character array in the data segment instead of the local stack like this:
Code: Select all
char fn[]="/sys/lang/core.\0\0\0\0\0";
void lang_init()
{
    char *s,*e,*a;
    int i=0,l,k;
    memcpy(fn+15, lang, 2);
    kprintf("%x %x %s\n", fn, fn+15, fn);
Then suddenly all 8 combinations generate the same good output, and calculate fn+15 correctly. I've repeated all tests with "&fn[15]", then again with the explicit array dimension "char fn[26]=" too, just to be thorough. All 64 tests ended with the same results.
I've looked through all my C language books, but none of them mentions anything even remotely suggesting there could be undefined behaviour with character arrays, other than indexing out of bounds (which is not the case here).
So my question is, what's causing the Clang AArch64 optimizer to miscompile when the storage class is the local stack? Where is the UB?
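For anyone who wants to try the snippet outside the kernel, here is a minimal hosted sketch of it. It rests on a few assumptions that are not in the original code: lang is taken to point at "en" (inferred from the printed output), printf with %p stands in for kprintf, and the helper name build_lang_path is made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: stand-in for the kernel's `lang`; "en" is inferred from the output. */
static const char *lang = "en";

/* Hypothetical helper: builds the path the way lang_init() does, copies it
 * into out, and returns the distance between fn and fn+15 as computed at
 * run time. Returns -1 if out is too small. */
long build_lang_path(char *out, size_t outlen)
{
    char fn[] = "/sys/lang/core.\0\0\0\0\0"; /* 15 characters, 5 spare NULs, terminator */
    char *dst = fn + 15;                     /* first byte after the dot */

    memcpy(dst, lang, 2);
    /* printf replaces kprintf here; %p is the portable way to print pointers */
    printf("%p %p %s\n", (void *)fn, (void *)dst, fn);

    if (outlen < sizeof fn)
        return -1;
    memcpy(out, fn, sizeof fn);
    return (long)(dst - fn);
}
```

Called with a buffer of at least 21 bytes, this should leave "/sys/lang/core.en" in the buffer and return 15. If it does so on a hosted AArch64 system with the same Clang and -O2, that would hint that the difference lies in the kernel environment rather than in the C source.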
Objdumps (for AArch64, Clang and -O2 only, I can provide the x86_64, gcc and -O0 versions too if you need them, but these dumps are long enough already):
AArch64 Clang -O2, local stack (BROKEN)
Code: Select all
ffffffffffe06b68 <lang_init>:
ffffffffffe06b68: d101c3ff sub sp, sp, #0x70
ffffffffffe06b6c: f0000008 adrp x8, ffffffffffe09000 <platform_dbgputc+0x4ec>
ffffffffffe06b70: 91096108 add x8, x8, #0x258
ffffffffffe06b74: 3cc0a100 ldur q0, [x8, #10]
ffffffffffe06b78: 3dc00101 ldr q1, [x8]
ffffffffffe06b7c: a90267fa stp x26, x25, [sp, #32]
ffffffffffe06b80: a9035ff8 stp x24, x23, [sp, #48]
ffffffffffe06b84: a90457f6 stp x22, x21, [sp, #64]
ffffffffffe06b88: a9054ff4 stp x20, x19, [sp, #80]
ffffffffffe06b8c: a9067bfd stp x29, x30, [sp, #96]
ffffffffffe06b90: 3c80a3e0 stur q0, [sp, #10]
ffffffffffe06b94: 3d8003e1 str q1, [sp]
ffffffffffe06b98: b0000074 adrp x20, ffffffffffe13000 <_binary_font_end+0xcd0>
ffffffffffe06b9c: f9428294 ldr x20, [x20, #1280]
ffffffffffe06ba0: 910003e8 mov x8, sp
ffffffffffe06ba4: b2400d13 orr x19, x8, #0xf
ffffffffffe06ba8: 321f03e2 orr w2, wzr, #0x2
ffffffffffe06bac: aa1303e0 mov x0, x19
ffffffffffe06bb0: aa1403e1 mov x1, x20
ffffffffffe06bb4: 910183fd add x29, sp, #0x60
ffffffffffe06bb8: 94000166 bl ffffffffffe07150 <memcpy>
ffffffffffe06bbc: d0000000 adrp x0, ffffffffffe08000 <mbox_call+0xe8>
ffffffffffe06bc0: 913c9000 add x0, x0, #0xf24
ffffffffffe06bc4: 910003e1 mov x1, sp
ffffffffffe06bc8: 910003e3 mov x3, sp
ffffffffffe06bcc: aa1303e2 mov x2, x19
ffffffffffe06bd0: 97fff503 bl ffffffffffe03fdc <kprintf>
AArch64 Clang -O2, data segment (WORKS)
Code: Select all
ffffffffffe06b68 <lang_init>:
ffffffffffe06b68: a9bb67fa stp x26, x25, [sp, #-80]!
ffffffffffe06b6c: a9015ff8 stp x24, x23, [sp, #16]
ffffffffffe06b70: a90257f6 stp x22, x21, [sp, #32]
ffffffffffe06b74: a9034ff4 stp x20, x19, [sp, #48]
ffffffffffe06b78: a9047bfd stp x29, x30, [sp, #64]
ffffffffffe06b7c: b0000074 adrp x20, ffffffffffe13000 <_binary_font_end+0xcb0>
ffffffffffe06b80: f9431694 ldr x20, [x20, #1576]
ffffffffffe06b84: b0000075 adrp x21, ffffffffffe13000 <_binary_font_end+0xcb0>
ffffffffffe06b88: f94292b5 ldr x21, [x21, #1312]
ffffffffffe06b8c: 321f03e2 orr w2, wzr, #0x2
ffffffffffe06b90: 91003e93 add x19, x20, #0xf
ffffffffffe06b94: aa1303e0 mov x0, x19
ffffffffffe06b98: aa1503e1 mov x1, x21
ffffffffffe06b9c: 910103fd add x29, sp, #0x40
ffffffffffe06ba0: 94000166 bl ffffffffffe07138 <memcpy>
ffffffffffe06ba4: d0000000 adrp x0, ffffffffffe08000 <mbox_call+0x100>
ffffffffffe06ba8: 913c3000 add x0, x0, #0xf0c
ffffffffffe06bac: aa1403e1 mov x1, x20
ffffffffffe06bb0: aa1303e2 mov x2, x19
ffffffffffe06bb4: aa1403e3 mov x3, x20
ffffffffffe06bb8: 97fff509 bl ffffffffffe03fdc <kprintf>
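One detail in the two listings may be worth checking. The stack version forms fn+15 with "orr x19, x8, #0xf" on a copy of sp, while the data segment version uses "add x19, x20, #0xf". The two give the same address only when the low four bits of the base are zero. The sketch below just replays that arithmetic; the function names are hypothetical. The AArch64 procedure call standard expects sp to be 16-byte aligned, which would make the orr a valid shortcut, so an sp ending in 08 would point at how lang_init is entered rather than at the C code. That is a guess from the listing, not a confirmed diagnosis.

```c
#include <stdint.h>

/* Hypothetical names; they replay the two address computations from the listings. */

/* Stack version: orr x19, x8, #0xf  (x8 holds sp, which is also the address of fn) */
uint64_t fn15_via_orr(uint64_t base)
{
    return base | 0xfu;
}

/* Data segment version: add x19, x20, #0xf */
uint64_t fn15_via_add(uint64_t base)
{
    return base + 15u;
}

/* Returns 1 when both forms yield the same address,
 * which happens exactly when the low four bits of base are zero. */
int forms_agree(uint64_t base)
{
    return fn15_via_orr(base) == fn15_via_add(base);
}
```

With the address printed by the broken build, FFFFFFFFFFFFFF08, the orr form gives FFFFFFFFFFFFFF0F (the value kprintf showed, 7 bytes past fn), while the add form gives FFFFFFFFFFFFFF17 (15 bytes past fn).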
Thanks,
bzt