Re: QEMU hangs on M2 MacBook running Ventura

Posted: Sun Jun 18, 2023 7:35 pm
by prajwal
I can't tell from the code below whether the call to usub64_borrow() is inlined, but it seems to be. Maybe you can help confirm that. FYI, I have marked the function as non-static and non-inline in fpu/softfloat.c. The same file contains calls to this function, and this is what the disassembly looks like in the good and bad cases.

Good case

Code:

;     a0 = a0 >> c;
   188c8: 6b 25 ca 9a   lsr x11, x11, x10
;     a->frac_lo = a1 | (sticky != 0);
   188cc: 1f 01 00 f1   cmp x8, #0
   188d0: e8 07 9f 1a   cset  w8, ne
   188d4: 28 01 08 aa   orr x8, x9, x8
   188d8: 2b a0 00 a9   stp x11, x8, [x1, #8]
;     r->frac_lo = usub64_borrow(a->frac_lo, b->frac_lo, &c);
   188dc: 09 08 40 f9   ldr x9, [x0, #16]
;     volatile unsigned long long b = *pborrow;
   188e0: ff 07 00 f9   str xzr, [sp, #8]
;     x = __builtin_subcll(x, y, b, (uint64_t*)&b);
   188e4: ea 07 40 f9   ldr x10, [sp, #8]
   188e8: 28 01 08 eb   subs  x8, x9, x8
   188ec: e9 27 9f 1a   cset  w9, lo
   188f0: 08 01 0a eb   subs  x8, x8, x10
   188f4: ea 27 9f 1a   cset  w10, lo
   188f8: 29 01 0a 2a   orr w9, w9, w10
   188fc: e9 07 00 f9   str x9, [sp, #8]
;     *pborrow = b & 1;
   18900: e9 07 40 f9   ldr x9, [sp, #8]
   18904: 29 01 40 92   and x9, x9, #0x1
;     r->frac_lo = usub64_borrow(a->frac_lo, b->frac_lo, &c);
   18908: 08 08 00 f9   str x8, [x0, #16]
;     r->frac_hi = usub64_borrow(a->frac_hi, b->frac_hi, &c);
   1890c: 0a 04 40 f9   ldr x10, [x0, #8]
   18910: 2b 04 40 f9   ldr x11, [x1, #8]
;     volatile unsigned long long b = *pborrow;
   18914: e9 07 00 f9   str x9, [sp, #8]
;     x = __builtin_subcll(x, y, b, (uint64_t*)&b);
   18918: e9 07 40 f9   ldr x9, [sp, #8]
   1891c: 4a 01 0b eb   subs  x10, x10, x11
   18920: eb 27 9f 1a   cset  w11, lo
   18924: 49 01 09 eb   subs  x9, x10, x9
   18928: ea 27 9f 1a   cset  w10, lo
   1892c: 6a 01 0a 2a   orr w10, w11, w10
   18930: ea 07 00 f9   str x10, [sp, #8]
   18934: ea 07 40 f9   ldr x10, [sp, #8]
   18938: 09 04 00 f9   str x9, [x0, #8]
;     if (a->frac_hi) {

Bad case

Code:

;     a0 = a0 >> c;
   17ebc: 6b 25 ca 9a   lsr x11, x11, x10
;     a->frac_lo = a1 | (sticky != 0);
   17ec0: 1f 01 00 f1   cmp x8, #0
   17ec4: e8 07 9f 1a   cset  w8, ne
   17ec8: 28 01 08 aa   orr x8, x9, x8
   17ecc: 2b a0 00 a9   stp x11, x8, [x1, #8]
;     r->frac_lo = usub64_borrow(a->frac_lo, b->frac_lo, &c);
   17ed0: 0a a4 40 a9   ldp x10, x9, [x0, #8]
;     x = __builtin_subcll(x, y, b, &b);
   17ed4: 28 01 08 eb   subs  x8, x9, x8
   17ed8: e9 27 9f 1a   cset  w9, lo
;     r->frac_lo = usub64_borrow(a->frac_lo, b->frac_lo, &c);
   17edc: 08 08 00 f9   str x8, [x0, #16]
;     r->frac_hi = usub64_borrow(a->frac_hi, b->frac_hi, &c);
   17ee0: 2b 04 40 f9   ldr x11, [x1, #8]
;     x = __builtin_subcll(x, y, b, &b);
   17ee4: 4a 01 0b cb   sub x10, x10, x11
   17ee8: 49 01 09 cb   sub x9, x10, x9
   17eec: 09 04 00 f9   str x9, [x0, #8]
;     if (a->frac_hi) {

Re: QEMU hangs on M2 MacBook running Ventura

Posted: Mon Jun 19, 2023 1:21 am
by Octocontrabass
prajwal wrote: I can't tell from the code below whether the call to usub64_borrow() is inlined, but it seems to be. Maybe you can help confirm that.
It has been inlined. It looks like the caller has also been inlined.
prajwal wrote: FYI, I have marked the function as non-static and non-inline in fpu/softfloat.c.
That doesn't prevent compilers from inlining it anyway. Try "__attribute__((noinline))" to prevent inlining.
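A minimal sketch of that suggestion, assuming a GCC- or Clang-compatible compiler. The function body here is a portable stand-in written for illustration, not QEMU's actual definition (the listings above show that one using __builtin_subcll); the attribute placement is the only point being made.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for usub64_borrow(); not copied from QEMU.
 * noinline asks GCC/Clang to keep this as a real call instead of
 * folding its body into the caller. */
__attribute__((noinline))
uint64_t usub64_borrow(uint64_t x, uint64_t y, bool *pborrow)
{
    bool b = *pborrow;
    uint64_t t = x - y;   /* first subtraction: x - y */
    uint64_t r = t - b;   /* second subtraction: the incoming borrow */

    /* A borrow out occurs if either subtraction wrapped around. */
    *pborrow = (x < y) || (t < b);
    return r;
}
```

With the attribute in place, one would expect the call sites in the disassembly to show a branch to usub64_borrow rather than the inlined subs/cset sequence, which makes the good and bad builds easier to compare function by function.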
prajwal wrote: The same file contains calls to this function, and this is what the disassembly looks like in the good and bad cases.
I see no functional differences between the "good" and "bad" cases here either.
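One way to read the two listings, sketched in C under stated assumptions: the type name Frac128 and both function names are hypothetical, and the code is modeled on the disassembly rather than taken from QEMU. The first function derives the final borrow the way the "good" listing does (subs, cset, orr); the second uses plain subtractions for the high word, as the "bad" listing does (sub, sub). As long as nothing consumes the final borrow, both store the same frac_hi and frac_lo.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word fraction, named after the fields in the listings. */
typedef struct {
    uint64_t frac_hi;
    uint64_t frac_lo;
} Frac128;

/* Shape of the "good" listing: the high-word step also derives a borrow. */
Frac128 sub_tracking_borrow(Frac128 a, Frac128 b, bool *borrow_out)
{
    Frac128 r;
    bool c = a.frac_lo < b.frac_lo;   /* borrow out of the low word */
    r.frac_lo = a.frac_lo - b.frac_lo;

    uint64_t t = a.frac_hi - b.frac_hi;
    r.frac_hi = t - c;
    *borrow_out = (a.frac_hi < b.frac_hi) || (t < c);
    return r;
}

/* Shape of the "bad" listing: plain subtractions, final borrow never formed. */
Frac128 sub_discarding_borrow(Frac128 a, Frac128 b)
{
    Frac128 r;
    bool c = a.frac_lo < b.frac_lo;   /* borrow out of the low word */
    r.frac_lo = a.frac_lo - b.frac_lo;
    r.frac_hi = a.frac_hi - b.frac_hi - c;
    return r;
}
```

The low-word borrow still feeds the high word in both versions; only the borrow out of the high word is dropped, which is consistent with a compiler removing a value it can prove is unused.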