I have another interesting optimization issue, and if it's caused by UB, I've failed to see where and what.
The code is extremely simple; it's in my RAM allocator, where I clear the page(s) before I return them to the caller. Just a note: both i and pages are defined as uint64_t on the local stack, and they are 8-byte aligned.
Code: Select all
for (i=0; i<pages; i++) {
    vmm_page(tmpmap1, base+i*__PAGESIZE, PG_CORE_RWNOCACHE);
    memset((void*)tmpmap1, 0, __PAGESIZE);
}
Code: Select all
void vmm_page(virt_t virt, phys_t phys, uint16_t access) {
    virt_t *ptr;
    switch(virt) {
        ...
        case tmpmap1: ptr = tmpmap1ptr; break;
        ...
    }
    *ptr = phys | access;
    vmm_invalidate(virt); /* this is __asm__ __volatile__ ("invlpg (%0)" : : "r"(virt) ); on x86 */
}
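For completeness, vmm_invalidate is nothing fancy; here's a minimal sketch of what it amounts to on x86_64 (the virt_t typedef below is only an assumption so the sketch stands on its own, the real one lives in my headers):
Code: Select all
#include <stdint.h>

typedef uint64_t virt_t;   /* assumed here just so the sketch compiles on its own */

static inline void vmm_invalidate(virt_t virt)
{
    /* flush the TLB entry caching the translation of this virtual address */
    __asm__ __volatile__ ("invlpg (%0)" : : "r"(virt));
}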
Results with different compilers and -mtune targets:
- gcc aarch64 mtune=cortex-a53 OK
- clang aarch64 mtune=cortex-a53 OK
- gcc x86_64 mtune=pentium4 OK
- clang x86_64 mtune=pentium4 OK
- gcc x86_64 mtune=core2 BROKEN
- clang x86_64 mtune=core2 OK
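For anyone who wants to try it, the only interesting part of the command line is the tuning flag. An example invocation (the cross-compiler name, the source file name and the rest of the flags are just for illustration, not my exact build line):
Code: Select all
x86_64-elf-gcc -ffreestanding -O2 -mtune=core2 -c pmm.c -o pmm.o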
If I add these:
Code: Select all
kprintf("i=%d pages=%d", i, pages); /* goes after "for(i=0;...") */
kprintf("vmm_page(%x,%x)\n", virt, phys); /* goes before "switch(virt) {" */
Code: Select all
i=0 pages=1
vmm_page(FFFFFFFF00006000, 3C1000)
i=-429492720 pages=1
vmm_page(FFFFFFFF00006000, FFF000063C1000)
...page fault naturally...
Clearly i changed more than it should have. Now the most interesting part: if I declare i as
Code: Select all
volatile uint64_t i;
then everything works fine (the full loop with the volatile declaration is below, just to be explicit). So, where is the UB?
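This is the exact same loop as before, only the declaration of i differs:
Code: Select all
volatile uint64_t i;   /* the only change compared to the loop above */

for (i=0; i<pages; i++) {
    vmm_page(tmpmap1, base+i*__PAGESIZE, PG_CORE_RWNOCACHE);
    memset((void*)tmpmap1, 0, __PAGESIZE);
}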
Btw, I've used gcc 7.3.0 and clang 7.0.1. Right now I'm compiling the latest gcc 8.2.0 and I will repeat the test with that too.
EDIT: I've just tried the following, and it seemed to work (p is another uint64_t local variable):
Code: Select all
p = base;
for (i=0; i<pages; i++) {
    vmm_page(tmpmap1, p, PG_CORE_RWNOCACHE);
    memset((void*)tmpmap1, 0, __PAGESIZE);
    p += __PAGESIZE;
}
EDIT2: I take it back. If I remove the kprintf() from the loop, the error comes back, so I'm back to square one. With a volatile i, it works both with and without kprintf, and with either multiplication or addition.
Bests,
bzt