
[SOLVED] Navigating the compiler optimizer when checking MMIO

Posted: Tue Jul 02, 2024 11:06 am
by austanss
For context, I've been working on an AHCI driver for my operating system. I've been agonizing over reference implementations and the official specification, and I finally wrote a functional read command. The problem, though, is that it originally did not work when compile-time optimization of any degree was enabled.

This was the area that was being mis-optimized, and it led to a perpetual hang (presumably inside the loop).

Code:

/* Bit 30 of the port interrupt status register: Task File Error Status */
#define HBA_STATUS_TFES     (1u << 30)
    drive->port_hba->command_issue = 1;

    while (true) {
        if (drive->port_hba->interrupt_status & HBA_STATUS_TFES) {
            return false;
        }
        if (drive->port_hba->command_issue == 0) { // this was being optimized out
            break;
        }
    }

    return true;
A little bit of work with a debugger led me to conclude that the check against "command_issue" was being optimized out, which left the code stuck inside the loop forever.

Keeping the optimizer's typical behavior in mind, it's easy to see why. Since "command_issue" was set to 1 only a few lines prior and nothing inside the loop writes to it, the compiler assumed that it could never become 0 (a valid assumption for ordinary memory, but wrong for MMIO that the device modifies) and optimized out the conditional statement for performance purposes.

My solution was to simply mark the "port_hba" struct as volatile. It worked, and I went on with my day.
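The fix described above might look roughly like the following. This is only a sketch: `struct hba_port` is a hypothetical two-field stand-in (a real AHCI port register block has more registers at fixed offsets), and `wait_for_command` is an illustrative name rather than anything from the actual driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define HBA_STATUS_TFES (1u << 30)

/* Hypothetical, trimmed-down register layout: only the two fields
 * used in the polling loop are shown. */
struct hba_port {
    uint32_t interrupt_status;
    uint32_t command_issue;
};

/* Polls until the device clears command_issue.
 * Returns false if the task file error bit is set, true otherwise.
 * The volatile qualifier on the pointee forces a fresh load of each
 * register on every iteration. */
static bool wait_for_command(volatile struct hba_port *port)
{
    for (;;) {
        if (port->interrupt_status & HBA_STATUS_TFES)
            return false;
        if (port->command_issue == 0)
            return true;
    }
}
```

Because the pointee is volatile-qualified, the compiler has to perform every read of `port->interrupt_status` and `port->command_issue` as written, so it can no longer assume the value stored before the loop is still there.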

Is this the best solution here? What is the best practice for dealing with optimization shenanigans surrounding conditional checks against MMIO values that are modified by the corresponding device? I would love to be enlightened here...

Re: Navigating the compiler optimizer when checking MMIO

Posted: Tue Jul 02, 2024 11:29 am
by nullplan
This is the way the C standard wants you to solve this problem. MMIO is actually the original intended use for volatile.

If you are dissatisfied with volatile, the other tried and true method is to use assembler to force the compiler to emit the reads and writes. E.g.

Code:

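#include <stdint.h>

/* Read one byte from MMIO; the volatile asm statement forces a real load.
 * "=q" requests a byte-addressable register, which matters on 32-bit x86. */
static inline uint8_t read_u8(const volatile uint8_t *p) {
  uint8_t r;
  __asm__ volatile ("movb %1, %0" : "=q"(r) : "m"(*p));
  return r;
}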
Repeat that for all possible sizes, add the corresponding "write" versions, and use those functions exclusively to access MMIO. This is what Linux does, though it also has a checker to test that it never accesses MMIO any other way.
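As a rough sketch of what "repeat that for all sizes" could look like, here are 8-bit and 32-bit read/write pairs. The assumptions are an x86 target and a GCC- or Clang-compatible compiler (AT&T syntax); the names `write_u8`, `read_u32`, and `write_u32` are made up here to match `read_u8` and are not taken from Linux or any other codebase.

```c
#include <stdint.h>

/* 8-bit accessors. "q" picks a byte-addressable register. */
static inline uint8_t read_u8(const volatile uint8_t *p) {
  uint8_t r;
  __asm__ volatile ("movb %1, %0" : "=q"(r) : "m"(*p));
  return r;
}

static inline void write_u8(volatile uint8_t *p, uint8_t v) {
  __asm__ volatile ("movb %1, %0" : "=m"(*p) : "q"(v));
}

/* 32-bit accessors, the width AHCI port registers use. */
static inline uint32_t read_u32(const volatile uint32_t *p) {
  uint32_t r;
  __asm__ volatile ("movl %1, %0" : "=r"(r) : "m"(*p));
  return r;
}

static inline void write_u32(volatile uint32_t *p, uint32_t v) {
  __asm__ volatile ("movl %1, %0" : "=m"(*p) : "r"(v));
}
```

With helpers like these, the polling loop from the first post would call `read_u32(&port->command_issue)` instead of dereferencing the field directly. The 16-bit and 64-bit variants would follow the same pattern with `movw` and `movq`.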

Re: Navigating the compiler optimizer when checking MMIO

Posted: Tue Jul 02, 2024 11:44 am
by austanss
nullplan wrote: Tue Jul 02, 2024 11:29 am This is the way the C standard wants you to solve this problem. MMIO is actually the original intended use for volatile.

If you are dissatisfied with volatile, the other tried and true method is to use assembler to force the compiler to emit the reads and writes.
Well, I'm glad I'm solving this problem the standard-intended way. I'm definitely not dissatisfied with volatile; I was just curious what the alternatives were, and whether they were any more effective or official. Thanks for the (useful, though I won't use it myself) alternative!