[SOLVED] Navigating the compiler optimizer when checking MMIO
Posted: Tue Jul 02, 2024 11:06 am
For context, I've been working on an AHCI driver for my operating system. I've been poring over reference implementations and the official specification, and I finally wrote a functional read command. The problem is that it originally did not work when compile-time optimization of any degree was enabled.
A little work with a debugger showed that the check against "command_issue" in my polling loop was being optimized out, so the loop never exited.
This was the area that was being mis-optimized, and it led to a perpetual hang (presumably inside the loop).
Code:
    #define HBA_STATUS_TFES (1 << 30)

    drive->port_hba->command_issue = 1;
    while (true) {
        if (drive->port_hba->interrupt_status & HBA_STATUS_TFES) {
            return false;
        }
        if (drive->port_hba->command_issue == 0) { // this was being optimized out
            break;
        }
    }
    return true;
With the optimizer's typical behavior in mind, it's easy to see why. Since "command_issue" was set to 1 only a few lines earlier and nothing in the code visibly changes it afterwards, the compiler assumed (wrongly, for a device register) that it could never be 0, and optimized out the conditional statement.
My solution was to simply mark the "port_hba" struct as volatile. It worked, and I went on with my day.
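For anyone following along, here is a minimal sketch of what the volatile approach might look like in isolation. The "port_regs" struct, the "wait_result" enum, the "wait_for_completion" function, and the "max_spins" bound are all hypothetical names made up for this illustration. The struct holds only the two registers used above, not the real AHCI port register layout, and the spin limit is an addition so the example cannot hang (the original loop has no timeout).

```c
#include <stdint.h>

#define HBA_STATUS_TFES (1u << 30)

/* Hypothetical, simplified register block: only the two registers polled
   above. The real AHCI port registers have more fields at fixed offsets. */
struct port_regs {
    uint32_t interrupt_status;
    uint32_t command_issue;
};

/* Hypothetical result codes for this sketch. */
enum wait_result { WAIT_DONE, WAIT_ERROR, WAIT_TIMEOUT };

/* The pointer-to-volatile parameter forces the compiler to perform a real
   load for every register read inside the loop, so it cannot fold the
   checks away based on an earlier store. */
static enum wait_result wait_for_completion(volatile struct port_regs *port,
                                            uint32_t max_spins)
{
    for (uint32_t i = 0; i < max_spins; i++) {
        if (port->interrupt_status & HBA_STATUS_TFES) {
            return WAIT_ERROR;   /* task file error reported by the device */
        }
        if (port->command_issue == 0) {
            return WAIT_DONE;    /* device cleared the issued slot */
        }
    }
    return WAIT_TIMEOUT;         /* assumption: give up after max_spins polls */
}
```

Note that volatile only constrains the compiler: it guarantees each access is actually emitted, but says nothing about CPU-level caching or ordering, which depend on how the MMIO region is mapped.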
Is this the best solution here? What is the best practice for dealing with optimization shenanigans around conditional checks against MMIO values that are modified by the corresponding device? I would love to be enlightened here...