Korona wrote:One reason is that the "trivial cases" only truly appear after inlining and other optimizations.
True.
Korona wrote:You do not gain anything by replacing x/0 by ud2. The source code will not contain a literal x/0 anyway. Only after inlining and constant folding (+ potentially other optimizations), such statements will appear. And it only makes sense to assume that they don't happen if you can also act on that, not only by inserting a fault, but also by propagating the information that the code path is dead out of a conditional or a loop etc.
I disagree. You do gain something valuable: you prevent such wrong code from spreading, which matters if you take the standard seriously and consider that one day the compiler might act differently.
Let me give an example from the CPU world. Copying the text from Wikipedia:
wikipedia wrote: the AMD specification requires that the most significant 16 bits of any virtual address, bits 48 through 63, must be copies of bit 47 (in a manner akin to sign extension). If this requirement is not met, the processor will raise an exception. [...]
This feature eases later scalability to true 64-bit addressing.
That's a perfect example. While the 16 most significant bits were completely useless at the time, AMD required them to be copies of bit 47 precisely to prevent software from stuffing arbitrary values into them and then dereferencing the pointer. Now that there are systems with 56 bits of indexable virtual address space, we can benefit from that early restriction. Why did apparently no compiler do the same?
Korona wrote: it didn't make sense to tell the compiler that x/0 == __builtin_unreachable.
It would have made sense to me, just to prevent wrong code from spreading.
It's the same story for unaligned access. Consider the example:
Code:
int x = 1;
char *p = (char *)&x + 1;
*(int *)p = 2;   /* line 3: misaligned int store -- UB */
Why did it take decades to get a warning like -Wcast-align=strict for line 3? It would have been extremely useful to have such a warning to prevent the unaligned-access UB, right? Nobody even considered it for something like 40 years, even though it was fairly simple to implement: it requires just comparing the alignment of the source and destination types of the cast. Char pointers cannot be (safely) cast to int pointers because "char" has alignment 1, while "int" has alignment 4 (or at least something > 1). Moreover, when -Wcast-align itself was finally introduced, it still didn't emit ANY warning on architectures that actually allow unaligned access. We had to wait a few more years to get -Wcast-align=strict. Isn't that weird?
Why is that? IMHO, nobody thought of doing anything with that form of UB; it was fine as it was. There was no need to force developers to use the uglier memcpy(), because the semantics of unaligned access were well defined for both developers and compiler engineers, no matter what the standard said about that form of UB. It feels like the standard wasn't taken so literally in the past, doesn't it? Later, new ideas for optimizations came along, so a way had to be found to allow a whole class of optimizations without "breaking the law" and... BOOM: hidden pearls of the standard that almost nobody cared about started to be used as pillars on which a whole generation of optimizations would rest. Does it make any sense?
P.S. Check this SO question from more than 6 years ago:
https://stackoverflow.com/questions/257 ... int-on-x86 Nobody even mentioned the ISO standard and UB until an update on 2020-05-10, which recommends using -Wcast-align=strict on x86 in order to avoid UB. It's not hard proof of anything, but the whole -Wcast-align story shows, like many of my other examples, developers' and compiler engineers' mindset and how it evolved.