Korona wrote:That's just not true. Compilers that exploited UB existed in the 90s (and to some extent in the 80s, see the paper on optimizations in Andrew Tanenbaum's compiler kit that I linked earlier).
I still have to look at that.
Korona wrote:GCC certainly took advantage of UB in the mid 90s, before C99.
In my previous answer to Solar, I prepared a fair challenge. Please, take a look at it. If you show me a '90s compiler that treats as UB one of the first two points (unaligned access, type punning with casts), OR all the other three below, I will agree with you.
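To make the first two points concrete, here is a minimal sketch of what they look like in code. The helper names (load_u32_unaligned, float_bits) are made up for illustration, and the second one assumes a 32-bit IEEE 754 float. The cast-based forms are only shown in comments; the memcpy forms are the ones whose behavior is defined no matter how the compiler treats the casts.

```c
#include <stdint.h>
#include <string.h>

/* Point 1, unaligned access. The cast-based form would be:
 *     uint32_t v = *(const uint32_t *)p;
 * Copying the bytes with memcpy is defined for any alignment of p. */
static uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Point 2, type punning with casts. The cast-based form would be:
 *     uint32_t u = *(uint32_t *)&f;
 * Copying the object representation avoids reading a float through
 * an lvalue of type uint32_t. Assumes sizeof(float) == 4. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

The challenge is about the commented-out forms: a compiler that "treats them as UB" is one that is willing to generate code under the assumption that they never happen.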
Korona wrote:vvaltchev wrote:But, as nullplan pointed out, generally compilers didn't assume that UB will never happen, as they do today
That's something we can agree on, but you got the reasoning wrong: the reason for this change is that compilers were generally less sophisticated, and not that the understanding of what "undefined behavior" means somehow changed fundamentally.
That looks correct, but I have good reasons to doubt it. Is "x/0" so hard to catch? I don't think so. Why did GCC start to disallow this trivial kind of UB only with version 7.x? Also, there's no need for the understanding of UB to have fundamentally changed: a small change to the wording in the standard is enough to have a significant effect. In other words, a slight shift in interpretation makes a big difference.
Korona wrote:The optimization techniques that compilers use to exploit UB existed in the 80s (which is generally just dead code elimination and strength reduction, really)
I believe that dead code elimination and strength reduction do not require UB to work. "if (3 > 4) { ... }" can easily be proven to be dead code. Also, "if (INT_MAX + 1 > 0) { ... }" can be optimized without messing with UB. The compiler, unless it's a cross-compiler, just needs to represent integer literals with the same type used by the C code. So, INT_MAX+1 will naturally end up being < 0 in the compiler (on architectures that behave this way), and the code will be optimized away. For INT_MAX+1 > 0 to be always true (the modern assumption), the compiler has to do more work, using bignums, OR to realize that the expression INT_MAX+1 overflows the integer and explicitly claim that it's UB. For cross-compilers, things are different. I agree that even 30 years ago, if we tried to cross-compile code like that, we might have had some problems: for example, the dead code optimization might not be performed, or the compiler might assume that INT_MAX+1 is still > 0 because that's true on the host architecture, while on the target architecture it's not. In theory, compilers should be written as very portable code that keeps the target architecture in mind when evaluating such expressions, so it's possible that even in that case everything worked as expected. I have no personal experience with cross-compiling software for very different architectures more than 20 years ago. So, no strong opinion here.
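The non-constant version of the same question can be sketched as follows. The function names are hypothetical and this is only an illustration of the two readings, not a claim about what any particular compiler version does. The first function depends on INT_MAX + 1 wrapping around; the second expresses the same intent without ever overflowing.

```c
#include <limits.h>

/* Relies on x + 1 wrapping to a negative value when x == INT_MAX.
 * Signed overflow is undefined in ISO C, so a compiler that assumes
 * overflow never happens may fold this to "return 1". A compiler that
 * simply emits the machine addition returns 0 for INT_MAX on
 * two's-complement hardware. */
static int naive_can_increment(int x)
{
    return x + 1 > x;
}

/* Same intent, but no overflow can occur, so the result does not
 * depend on how the compiler treats UB. */
static int can_increment(int x)
{
    return x < INT_MAX;
}
```

For every input other than INT_MAX the two functions agree; the whole discussion is about that one remaining input.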
Korona wrote: but *no single compiler implemented all techniques at the same time* so it was less visible.
I'd be curious if you guys can find a '90s compiler that treats as UB even one of the UB types I've mentioned.
Korona wrote:For example, the dragon book mentions lots of these techniques, and it's from 1987. The techniques themselves were invented in the 70s.
Unfortunately, I haven't read the "dragon book" (yet), but I believe that most optimization techniques have nothing to do with UB. The assumption of UB allows compilers to use the same techniques in more cases. Maybe that's not true for aliasing assumptions, I'm not sure.
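As far as I understand it, the aliasing case looks like the sketch below. The function name is made up, and whether a given compiler actually performs this optimization depends on its version and flags.

```c
/* With type-based alias analysis, the compiler may assume that an
 * int* and a float* never refer to the same object. Under that
 * assumption the store to *f cannot change *i, so the function can
 * return the constant 1 without reloading *i from memory. Without
 * the assumption, *i has to be read again after the store. */
static int store_both(int *i, float *f)
{
    *i = 1;
    *f = 2.0f;
    return *i;
}
```

When the two pointers refer to distinct objects, both readings give the same result. They only differ if the caller passes the same address through a cast, which is exactly the type punning case.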
Again, I'm not a compiler expert, but it's obvious to me that the modern assumption of UB opened the door to many more optimization opportunities. And it's definitely possible that some UB types weren't treated as UB because of the lack of optimizations that took advantage of them, but in some other cases it looks to me that compilers didn't want to treat some expressions as UB. In some cases, what was lacking was the intention, not the technology. Other cases like "i++ + ++i" have been UB from C's day 1, because it didn't feel right to force an order of evaluation in such cases. But what happened in those UB cases? Well, the whole program wasn't considered invalid because of that. The expression just didn't have a defined value.
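For comparison, here is a small sketch of how "i++ + ++i" can be given a defined value by choosing an order explicitly. The function name and the left-to-right order are my own choice for the example; the language does not pick one.

```c
/* "i++ + ++i" modifies i twice with nothing ordering the two
 * modifications, so the expression has no defined value. Splitting
 * it into separate statements fixes one particular order. */
static int sum_left_then_right(int *i)
{
    int left = (*i)++;   /* left is the old value; *i is now old + 1 */
    int right = ++(*i);  /* *i is now old + 2, and so is right */
    return left + right;
}
```

Starting from i = 1, this order gives 1 + 3 = 4 and leaves i at 3. Evaluating the right operand first would give a different sum, which is why no single value was ever promised for the original expression.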