C and undefined behavior

alexfru · Post by **alexfru** » Sun Jun 13, 2021 12:37 pm

vvaltchev wrote:It appears that you don't agree with your own comments. Maybe you copy-pasted the wrong text?

I quoted section 3.3 of ANSI C in response to your statement

vvaltchev wrote:I'm not talking about aliasing limitations which really were introduced in C99 with "restrict" etc.

which I interpret (specifically, the parts "aliasing limitations" and "really were introduced") as aliasing rules/restrictions/limitations appearing in C99, which is wrong because they already were in C89.

vvaltchev · Post by **vvaltchev** » Sun Jun 13, 2021 1:33 pm

alexfru wrote:which I interpret (specifically, the parts "aliasing limitations" and "really were introduced") as aliasing rules/restrictions/limitations appearing in C99, which is wrong because they already were in C89.

First of all, restrict was introduced in C99:
https://en.wikipedia.org/wiki/Restrict
There's no point in further discussing this fact.

But the thing is: whether or not C89 mentioned aliasing, I do not understand why you said:

alexfru wrote:No restrict yet, no explicit use of the word alias either.

Just a few hours ago. It doesn't make sense. Please, double check the conversation.
Either you believe that C89 already mentioned those words and you quote the text from C89 that does or you agree with me because it doesn't. You quoted a text that agrees with what I said and you remarked that with:

alexfru wrote:No restrict yet, no explicit use of the word alias either.

Which looks like you agree with me, while saying I'm wrong. Man, please double check the conversation. There must be some simple misunderstanding going on here.

Korona · Post by **Korona** » Sun Jun 13, 2021 2:13 pm

alexfru is right here: the part of the standard that he quoted describes the aliasing rules (without using the word "alias" explicitly, but by stating that accessing an object through an unrelated type is illegal.).
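To make the rule concrete, here is a minimal sketch of what "accessing an object through an unrelated type" means in practice. The function names are hypothetical and chosen only for illustration. The first function shows the assumption a compiler is allowed to make, the second shows the character-type exception that the rule has always carried.

```
#include <stddef.h>

/* Hypothetical helper. int and float are unrelated types, so a conforming
   compiler may assume that i and f point to different objects and return
   the constant 1 without reloading *i. */
int store_both(int *i, float *f)
{
    *i = 1;
    *f = 2.0f;
    return *i;
}

/* Character types are exempt from the rule: any object may be inspected
   byte by byte through an unsigned char pointer. */
int has_nonzero_byte(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    for (size_t k = 0; k < size; k++) {
        if (bytes[k] != 0)
            return 1;
    }
    return 0;
}
```

Calling store_both() with two pointers to the same storage (obtained through a cast) is exactly the case the quoted text leaves undefined.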

vvaltchev · Post by **vvaltchev** » Sun Jun 13, 2021 2:31 pm

Korona wrote:
vvaltchev wrote:Given the evidence in the legacy C source code, it looks like most of the people didn't write UB-safe code.
Do you have any evidence for the quote above?

Yes, I do.

1. According to this 2013 LWN article: https://lwn.net/Articles/575563/

LWN wrote: Andrew McGlashan raised the issue on the debian-security mailing list, expressing some surprise that the topic hadn't already come up. The paper specifically cites tests done on the Debian "Wheezy" (7.0) package repository, which found that 40% of 8500+ C/C++ packages have "optimization-unstable code" (or just "unstable code"). That does not mean that all of those are vulnerabilities, necessarily, but they are uses of undefined behavior—bugs, for the most part.

2. Type punning with casts (UB) has traditionally been used all the time in UNIX software and documentation. Consider the Berkeley sockets interface, which is mentioned in the Wikipedia article about type punning: https://en.wikipedia.org/wiki/Type_punning

Wikipedia wrote: One classic example of type punning is found in the Berkeley sockets interface. The function to bind an opened but uninitialized socket to an IP address is declared as follows:
```
int bind(int sockfd, struct sockaddr *my_addr, socklen_t addrlen);
```
The bind function is usually called as follows:
```
struct sockaddr_in sa = {0};
int sockfd = ...;
sa.sin_family = AF_INET;
sa.sin_port = htons(port);
bind(sockfd, (struct sockaddr *)&sa, sizeof sa);
```

Now, type punning with casts is super dangerous UB. With unions it is kind of acceptable in C (unspecified behavior), while in C++ it is still UB. The new way to do it is to use memcpy() (or __builtin_memcpy()) and rely completely on the compiler to optimize it away. Again, we have to tell the compiler to COPY the data and rely on the fact that it won't do an actual copy. As you can see, the "do what I say" paradigm does not exist anymore in C, while it existed in the past for a very long time.
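For illustration, here is a minimal sketch of the alternatives to the cast form, using hypothetical function names. It assumes a 32-bit float, which the static assertion checks. The cast form appears only in a comment because it is the one the aliasing rules forbid.

```
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* The cast form, *(uint32_t *)&value, is the one that violates the
   aliasing rules, so it is not shown as code. */

/* Union form: accepted in C (GCC documents it explicitly), not in C++. */
uint32_t float_bits_union(float value)
{
    union { float f; uint32_t u; } pun;
    pun.f = value;
    return pun.u;
}

/* memcpy form: valid in both C and C++. Optimizing compilers typically
   turn the call into a plain register move. */
uint32_t float_bits_memcpy(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

On a platform with IEEE 754 floats, both functions return 0x3f800000 for 1.0f.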

3. As I've shown here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93031#c5 the Linux kernel TODAY still uses unaligned access conditionally, depending on the target. Unaligned access is UB.

4. WebKit in 2008 used unaligned access:
https://bugs.webkit.org/show_bug.cgi?id=20990
Obviously, that started to be noticed after -Wcast-align was introduced and enabled on some builds.
Also obviously, that caused portability issues on SPARC: https://bugs.webkit.org/show_bug.cgi?id=19775

5. As mentioned here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93031#c3
A high-performance compression library called LZO used unaligned access conditionally, depending on the architecture. That was a "good practice at the time". Maybe __builtin_memcpy() didn't exist back then, not sure. I'm only sure that I've seen #ifdefs like that over and over again.
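As a point of comparison with those #ifdef blocks, here is a minimal sketch of the two portable ways to read a 32-bit value from an arbitrary address. The function names are hypothetical and this is not LZO's actual code.

```
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value in host byte order from any
   address, aligned or not. The copy is what makes the access valid. */
uint32_t load_u32_unaligned(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Hypothetical helper: read a little-endian 32-bit value one byte at a
   time, independent of both alignment and host byte order. */
uint32_t load_u32_le(const unsigned char *src)
{
    return (uint32_t)src[0]
         | (uint32_t)src[1] << 8
         | (uint32_t)src[2] << 16
         | (uint32_t)src[3] << 24;
}
```

On targets that support unaligned loads, current compilers usually compile load_u32_unaligned() to a single load instruction, which is what the architecture-specific #ifdef paths were trying to achieve by hand.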

6. Apparently the Linux kernel's linked list implementation, which has existed since the early '90s, contains UB:
https://stackoverflow.com/questions/648 ... n-cause-ub

7. A GNOME extension is full of UB: https://gitlab.gnome.org/GNOME/babl/-/issues/1

8. In the '90s, John Carmack used a very clever trick in Quake III Arena to compute the inverse square root of a floating-point number.
It used type punning with casts, because that was the typical way to do such things at the time.
The Wikipedia page https://en.wikipedia.org/wiki/Fast_inverse_square_root also has comments about the UB in the code and how to make it acceptable to modern compilers.
If UB had been as much of a problem at the time as it is today, a talented developer like Carmack would never have written such code, seriously risking not only the portability of the game but also that it might break after a rebuild. Today, most of us know that such code contains UB and really avoid writing it. Isn't that a significant change?
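For reference, a sketch of how the same trick can be written today without the pointer casts. The function name is hypothetical, and the code assumes a 32-bit IEEE 754 float. The magic constant and the single Newton-Raphson step are the ones described on the Wikipedia page.

```
#include <stdint.h>
#include <string.h>

/* Fast inverse square root with the bit reinterpretation done through
   memcpy instead of pointer casts. Assumes a 32-bit IEEE 754 float and a
   positive, finite input. */
float rsqrt_fast(float number)
{
    const float half = number * 0.5f;
    uint32_t bits;
    float y = number;

    memcpy(&bits, &y, sizeof bits);        /* float -> integer bits */
    bits = 0x5f3759dfu - (bits >> 1);      /* magic-constant initial guess */
    memcpy(&y, &bits, sizeof y);           /* integer bits -> float */

    y = y * (1.5f - half * y * y);         /* one Newton-Raphson step */
    return y;
}
```

The result is an approximation: rsqrt_fast(4.0f) comes out close to, but not exactly, 0.5.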

------------------------------------------------------------------------------------------------------------------------------------------------------
In conclusion, most of the UB in legacy software is about unaligned access, which is obviously non-portable. In second place comes pointer aliasing. In third place, I believe, there is UB caused by assumptions about integer overflow, along with other types of UB.

I haven't found examples of UB like "i++ + ++i" in legacy code that looked correct at the time, because it was clear even back then that such code was nonsense. In other words, part of what is considered UB (and so completely unacceptable) today was considered simply non-portable at the time, of course while still using the wording "undefined behavior". I think that the Wikipedia quote I mentioned before summed up the whole theory pretty well.

-------------------
EDIT[1]: added a new point about type-punning.
EDIT[2]: added the example of the fast inverse square root.

vvaltchev · Post by **vvaltchev** » Mon Jun 14, 2021 10:33 am

Korona wrote:alexfru is right here: the part of the standard that he quoted describes the aliasing rules (without using the word "alias" explicitly, but by stating that accessing an object through an unrelated type is illegal.).

Well, it wasn't clear at all that he was stating that.
Anyway, I'm no language lawyer, but since C99 needed changes in that area to allow compilers to introduce strict aliasing, it follows that C89 wasn't strict enough on that, or that it somehow missed some details.

Korona · Post by **Korona** » Mon Jun 14, 2021 1:51 pm

What you listed (Linux, GNOME etc.) is not really legacy software that became broken due to the introduction of UB.

Type punning with unions is defined behavior (as a GCC extension). The inverse square root example has a comment "// evil floating point bit level hacking" on the questionable line, so it seems that the author was fully aware that it was UB and fragile. And the Linux kernel does not really care about the C standard anyway and passes a ton of flags to work around that (flags that disable some optimizations, -fwrapv, -fno-strict-aliasing). The Linux kernel is written in "whatever language makes GCC emit working code" and not in ANSI C (or K&R C), and the core developers are open to admit that. The moment GCC breaks their unaligned access code, they will add a flag to disable that optimization.

That leaves unaligned access as the only somewhat convincing example. I think the reason that unaligned access is often found in the wild is that compilers traditionally cannot do a lot of optimizations based on aligned access, but this is changing as auto vectorizers become better and more specialized instructions become available. Unaligned access is also one of the types of UB that is trivial to detect by UBSAN (in contrast to data races, among others).

Solar · Post by **Solar** » Mon Jun 14, 2021 3:00 pm

vvaltchev wrote:First of all, restrict was introduced in C99:
https://en.wikipedia.org/wiki/Restrict
There's no point in further discussing this fact.

But it was discussed already much earlier (noalias).

For reference, here is the 1988 ANSI C draft proposal, which indeed was poorly conceived and didn't make it into C89. That is not because the idea was rejected, just that there were too many inconsistencies which had to be resolved first. In this 1994 draft proposal on "restrict" there is some reasoning involved. I quote, emphasis mine:

Draft Proposal wrote:[This convention is] also consistent with the notions of objects and argument association in Fortran 77. Thus, in environments that support mixed-language programming, restricted pointer parameters can be used in C prototypes for functions defined in Fortran, and also in definitions of C functions that are called from Fortran.

Fortran 77.

(That draft also has an "Appendix: Comparison with noalias", which goes into some details of how "restrict" differs from "noalias". From this it should be obvious that "restrict" is conceived as a "better noalias", not a completely novel idea.)
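For readers who have not used the qualifier, here is a minimal sketch of what "restrict" expresses. The function is a hypothetical example and is not taken from either draft.

```
#include <stddef.h>

/* Hypothetical example. The restrict qualifiers are a promise by the caller
   that dst and src do not overlap during the call. With that promise the
   compiler may load several src elements ahead of the stores to dst, which
   is the freedom a Fortran compiler has for array arguments. */
void add_arrays(size_t n, float *restrict dst, const float *restrict src)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```

Calling add_arrays() with overlapping arrays breaks the promise and the behavior is undefined. Without the qualifiers the same call would be well defined, just harder to optimize.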

vvaltchev · Post by **vvaltchev** » Mon Jun 14, 2021 4:02 pm

Korona wrote:What you listed (Linux, GNOME etc.) is not really legacy software that became broken due to the introduction of UB.

So, you're ignoring the 2013 LWN article:

LWN wrote:40% of 8500+ C/C++ packages have "optimization-unstable code"

And the fact that type punning with casts was THE DOCUMENTED way to use Berkeley sockets?
Also, while the Linux kernel never had a problem with relying on GCC extensions, it certainly did not start out in the '90s with -fwrapv and -fno-strict-aliasing. The Linux kernel broke multiple times because of UB, and that caused security bugs as well: https://lwn.net/Articles/342330/. That's why it started to build with -fwrapv, -fno-delete-null-pointer-checks, -fno-strict-aliasing and other options. In my previous reply I was supposed to show that A TON of code broke because of UB, and I believe I've done that.
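The kind of pattern that -fno-delete-null-pointer-checks guards against can be sketched as follows. The struct and function names are hypothetical and this is not the kernel's code.

```
#include <stddef.h>

struct device { int flags; };   /* hypothetical type */

/* Fragile: dev is dereferenced before the check. The dereference lets the
   compiler infer that dev is not NULL, so it may delete the check. */
int get_flags_fragile(struct device *dev)
{
    int flags = dev->flags;
    if (dev == NULL)
        return -1;
    return flags;
}

/* Robust: check first, dereference afterwards. */
int get_flags_checked(struct device *dev)
{
    if (dev == NULL)
        return -1;
    return dev->flags;
}
```

For non-NULL arguments both functions behave identically. They differ only when dev is NULL, which is where the fragile version has undefined behavior.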

There's also a research paper about that: https://srg.doc.ic.ac.uk/440h/papers/stack.pdf
It shows the presence of UB in plenty of projects like: Binutils, e2fsprogs, FFmpeg+Libav, FreeType, GRUB, HiStar, Kerberos, libX11, libarchive, libgcrypt, Linux kernel, Mozilla, OpenAFS, plan9port, Postgres, Python, QEMU, Ruby+Rubinius, Sane, uClibc, VLC, Xen, Xpdf.

Is still all of that not enough?

Korona wrote:The moment GCC breaks their unaligned access code, they will add a flag to disable that optimization.

Such an option does not exist at the moment as I've shown in the WONTFIX GCC bug. So, it will be interesting to see what will happen in that case. I've tried opening a conversation about that on LKML, but they didn't seem to care enough.

Korona wrote:That leaves unaligned access as the only somewhat convincing example.

I have no problem with you believing that narrative. I honestly believe that I've shown plenty of evidence that a ton of well-established code, examples and documentation broke because of UB over the years. There's no point in insisting further and repeating the same things. Let's agree to disagree.

Korona wrote:I think the reason that unaligned access is often found in the wild is that compilers traditionally cannot do a lot of optimizations based on aligned access

That's true, but it's not the only reason. Unaligned access is done using very natural expressions in C. What is unnatural is using memcpy() for that. Again, no matter what the standard technically says it's OK to do. Let's forget for a moment what is supposed to be "right" and what is supposed to be "wrong" and just observe what developers did: until compilers allowed something, they took advantage of it.

It's the same for type punning with casts: it's pretty natural to do that in C and people have been doing it for decades, before and after the ANSI C came out. People stopped doing that in the last decade, after the strict aliasing rule has been enforced. Type punning with unions existed and was considered the "super-safe" and "super-portable" way of doing type punning, while using casts was the mainstream practice. Today, type punning with unions is barely acceptable in C, and it's possible that it will be made completely UB at some point in the future. memcpy() (or __builtin_memcpy) will remain the only safe option, as has already happened in C++. We will have to tell the compiler to COPY some data into a local variable, modify it, and then tell the compiler to COPY it back, relying on the fact that it WON'T do any of that in the emitted code. That means we gave up the ability to tell the machine directly what to do (the old "do what I say" paradigm), but we've gained better portability and more powerful optimizations. I hope you can agree at least on that. It's not all necessarily bad, I NEVER said that. Simply, it's not what it used to be. For most code that's fine, but for some low-level code that needs to do type punning and things like that, it means writing more verbose and less expressive code.

Korona wrote:Unaligned access is also one of the types of UB that is trivial to detect by UBSAN (in contrast to data races, among others).

At runtime, many types of UB are trivial to detect. The problem is exploring all the code paths at runtime.
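A small sketch of the code-path problem, with hypothetical function names. A build with -fsanitize=undefined reports the bad shift in the first function only if some test actually passes an out-of-range value.

```
#include <limits.h>

/* Hypothetical function: undefined behavior only when n is greater than or
   equal to the width of unsigned int. A sanitizer stays silent unless a
   test reaches that input. */
unsigned bit_mask_unchecked(unsigned n)
{
    return 1u << n;
}

/* Defined for every input: out-of-range shift counts yield 0. */
unsigned bit_mask_checked(unsigned n)
{
    return n < sizeof(unsigned) * CHAR_BIT ? 1u << n : 0u;
}
```

A test suite that only ever calls bit_mask_unchecked(3) passes cleanly under the sanitizer, even though the function is still wrong for other inputs.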

vvaltchev · Post by **vvaltchev** » Mon Jun 14, 2021 4:10 pm

Solar wrote:
vvaltchev wrote:First of all, restrict was introduced in C99:
https://en.wikipedia.org/wiki/Restrict
There's no point in further discussing this fact.
But it was discussed already much earlier (noalias).

For reference, here is the 1988 ANSI C draft proposal, which indeed was poorly conceived and didn't make it into C89. That is not because the idea was rejected, just that there were too many inconsistencies which had to be resolved first. In this 1994 draft proposal on "restrict" there is some reasoning involved. I quote, emphasis mine:

Draft Proposal wrote:[This convention is] also consistent with the notions of objects and argument association in Fortran 77. Thus, in environments that support mixed-language programming, restricted pointer parameters can be used in C prototypes for functions defined in Fortran, and also in definitions of C functions that are called from Fortran.
Fortran 77.

(That draft also has an "Appendix: Comparison with noalias", which goes into some details of how "restrict" differs from "noalias". From this it should be obvious that "restrict" is conceived as a "better noalias", not a completely novel idea.)

Yep, I totally agree that "noalias" was considered back in 1988 and that "restrict" is conceived as a "better noalias". The whole essay from Dennis Ritchie that I've mentioned in the very beginning was about how much he didn't want "noalias" in the standard because it was, in his eyes, an "abomination".

Now, can you at least consider that such notions about aliasing from Fortran didn't work well with what C really was at the time? There was significant opposition to that, and that's why we didn't have "strict aliasing" in C89. With time, things changed, in particular with C99. It makes sense, doesn't it?

Korona · Post by **Korona** » Mon Jun 14, 2021 5:21 pm

Well, the question that we should discuss in this thread is not: are programs frequently broken by UB? We can agree on that and you provided ample evidence. However what you really need to provide evidence for is the claim that this situation fundamentally changed with more recent releases of the C standard. I claim that people also tried to avoid UB in the 80s and 90s and that the interpretation "once your program contains UB, anything can happen" is not new. Your list of programs containing UB is not evidence for that claim, in fact, most of the programs that you listed did not even exist when C99 was developed.

Solar · Post by **Solar** » Mon Jun 14, 2021 6:03 pm

It wasn't the idea per se that Ritchie opposed, it was the way it was implemented. And that is only tangential to the topic.

I second Korona. UB existed, UB broke programs (with or without "help" by the compiler), and such programs were considered broken before C89, not to speak of C99. Cases in point being made via Fortran 77, Jargon File, and other quite dated references.

Unfortunately, it seems you have bitten down hard on this one and refuse to budge even one little bit. Perhaps because you want to keep your pet peeve, perhaps because you are truly convinced you're a man on a mission here. But it makes it a bit tedious arguing the point with you.

Please state, if you can, what kind of "proof" would make you consent that this "exploitation of UB" is not something introduced by C99, C89, or even plain K&R C? Note that your claim that this is actually the case is tenuous at best, and not really accepted by all here, so it shouldn't take a sworn statement by Mr. Ritchie himself or something like that...

alexfru · Post by **alexfru** » Mon Jun 14, 2021 9:47 pm

vvaltchev wrote:Unaligned access is done using very natural expressions in C. What is unnatural is using memcpy() for that.

You are entitled to your opinions. And I'm partially sympathetic because it's really a painful area of the language. But regardless of the opinions and feelings, the language standard says what should work correctly and what isn't guaranteed to. And that's the common denominator to live by. Unless your compiler can offer you a helping hand in defining the undefined or ignoring certain defects of your code.

vvaltchev wrote:Again, no matter what the standard technically says it's OK to do.

Like I just said, we can be emotional about it, but the standard and its implementations don't exist in the emotional space.

vvaltchev wrote:Let's forget for a moment what is supposed to be "right" and what is supposed to be "wrong" and just observe what developers did: until compilers allowed something, they took advantage of it.

My first response in the thread conveyed something very similar:

alexfru wrote: Early computers and compilers, though, couldn't do much in terms of code analysis and optimizations and therefore a lot of instances of UB seemed contained and rarely surprising. Technological progress contributed to said analysis and optimizations and made UB bleed, creep and spread further beyond simple operations causing it.
My position is that a lot of code is indeed nonconformant (almost all of it). And it has happened because people have been getting away with it.

You seem to be echoing it...

vvaltchev wrote:It's the same for type punning with casts: it's pretty natural to do that in C and people have been doing it for decades, before and after the ANSI C came out. People stopped doing that in the last decade, after the strict aliasing rule has been enforced.

I'm not sure which part of the aliasing rules is considered strict... But this is the point. Lots of bad code used to be disallowed de jure but allowed de facto (because the guard wasn't employed or born yet). And then gradually de facto started to line up with de jure.

alexfru · Post by **alexfru** » Mon Jun 14, 2021 10:05 pm

Btw, it's perfectly fine to blame the literature on C of the distant past. It exceedingly rarely introduced or properly explained the concept of undefined behavior. I guess, because it felt unnecessary ("unnatural", if you will) at the time. I did see a number of books ignoring the topic altogether and I didn't have access to the standard back then, so I couldn't know any better. And many others didn't either and very often their code "just worked". Where we are today is just as natural a consequence of those events of the past.

vvaltchev · Post by **vvaltchev** » Tue Jun 15, 2021 3:46 am

Korona wrote:However what you really need to provide evidence for is the claim that this situation fundamentally changed with more recent releases of the C standard.

I believe I've provided such evidence. Let's just agree to disagree. I see no point in continuing the argument with the same data.

Korona wrote:I claim that people also tried to avoid UB in the 80s and 90s and that the interpretation "once your program contains UB, anything can happen" is not new.

At most, I can agree that some kinds of UB existed back then in C outside of the "paper", like "i++ + ++i", buffer overflows etc. It's not that UB didn't exist at all; it just existed in some very obvious forms. When I said that in the past "UB wasn't a thing" I didn't mean that it didn't exist at all (I'm sorry if that wasn't clear): I meant that it wasn't a thing to really worry about. Obviously, if you read or wrote outside of your buffer, you couldn't expect nice things to happen. Same thing if you divided by zero, dereferenced a NULL pointer etc. But, as nullplan pointed out, compilers generally didn't assume that UB would never happen, as they do today. Also, type punning with casts was a non-portable, but mainstream practice. Again, see the examples of how to use Berkeley sockets.

Korona wrote:Your list of programs containing UB is not evidence for that claim, in fact, most of the programs that you listed did not even exist when C99 was developed.

Wait a second. First of all, some of those programs did exist in the '90s. Second, when C99 was released, nothing really changed the day after. The first time some "modern UB" problems started to emerge was around 2006-2007, several years after C99 was released. But that was just a "slow start". Most of the UB bugs came out in the 2010s, more than 10 years after C99. That's why UBSAN was introduced. Think about it: if UB was such a problem from the '80s, why did we have to wait 30-40 years for such a tool to be widely available? UB sanitizers were created at about the same time as (or a little later than) compilers started to aggressively take advantage of UB, in a way that had never been done before in C or C++. GCC was released with a UB sanitizer in version 4.9 (2014). Before that, I remember compiling the same software with clang just to be able to use its sanitizers, released a few years earlier.

It looks like Fortran compilers to some degree anticipated that. It makes sense to me. Actually, Wikipedia's comments on "restrict" make me believe that it was actually inspired by Fortran:

Wikipedia wrote:The use of this type qualifier allows C code to achieve the same performance as the same program written in Fortran

Anyway, when a standard gets released, it takes several years for compilers to really "catch up". When did we start to use C++98 as a standard? Around 2002-2003, right? Certainly not in '98 or '99. When did we start compiling with -std=c++11? A few years later. It took a while to get compilers and software projects ready for it. A software project created before 2010 had most of the same UB problems as ones created in the '90s, with a few exceptions like signed integer overflow.

vvaltchev · Post by **vvaltchev** » Tue Jun 15, 2021 4:48 am

Solar wrote:It wasn't the idea per se that Ritchie opposed, it was the way it was implemented. And that is only tangential to the topic.

I second Korona. UB existed, UB broke programs (with or without "help" by the compiler), and such programs were considered broken before C89, not to speak of C99. Cases in point being made via Fortran 77, Jargon File, and other quite dated references.

I partially addressed this in my previous reply to Korona's comments.

Solar wrote:Unfortunately, it seems you have bitten down hard on this one and refuse to budge even one little bit. Perhaps because you want to keep your pet peeve, perhaps because you are truly convinced you're a man on a mission here. But it makes it a bit tedious arguing the point with you.

Mostly, I felt attacked at the beginning, so I needed to defend my position and clarify in detail what exactly it is that I believe. Also, I admit that my initial post was too provocative and didn't fully reflect what I believe: it was a gross simplification, and a little rushed.

But now, at this point, I see no point in arguing further: we said everything we had to say and we converged a little on some points, while just "agreeing to disagree" on others. It's not a war I intend to fight, honestly. I have better things to do.

Solar wrote:Please state, if you can, what kind of "proof" would make you consent that this "exploitation of UB" is not something introduced by C99, C89, or even plain K&R C? Note that your claim that this is actually the case is tenuous at best, and not really accepted by all here, so it shouldn't take a sworn statement by Mr. Ritchie himself or something like that...

That's an absolutely fair question. I'd say: show me (compiler name, version and code snippet) at least one mainstream C compiler from the '90s that did weird stuff, taking advantage of UB, in any of the following cases:

1. Unaligned access in general, on an architecture that supports it. Infinite points here. Because not even modern compilers fully take advantage of plain and simple unaligned access UB, I'd also accept the weaker example of unaligned access with aliasing:
```
int h(int *p, int *q){
  *p = 1;
  *q = 1;
  return *p;
}
```
as proof, if the '90s compiler returned 1 unconditionally. Note: GCC started to return 1 unconditionally in this case in version 8.x.

2. Type punning with casts. A ton of points here. I don't have time to prepare a small code snippet to test here, but I can say that with modern compilers and strict aliasing enabled (the default behavior, allowed by ISO C), in some use cases the Linux kernel's linked list implementation generates garbage code because of UB. Maybe this can help: https://stackoverflow.com/questions/648 ... n-cause-ub. Clearly, with "-fno-strict-aliasing" everything works, but that's a non-standard option. Is there a '90s compiler that can take advantage of this UB?

3. Signed integer overflow. This will not be considered a "hard proof", but still a point in your favor. The compiler must have been released before 1999. Code snippet:
```
int foo(int x) {
    return x+1 > x;
}
```
If the function returns 1 unconditionally, you get a point.

4. Dereferencing a NULL pointer. Not a "hard proof", but still a point in your favor. The compiler must have been released before 1999. Code snippet (other snippets are accepted as well):
```
int foo(void) {
    char *p = 0;
    *p = 'a';
    return 1;
}
```
I played with Compiler Explorer: up to GCC 4.8.5, the compiler just does what we've told it to. Starting from version 4.9, GCC generates a UD2 instruction. If you can find a '90s compiler that takes advantage of this UB, it will be impressive.

5. Division by zero. Not a "hard proof", but still a point in your favor. The compiler must have been released before 1999. Code snippet (other snippets are accepted as well):
```
int foo(int x) {
    return x/0;
}
```
I played with Compiler Explorer: up to GCC 6.3, no UB whatsoever. Starting from GCC 7.1, this gets treated as UB.

I realize that getting the "infinite points" by showing UB because of unaligned access or type punning with a compiler from the '90s is almost impossible, so I'd say you have a STRONG POINT if you can at least prove that a '90s compiler treats as UB all 3 of the simpler cases above (integer overflow, NULL pointer dereferencing and division by zero).
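For contrast with the overflow and division snippets in the list above, here is a minimal sketch of how the same questions can be asked without invoking UB. The function names are hypothetical. Because every input has a defined result, the answer cannot depend on the compiler version.

```
#include <limits.h>
#include <stdbool.h>

/* Defined counterpart of "x + 1 > x": true exactly when x + 1 can be
   computed without overflow. */
bool increment_is_greater(int x)
{
    return x < INT_MAX;
}

/* Defined counterpart of "x / d": reports failure instead of dividing by
   zero or computing INT_MIN / -1, which also overflows. */
bool checked_div(int x, int d, int *result)
{
    if (d == 0 || (x == INT_MIN && d == -1))
        return false;
    *result = x / d;
    return true;
}
```

A compiler cannot fold increment_is_greater() to a constant 1, because the INT_MAX case is part of its defined behavior.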

OSDev.org
