nullplan wrote:The only pointers you can definitely form are into arrays, and one element behind an array. So technically even casting addresses (numbers) into pointers is UB.
Neither of which is strictly correct. You can form pointers to anything. What you can't do is pointer arithmetic between pointers that don't point into the same array (or one element past its end). You can convert pointers to intptr_t, or copy them into an array of unsigned char (their object representation), and back, no problem.
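To make that concrete, here is a minimal sketch of the three cases. The function names are made up for illustration, and it assumes the implementation provides intptr_t, which the standard leaves optional.

```c
#include <stdint.h>
#include <string.h>

/* Round trip through intptr_t (assumed to be available here).
   Returns 1 if the recovered pointer compares equal to the original. */
int survives_intptr( void )
{
    int x = 42;
    void * p = &x;
    intptr_t n = (intptr_t) p;   /* pointer -> integer */
    void * q = (void *) n;       /* integer -> pointer */
    return q == p && *(int *) q == 42;
}

/* Round trip through the object representation of the pointer itself.
   Returns 1 if the recovered pointer compares equal to the original. */
int survives_bytes( void )
{
    int x = 42;
    int * p = &x;
    int * q;
    unsigned char bytes[ sizeof p ];
    memcpy( bytes, &p, sizeof p );   /* pointer -> bytes */
    memcpy( &q, bytes, sizeof q );   /* bytes -> pointer */
    return q == p && *q == 42;
}

/* Pointer arithmetic inside one array, including one past the end. */
int elements_between( void )
{
    int a[ 4 ] = { 1, 2, 3, 4 };
    int * end = a + 4;   /* may be formed and compared, not dereferenced */
    return (int) ( end - a );
}
```

Forming something like a + 5, or subtracting pointers into two unrelated arrays, is where the undefined behavior starts.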
eekee wrote:So there are 3 languages called C: K&R C, ANSI C, and pointers-aren't-real C. What a mess!
No. Actually, there is only one language, C. Which has been formed into the international standard ISO/IEC 9899, of which several iterations have been released. Any superseded version, as well as any pre-standard "version" (K&R), is considered outdated. Which doesn't mean that there isn't lots of code for that version still around, or that it would be "wrong" to code against that version. It's a choice you make, based on available tools and resources.
eekee wrote:I've figured out why my friend thought arrays should work like that. They were local variables, which are (in Real C) allocated on the stack. There's no possibility of holes in the stack.
Note that C has no concept of "stack". It simply isn't part of the language. If you think of local variables as "being on the stack", you've fallen into the trap of assumptions. Know what the language actually guarantees.
Do not rely on a given implementation. Things do change from time to time: consciously and well-advertised when the standard itself makes breaking changes (which it seldom does), but possibly in surprising and non-obvious ways if you've built on assumptions about one implementation.
eekee wrote:Uhm... something I picked up over the years, and heard over and over again, is that you should NEVER access the same memory location from different threads without some means of synchronization. It's all right to do it from coroutines, but NEVER from parallel-executing threads. I've heard over and over again, "this can't be emphasized strongly enough." I can understand why: it's basically undefineable behavior.
Amen.
eekee wrote:So it's all right for optimization to arbitrarily change the meaning of the code unless explicitly but cryptically told otherwise?
No.
The meaning of the code is quite clear: There's a variable being tested with no way it could change between tests given the constraints laid out by the language definition, so the compiler is explicitly allowed to shift the logic around. The rule here is called "as if": As long as the resulting code behaves "as if" the optimization had not been applied, the optimization is fine.
Two things to note here:
- Prior to C11, the standard had no concept of multiple control flows. Any such mechanics and definitions were extensions, usually through third-party libraries. As a corollary, the optimizer didn't have to "worry" about what some other thread might do. (That's what your "never access memory..." advice is about -- unless you actually use the mechanics offered by such extensions, such as <pthread.h>, the compiler may play dumb and just ignore that such a thing as multithreading exists.) With C11, the memory model was finally well-defined... but that just means the standard now tells you explicitly that you have to be specific if you intend to access memory from multiple control flows.
- The "volatile" keyword is anything but cryptic; it has the explicit and sole purpose of telling the compiler that this is a value that may change outside the control flow of the program, so accesses to it may not be optimized away, e.g. hoisted out of a loop's condition.
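A signal handler is one place where a value changes outside the normal control flow. The following is only a sketch with made-up names; raise() stands in for a signal that would normally arrive asynchronously, so that the example terminates predictably.

```c
#include <signal.h>

/* Written by the handler, read by the loop: hence volatile. */
static volatile sig_atomic_t flag_set = 0;

static void on_sigint( int signum )
{
    (void) signum;
    flag_set = 1;
}

/* Spins until the handler has set the flag and returns the number of
   iterations, or -1 if the handler could not be installed. */
int spin_until_signalled( void )
{
    int spins = 0;
    flag_set = 0;

    if ( signal( SIGINT, on_sigint ) == SIG_ERR )
    {
        return -1;
    }

    /* flag_set is re-read on every iteration because it is volatile. */
    while ( ! flag_set )
    {
        if ( ++spins == 3 )
        {
            raise( SIGINT );   /* stand-in for an asynchronous signal */
        }
    }

    signal( SIGINT, SIG_DFL );
    return spins;
}
```

With a plain int flag and a signal that really arrives asynchronously, the compiler would be free to test the flag once and never look at it again.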
eekee wrote:And note the volatile declaration could be far from the loop.
As would be the type declaration, of which "volatile" is a part. Your meaning?
eekee wrote:There's nothing at all ambiguous about a conditional expression as an argument to a loop; no language should arbitrarily change its meaning for any reason!
Well, without "volatile" in there, the point is that optimizing the conditional out of the loop (for a massive performance increase!) is not changing the logic of the loop.
Consider:
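Code:
#include <stdio.h>

int get_max( void )
{
    // TODO: Change this later.
    return 100;
}

// Placeholder for the real work, so that the example is complete.
void do_something( void )
{
    puts( "working" );
}

int main( void )
{
    for ( int i = 0; i < get_max(); ++i )
    {
        do_something();
    }
}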
Let's say we've written it this way because we know get_max() will be replaced with something complicated later on, and wanted to avoid having to make changes everywhere this value is used when the time comes.
A compiler could have that code call get_max() a hundred times. Or it could be smart enough to realize that the return value of get_max() is a constant expression (for now), and thus optimize it away.
This is what makes C (much) better at optimizing than Assembler.
And it's the same with your loop over a condition that the compiler knows is not volatile.
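Written out by hand, the transformation might look like the sketch below. The names are made up, and this is not a claim about what any particular compiler emits, only about what the "as if" rule permits: both functions have the same observable behavior.

```c
/* Hypothetical stand-in for a bound that is constant "for now". */
static int loop_limit( void )
{
    return 100;
}

/* As written: the bound is queried before every iteration. */
int count_naive( void )
{
    int iterations = 0;
    for ( int i = 0; i < loop_limit(); ++i )
    {
        ++iterations;
    }
    return iterations;
}

/* As an optimizer may treat it: the bound is evaluated once. */
int count_hoisted( void )
{
    int iterations = 0;
    const int limit = loop_limit();
    for ( int i = 0; i < limit; ++i )
    {
        ++iterations;
    }
    return iterations;
}
```

No conforming program can tell the two apart, which is exactly why the compiler may pick the cheaper one.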
eekee wrote:lol I'm so glad I chose not to use C!
Tell me which language you use, and I (or someone else here) will tell you some intricacies of that language that are just as not-quite-that-obvious to someone who doesn't know it well enough yet.
eekee wrote:I can't get on with the older C code of Plan 9, latest C is like this, *rage!* lol, so just... burn it all, I'll use something else.
Let's get back to an earlier quote of yours:
eekee wrote:And note the volatile declaration could be far from the loop.
"Older" C required you to make all declarations at the top of a code block.
Languages evolve. If you try to keep things like they were yesteryear, you'll be left by the roadside. C89, C99, C11, C17. C++98, C++11, C++14, C++17. Java 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11. Python 1, 2, 3. And so on. Adapt.
eekee wrote:With all this undefined behaviour, I wonder if pointers-aren't-real C may have introduced more pitfalls than just using pointers the way they were meant to be used.
Don't be fooled. The places where behavior of C code is "undefined" are usually well described, and well understood. The whole purpose here was efficiency, allowing for minimal implementation and runtime, and still relieving the programmer from having to think of every micro-optimization himself. It also allows the standard library to cut quite some corners, again adding to the efficiency of the implementation.
Compare, for example, memcpy() vs. memmove(). C is very much a language that requires you to bring your own safety net if you want one, because the language doesn't provide it. If you need a safety net on general principles, use a different language.
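For the memcpy() vs. memmove() comparison, a small sketch (function names made up): memmove() is specified to handle overlapping source and destination, while memcpy() may skip that check because overlap is undefined behavior for it.

```c
#include <string.h>

/* Shifts "abcdef" one position to the right inside the same buffer.
   Source and destination overlap, so memmove() is required.
   Returns 1 if the buffer then holds "aabcdef". */
int overlapping_shift_ok( void )
{
    char buf[ 8 ] = "abcdef";
    memmove( buf + 1, buf, 6 );
    return strcmp( buf, "aabcdef" ) == 0;
}

/* Copies between two distinct objects: no overlap, so memcpy() is fine.
   Returns 1 if the copy matches the source. */
int disjoint_copy_ok( void )
{
    const char src[ 8 ] = "abcdef";
    char dst[ 8 ];
    memcpy( dst, src, sizeof src );
    return strcmp( dst, "abcdef" ) == 0;
}
```

Calling memcpy( buf + 1, buf, 6 ) in the first function might happen to work on a given implementation, but nothing in the language guarantees it.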
----
I looked up the relevant part of the language specs for your reading pleasure.
ISO/IEC 9899:2011, chapter 5.1.2.3 Program execution, 9-10 wrote:
9 An implementation might define a one-to-one correspondence between abstract and actual semantics: at every sequence point, the values of the actual objects would agree with those specified by the abstract semantics. The keyword volatile would then be redundant.
10 Alternatively, an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries. In such an implementation, at the time of each function entry and function return where the calling function and the called function are in different translation units, the values of all externally linked objects and of all objects accessible via pointers therein would agree with the abstract semantics. Furthermore, at the time of each such function entry the values of the parameters of the called function and of all objects accessible via pointers therein would agree with the abstract semantics. [...]
At which point I have to soften my earlier criticism -- the side-effect the usleep() call had on the program was actually well-defined. A volatile declaration would still have been better (and more stable).