C/C++: Why we go for const-correctness (Rant)

Posted: Tue Nov 28, 2017 3:47 am
by Solar
Far too often I encounter source, or people writing source, steeped in a mindset of "works for me, what are you complaining about".

Because I have a beautiful example on my desk right now, allow me to rant and vent a bit from the position of a maintenance coder.

You know what Const Correctness means? It means that everything that could be "const" qualified, should be "const" qualified.

This is about more than mere pedantry.

Let's say I have a C++ library before me. Unfortunately C++ is rather peculiar when it comes to cross-compiler compatibility, in a way that C isn't. For a library that is about performing a service (as opposed to a C++ framework, like e.g. Boost), it is beneficial to provide a C interface. This way, it does not matter for your client(s) which compiler you use, or which compiler they use.

It's a shame this hasn't been done with this particular library from the get-go, but that is not the point.

In general, wrapping a C++ ABI in C is not much of a problem. Where a vector or string object is expected, you write a wrapper taking an array (with an additional "count" parameter where needed). Where some object is returned to be used with subsequent functions, you wrap it in a void-pointer typedef. It's all quite straightforward.
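A minimal sketch of such a wrapper, assuming a hypothetical library function `sum_values` and a hypothetical C entry point `lib_sum_values` (neither name comes from the library discussed here). The vector parameter becomes a pointer plus a count:

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical C++ library function (name invented for illustration).
// The const reference documents that the vector is only read.
long sum_values(std::vector<int> const & values)
{
    return std::accumulate(values.begin(), values.end(), 0L);
}

// C-callable wrapper: array plus "count" instead of a vector.
extern "C" long lib_sum_values(int const * values, std::size_t count)
{
    if (values == nullptr) return 0;   // treat a null array as empty
    std::vector<int> const tmp(values, values + count);
    return sum_values(tmp);
}
```

Because the C++ side takes a const reference, the wrapper can copy in and forget about the copy afterwards. A production wrapper would also need to keep C++ exceptions from escaping across the C boundary; that is left out here.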

The problems start when the C++ interface takes pointers or references to non-const data objects. In my case, a vector< int >. As you are writing the wrapper C functions, you have to ask yourself, "do I need to un-wrap this parameter again to make potential changes visible to the caller, or has someone just forgotten a 'const'?"
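To make the question concrete, here is a sketch of what the wrapper has to look like when the non-const reference is meant seriously. `double_values` and `lib_double_values` are hypothetical names, and the sketch assumes the library does not change the vector's size:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical library function that really does modify its argument.
void double_values(std::vector<int> & values)
{
    for (int & v : values) v *= 2;
}

// The parameter is non-const, so the wrapper has to assume changes and
// copy the (possibly modified) contents back into the caller's array.
extern "C" void lib_double_values(int * values, std::size_t count)
{
    if (values == nullptr) return;
    std::vector<int> tmp(values, values + count);
    double_values(tmp);
    // Never write past the caller's buffer, even if tmp was resized.
    std::copy_n(tmp.begin(), std::min<std::size_t>(count, tmp.size()), values);
}
```

If the 'const' had merely been forgotten, the copy-back and the non-const C parameter are pure overhead, and the C API ends up promising less than the code actually guarantees.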

So you add the 'const' to the C++ function and run the compiler to see if there are any complaints.

And you get errors from one level down the call hierarchies about "binding 'const std::vector<int>' to reference of type 'std::vector<int>&' discards qualifiers."

So you add 'const' to those functions' parameters.

And get the same errors from the next level down the call hierarchies.

And pretty soon you have touched half the source files of the project, just to ensure that this one parameter from your external API is, indeed, used read-only. One hell of a changeset for a mere wrapper function. Half an hour of poring over units and modules that are not your responsibility, upsetting several co-workers that have to merge your changes, and generally disturbed productivity.

And then you repeat the whole process for the next parameter / function...

If the original author had gone for const-correctness, writing the wrapper should have been a no-brainer, five-minute exercise.
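A sketch of what a const-correct call hierarchy looks like, using three hypothetical functions (the names are invented, not taken from the project in question). Each level can take a const reference only because the level below it does:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Innermost level: only reads, and says so in its signature.
std::size_t count_positive(std::vector<int> const & values)
{
    std::size_t n = 0;
    for (int v : values)
        if (v > 0) ++n;
    return n;
}

// Middle level: can be const only because count_positive() is.
bool has_positive(std::vector<int> const & values)
{
    return count_positive(values) > 0;
}

// API level: callers and wrapper authors can rely on read-only use.
bool is_usable(std::vector<int> const & values)
{
    return !values.empty() && has_positive(values);
}
```

Had `count_positive` been declared with a plain `std::vector<int> &`, `has_positive` would fail to compile with a "discards qualifiers" style error, and the retrofit described above would begin.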

As with so many Clean Coding exercises, what is dead-easy to do while you're writing the code, becomes a nuisance to retrofit. The same goes for e.g. unit testing, automated source formatting, running code checkers like Valgrind, or even merely using a one-touch build system. All things that bring immediate benefits for comparatively little extra effort... if you model your project that way from the get-go.

Don't postpone Clean Coding to "later". Later never comes. Write clean code from the get-go.

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Tue Nov 28, 2017 8:12 am
by Schol-R-LEA
Hmmn. Food for thought, and not just in terms of C++. It has some interesting and somewhat worrying implications about languages which don't have a concept of constants, or where coercion from const to non-const is automatic and unchecked.

It also gives me some ideas about how I could apply const correctness to my own language plans - and ways to make it easier to check for - especially since my plans already focus on the process of going from untyped or weakly typed code in development to strongly typed code in production. I'll explain more in my own language thread, as it is entirely conceptual - and probably a bit harebrained - at this point and something better kept as an experiment for now.

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Tue Nov 28, 2017 8:20 am
by Solar
It's a bit nasty in C++ as well, since you can always pass a vector< int > to a function expecting a vector< int > const... but a vector< int const > is a completely different type. :twisted: Put that in your refactoring pipe and smoke it...

But at least adding the const to the vector allows you to "feel" your way towards point-of-actual-use, and you can check there if the int's in there get written or not.
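The difference between the two kinds of const can be checked at compile time. A small sketch, with `first_or_zero` as a hypothetical example function; the element-const type is only named here, never instantiated:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Const on the container: still the same template instantiation, so a
// plain vector<int> binds to this parameter without any conversion.
int first_or_zero(std::vector<int> const & values)
{
    return values.empty() ? 0 : values.front();
}

// Const on the element type: a different, unrelated instantiation.
static_assert(!std::is_same<std::vector<int>, std::vector<int const>>::value,
              "vector<int> and vector<int const> are distinct types");

// Stripping the top-level const gives plain vector<int> back.
static_assert(std::is_same<std::remove_const<std::vector<int> const>::type,
                           std::vector<int>>::value,
              "vector<int> const is a const-qualified vector<int>");
```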

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Tue Nov 28, 2017 12:59 pm
by Wajideus
It wouldn't hurt if the compiler made function arguments const by default with the exception of dereferenced pointers. I can't think of a situation where I've ever needed to reassign one of the arguments to a different value.

Also, I think a lot of the const-madness could be avoided if the language allowed captures to be used to purify the scope of any block and if there was better support for memory ownership semantics and compile-time evaluation.

The scenario you unfolded makes it pretty apparent just how much of an anti-pattern that const-correctness really is.

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Tue Nov 28, 2017 1:15 pm
by Schol-R-LEA
As promised, I added an explanation of my own thoughts on a number of related topics, starting here for anyone who missed it. It is... very long. As in, five longish posts long. It is also, as I mentioned earlier, probably more than a little crack-brained. Nonetheless, I would love to hear what you and anyone else have to say about it.

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Tue Nov 28, 2017 4:54 pm
by zaval
Wajideus wrote:It wouldn't hurt if the compiler made function arguments const by default with the exception of dereferenced pointers. I can't think of a situation where I've ever needed to reassign one of the arguments to a different value.

Also, I think a lot of the const-madness could be avoided if the language allowed captures to be used to purify the scope of any block and if there was better support for memory ownership semantics and compile-time evaluation.

The scenario you unfolded makes it pretty apparent just how much of an anti-pattern that const-correctness really is.
For example, a speed-critical function takes an argument in a register (MIPS, ARM, x64) and, instead of stupidly copying it elsewhere before processing, does its job on the argument directly in that register: iterating on it, adding some value, etc., everything the algorithm it employs needs. This holds for stack-passed arguments too; why unnecessarily copy what is supposed to be your variable? But for registers it looks especially wasteful. We either do an unneeded copy and lose speed (register->stack, stack->stack), or waste registers (register->register). Or both. For nothing useful.

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Tue Nov 28, 2017 5:34 pm
by Schol-R-LEA
zaval wrote:For example, a speed-critical function takes an argument in a register (MIPS, ARM, x64) and, instead of stupidly copying it elsewhere before processing, does its job on the argument directly in that register: iterating on it, adding some value, etc., everything the algorithm it employs needs. This holds for stack-passed arguments too; why unnecessarily copy what is supposed to be your variable? But for registers it looks especially wasteful. We either do an unneeded copy and lose speed (register->stack, stack->stack), or waste registers (register->register). Or both. For nothing useful.
I am not sure if I am following you, here. Are you suggesting that the callee should have access to the caller's local variables[1], rather than either a copy of them (wherever that copy may be - in specific registers[2] or the stack, but somewhere) or a copy of their address (again, the details of how aren't relevant)?

This sounds like a fine idea - as a link-time[3] optimization for a function known to be called in exactly one place with exactly one set of arguments. Most object file formats don't have support for something like that, however.[4]

I suggest that you think long and hard about what it would mean if that function were called in more than one calling function[5].
  1. Assuming no nested scope. If the callee is inside the lexical scope of the caller, that is a completely different story, but most languages descended from C don't allow nested functions because the language designers felt that the funarg problem was too hairy to bother addressing (though you can use lambda functions in C++, Java 8, and C# to get the same effect, and GCC does allow nested functions in C as a non-portable extension).
  2. Mind you, for register passing, reg-reg copying in the calling function can sometimes be avoided with extremely good register painting, but that's a matter for the compiler developers - register painting is still something of a black art - and even the best algorithms end up spilling some registers some of the time. Oh, and on a CPU with automatic register renaming - such as a modern x86-64, and some top-end AArch64 ARMs (I think) - the CPU can short-circuit most reg-reg copying by simply re-assigning that register's identity to the one that is holding the needed value once the copy operation gets initiated.
  3. Figuring out what information the linker would need to get from the object files in order to do this, why no existing object format (AFAIK) includes those details in any of their standard fields, and why it can't be a compile-time optimization in the general case, are left as exercises for the reader.
  4. My executable format(s) are a lot more likely to have it than most, but I will cross that bridge when I come to it.
  5. Or in the same function twice with different sets of arguments, or in the case of variables on the stack, on separate calls from the same caller if the calling function were itself called more than once (e.g., it had returned and then was called a second time, or if it was a nested recursive call).

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Wed Nov 29, 2017 3:24 am
by Wajideus
zaval wrote:
Wajideus wrote:It wouldn't hurt if the compiler made function arguments const by default with the exception of dereferenced pointers. I can't think of a situation where I've ever needed to reassign one of the arguments to a different value.

Also, I think a lot of the const-madness could be avoided if the language allowed captures to be used to purify the scope of any block and if there was better support for memory ownership semantics and compile-time evaluation.

The scenario you unfolded makes it pretty apparent just how much of an anti-pattern that const-correctness really is.
For example, a speed-critical function takes an argument in a register (MIPS, ARM, x64) and, instead of stupidly copying it elsewhere before processing, does its job on the argument directly in that register: iterating on it, adding some value, etc., everything the algorithm it employs needs. This holds for stack-passed arguments too; why unnecessarily copy what is supposed to be your variable? But for registers it looks especially wasteful. We either do an unneeded copy and lose speed (register->stack, stack->stack), or waste registers (register->register). Or both. For nothing useful.
I'm not sure how what you're saying pertains to what I said...

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Wed Nov 29, 2017 10:00 am
by Schol-R-LEA
Wajideus wrote:It wouldn't hurt if the compiler made function arguments const by default with the exception of dereferenced pointers. I can't think of a situation where I've ever needed to reassign one of the arguments to a different value.

Also, I think a lot of the const-madness could be avoided if the language allowed captures to be used to purify the scope of any block and if there was better support for memory ownership semantics and compile-time evaluation.

The scenario you unfolded makes it pretty apparent just how much of an anti-pattern that const-correctness really is.
Sorry for getting back on this a bit late, I got distracted by Zaval's post.

I am not sure if I understand what you mean here, and I suspect you may not quite get what Solar was saying.

To clarify: 'const-correctness' is an idiom and coding pattern used primarily in C++, regarding the passing of arguments that are not and/or should not be modified by the called function, or any functions it in turn calls.

It is the principle that if the value of a formal parameter to a function (the variable declared in the function's signature) can be a constant, then it should be declared so explicitly in the function's signature and prototype, so that the compiler can enforce constancy, and check that the arguments (the actual values being passed) are in fact constants.

The second part is the key factor regarding Solar's rant, because it relates to how C++ handles the types of parameters and their corresponding arguments. The general principle of const-correctness (that constant parameters should be declared as such) interacts with the way C++ handles type enforcement, and the different kinds of constantness it allows for.

Now, just for some background, in C (at the time C++ was designed) all parameters are passed by value - a copy of the argument is passed to the function to fill in the parameters, and initially, only scalar arguments could be passed. Furthermore, in very early versions of C, the parameter types were optional - the function signature consisted of just the names of the parameters, and the types came after the signature like so:

Code: Select all

foo(bar, baz)
int bar;
char baz;
{
   /* code goes here */
}
Worse still, most C compilers - including the Bell Labs/Unix one - didn't actually enforce argument types or even parameter arity; since the arguments themselves were just a bag-o'-bytes, an argument of a different type would (in effect) be silently coerced to the type of the parameter. Function prototypes[1], when they existed (they weren't required, being there mainly for the sake of the linker), were often just

Code: Select all

foo();
Furthermore, only scalar values could be passed on the stack (which was the assumed method of argument passing). It became an idiom in C for structures, or out parameters (ones that were to be altered by the function), to be passed as pointers to the argument instead.

In 1979, when Bjarne Stroustrup started designing what would become C++ - which up until 1983 or so was just a preprocessor for C called 'CFront' - he sensibly decided that this was a terrible idea[2] and added a lot more type checking in CFront. He also added the 'reference' type, to allow for out parameters to be passed without them being explicitly pointers, and added support for passing structures and classes directly.
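For readers less familiar with the two styles, a brief sketch of the same out parameter written as a C-style pointer and as a C++ reference, plus the const reference that const-correctness is mostly about. All three function names are hypothetical:

```cpp
#include <cassert>

// C idiom: the out parameter is an explicit pointer; the caller writes &x.
void increment_ptr(int * value)
{
    if (value != nullptr) ++*value;
}

// C++ reference: same effect, but the call site looks like pass-by-value.
void increment_ref(int & value)
{
    ++value;
}

// A const reference promises that the callee only reads through it.
int plus_one(int const & value)
{
    return value + 1;
}
```

Since a call like `increment_ref(x)` gives no visual hint that `x` may change, the presence or absence of `const` in the signature is the only thing that tells the reader which of the two cases applies.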

However, this was still evolving well into the late 1980s, and when ISO decided that it too needed to be standardized, some time around 1993, the language was already a bit of a mess. It would only get more complicated as the standardization proceeded, as the need to clarify the interactions trumped the need for simplicity in the language.

The result of this is, as Solar mentioned, that "const int *foo;"
and "int const *foo;" are not the same declaration, the latter saying that "foo" is constant, the other saying that "foo" is a variable, but can only be assigned constant addresses.

This is where const-correctness in C++ comes in, as it is a rule of thumb about how to be consistent about what the parameters themselves should be declared as, and how to use the type-checking to make sure it does what you want.

So, I am not sure if you are saying that this idiom is an anti-pattern in C++ programming, or if the language design 'features' that led to it are an anti-pattern for language design.
  1. Note also that the default type for returns was int, and the practice was that for functions that had no return value, or returned an int, no return type was specified. The confusion this caused led to the first standards committee deciding that they needed a void type, something which I am pretty sure was done in C++ from the start but hadn't existed in C before - work on the first version of the C standard was started in 1985, around the same time that "C with Classes" was renamed "C++" and started drawing interest outside of Bell Labs, though the standard wasn't finalized until 1988.
  2. It seems obvious in hindsight, but you have to remember that in 1968 - when strong typing was a new and somewhat contentious idea - Kernighan, Thompson, and Ritchie were writing a language for their own personal use in system development, one they never expected anyone who wasn't a Bell Labs engineer to ever see; by the time they decided to publicize it, things were already set. They resisted even simple fixes like changing "=+" to "+=" for years, and only made that particular change because it made parsing less of a mess.

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Wed Nov 29, 2017 10:24 am
by Wajideus
I'm well aware of what const-correctness is and the history of C/C++. What I'm saying amounts to "sorry Mario, the princess is in another dungeon".

One of the reasons why we use const correctness is so that functions can be used on const and non-const objects. In this sense, "const" doesn't mean "immutable"; it means "known at compile-time". What we really wanted to do was allow a function to possibly be evaluated at compile-time, given the fact that its arguments are also known at compile-time. This is evident by the eventual addition of 'constexpr' to the language.

The second reason why we use const correctness is to prevent bugs caused by the mutation of state. Sometimes, you want a function to only be able to read the object. Sometimes you want to let it both read and write the object. Of course, the second case has additional complications. If you pass a writable object to a function, it might promote the scope of the object and lead to a dangling pointer. As such, this is a memory ownership problem.

If you address memory ownership and compile-time evaluation, you can eliminate most, if not all, use cases for const.
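A small sketch that keeps the two notions mentioned above apart as C++ currently defines them: `const` as read-only access to an object whose value may only be known at run time, and `constexpr` as the feature for compile-time evaluation. `length_of` and `square` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A const reference lets one function serve const and non-const objects
// alike; the object's value does not have to be known at compile time.
std::size_t length_of(std::string const & s)
{
    return s.size();
}

// constexpr is the separate feature for compile-time evaluation.
constexpr int square(int x)
{
    return x * x;
}

static_assert(square(4) == 16, "evaluated at compile time");
```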


EDIT:
Also,
The result of this is, as Solar mentioned, that "const int *foo;"
and "int const *foo;" are not the same declaration, the latter saying that "foo" is constant, the other saying that "foo" is a variable, but can only be assigned constant addresses.
This is wrong. "const int *foo" and "int const *foo" mean the exact same thing. To make the address constant, it has to be written "int *const foo;"
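The correction can be verified with the compiler itself. A minimal sketch (`pointer_const_demo` is just an illustrative name):

```cpp
#include <cassert>
#include <type_traits>

// Both spellings declare "pointer to const int".
static_assert(std::is_same<const int *, int const *>::value,
              "leading and trailing const on the pointee are equivalent");

// "int * const" is a const pointer to (mutable) int: a different type.
static_assert(!std::is_same<int const *, int * const>::value,
              "const pointee and const pointer are different types");

int pointer_const_demo()
{
    int a = 1;
    int b = 2;

    int const * p = &a;  // pointee is read-only through p
    p = &b;              // OK: the pointer itself may be reseated
    // *p = 5;           // would not compile: pointee is const

    int * const q = &a;  // pointer is fixed to &a
    *q = 10;             // OK: the pointee may be modified
    // q = &b;           // would not compile: pointer is const

    return *p + *q;      // 2 + 10
}
```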

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Wed Nov 29, 2017 10:44 am
by Schol-R-LEA
Wajideus wrote:I'm well aware of what const-correctness is and the history of C/C++. What I'm saying amounts to "sorry Mario, the princess is in another dungeon".

.
Ah, OK, then. You probably know more than I do given what you said after this.

Unfortunately, I had to rush out before I could add my real point, which was that Solar wasn't complaining about the language, or the idiom, but about his cow-orkers (hey, orking cows gets some odd folks). I am guessing now that you did understand that, but when I was writing this I wasn't sure.
Wajideus wrote:EDIT:
Also,
The result of this is, as Solar mentioned, that "const int *foo;"
and "int const *foo;" are not the same declaration, the latter saying that "foo" is constant, the other saying that "foo" is a variable, but can only be assigned constant addresses.
This is wrong. "const int *foo" and "int const *foo" mean the exact same thing. To make the address constant, it has to be written "int *const foo;"
Erk. It has been too long since I used either C or C++ on a regular basis, I have to watch that kind of thing.

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Wed Nov 29, 2017 1:47 pm
by zaval
to Schol-R-LEA. I meant only this:
if this:

Code: Select all

int Func(char X);
"by default" would mean this:

Code: Select all

int Func(const char X);
then this

Code: Select all

int Func(char X){
    X+= 0x20;
    ...
}
will be illegal, and one would need something like this:

Code: Select all

int Func(char X){
    char Y = X;
    Y += 0x20;
}
For example on mips, you would get the X inside of $a0 for your usage. But with that (stupid) requirement to treat it as a constant, you would need, for example, to copy its value somewhere before modifying $a0:

Code: Select all

Func:
    ...
    sw $a0, Somewhere($sp)
    addiu $a0, $a0, 0x20
    ...
or faster, yet as dumb as previous

Code: Select all

Func:
    ...
    or $s0, $zero, $a0
    addiu $a0, $a0, 0x20
    ...
Of course, you could say that the compiler will optimize that out, but this only proves the idea is not needed and only interferes with making code better.
In any case, this is for nothing useful. Honestly, I think at least for C using const for arguments is a plain dumb nonsense.
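For reference, a minimal sketch of how today's C++ treats `const` on a by-value parameter, using a hypothetical function `shift_case` (the name is not from this thread). Top-level `const` on such a parameter is not part of the function's type, so it constrains only the function body and never the caller:

```cpp
#include <cassert>

// Declaration without const...
int shift_case(char x);

// ...and a definition with a top-level const parameter. Both name the
// same function; the const only restricts what the body may do with x.
int shift_case(char const x)
{
    // x += 0x20;   // would not compile: x is const inside this body
    char const y = static_cast<char>(x + 0x20);  // a new name for the result
    return y;
}
```

Whether `x` and `y` end up in one register or two is decided by the compiler's register allocation rather than by the number of names in the source, so the sketch makes no claim about the generated code.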
Wajideus wrote: I'm not sure how what you're saying pertains to what I said...
Really? Ok, what did you mean by "making arguments const by default"?

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Wed Nov 29, 2017 4:02 pm
by Solar
Schol-R-LEA wrote:
Wajideus wrote:EDIT:
Also,
The result of this is, as Solar mentioned, that "const int *foo;"
and "int const *foo;" are not the same declaration, the latter saying that "foo" is constant, the other saying that "foo" is a variable, but can only be assigned constant addresses.
This is wrong. "const int *foo" and "int const *foo" mean the exact same thing. To make the address constant, it has to be written "int *const foo;"
Erk. It has been too long since I used either C or C++ on a regular basis, I have to watch that kind of thing.
With one exception, in C++ "const" always refers to the previous item. Types, member functions etc., you always write the "const" after whatever it is meant to refer to.

The one exception is the legacy from C, that "TYPE const" could also be written as "const TYPE".

Unfortunately, that one exception is what people encounter first, and keep being confused about the many "exceptions" to the "rule" for a looong time. ;-)

That's why I recommend not using that one exception, to use "trailing" const exclusively. You get used to it pretty quickly that way.
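A short sketch of the trailing-const style, with a hypothetical `Counter` class: every `const` applies to whatever stands directly to its left, including the member function case, where it applies to the object the function is called on.

```cpp
#include <cassert>

class Counter
{
public:
    explicit Counter(int start) : value_(start) {}

    // const after the parameter list: applies to *this.
    int get() const { return value_; }

    void increment() { ++value_; }

private:
    int value_;
};

int trailing_const_demo()
{
    int a = 3;
    int const * const p = &a;  // read right to left: const pointer to const int
    Counter const c(4);        // const applies to Counter, the item on its left
    // c.increment();          // would not compile: c is const
    return *p + c.get();       // 3 + 4
}
```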

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Wed Nov 29, 2017 10:29 pm
by Wajideus
zaval wrote:to Schol-R-LEA. I meant only this:
if this:

Code: Select all

int Func(char X);
"by default" would mean this:

Code: Select all

int Func(const char X);
then this

Code: Select all

int Func(char X){
    X+= 0x20;
    ...
}
will be illegal, and one would need something like this:

Code: Select all

int Func(char X){
    char Y = X;
    Y += 0x20;
}
For example on mips, you would get the X inside of $a0 for your usage. But with that (stupid) requirement to treat it as a constant, you would need, for example, to copy its value somewhere before modifying $a0:
This isn't how register allocation works. There's not a 1:1 correspondence between variables and registers. They're allocated and freed based on liveness. That aside, your example isn't practical in any way. There's no reason anyone would ever want or need to do that.
zaval wrote:
Wajideus wrote: I'm not sure how what you're saying pertains to what I said...
Really? Ok, what did you mean by "making arguments const by default"?
You just started out your response with "For example ...", so I wasn't sure if you were arguing against my statement or backing it up.

Re: C/C++: Why we go for const-correctness (Rant)

Posted: Sat Dec 02, 2017 12:23 am
by Schol-R-LEA
@zaval: I see what you meant now, at least somewhat, and the point is... well, it's a point, but it fails to take two things into account.

First, the suggestion that constness should be the default doesn't imply that variable parameters need to be forbidden outright - it just would mean that instead of something like 'const' as the modifier, you would have a modifier showing that it was mutable. The question was whether this made a better default or not.

Second, while what you describe might be a headache to code, and would have caused inefficiencies in older CPU models, most modern high-performance CPUs have some amount of out-of-order processing at the hardware level - in most cases, the CPU will schedule the variable transfer well in advance, perhaps even before the call to the function itself if the call is not guarded by a branch (and maybe even if it is, if there is multipath precomputation), meaning that the transfer (or register renaming, depending) would be pipelined well in advance, so that it takes place while other instructions are doing longer tasks - potentially avoiding or shortening a pipeline stall. Even if the hardware can't reorder it, the compiler often can schedule it in a way so that the cycles spent on it are shadowed by other operations. There might still be a cycle spent on it, but often, there won't even be that in terms of the number of cycles spent on the whole function.