GCC optimisation - combining strings

Posted: Mon Feb 27, 2012 7:56 pm
by gerryg400
I have been chasing a strange bug in my libc implementation of dirname().

The function that calls dirname() does this:

    orig_name = "/usr//";
    name = dirname(orig_name);
    kprintf("%s\n", name);
According to POSIX, dirname() is allowed to modify the string passed to it. I've never been happy about that, but that's another story. The other thing dirname() does is sometimes return a pointer to a static string literal "/".

GCC combines my 2 literal strings together so that the "/" occupies the last character and NUL terminator of the "/usr//". This is of course not good for dirname because it modifies the "/usr//", thus trashing the "/" it would sometimes return.

These 2 strings are defined in different files, so I'm not sure how GCC even manages to do this. Has anyone seen this before? Or have I misinterpreted the whole thing?

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 12:26 am
by CelestialMechanic
Shouldn't your first slash also be doubled?

orig_name = "//usr//";

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 1:26 am
by Owen
C string constants are... constant. If you're passing one to dirname, which modifies it, your code is obviously incorrect (because it's passing a constant string to a function which operates upon it in a non-constant manner).

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 1:32 am
by Solar
gerryg400 wrote:GCC combines my 2 literal strings together so that the "/" occupies the last character and NUL terminator of the "/usr//". This is of course not good for dirname because it modifies the "/usr//", thus trashing the "/" it would sometimes return.
After five minutes I gave up trying to make sense out of this one. There's only one literal string in your code ("/usr//"), aside from the printf() format string.

Oh, and what Owen said, of course. The whole thing shouldn't even compile.

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 2:20 am
by gerryg400
Owen wrote:C string constants are... constant.
No, they are not. Well they are not 'const'.

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 2:25 am
by gerryg400
Solar wrote:
gerryg400 wrote:GCC combines my 2 literal strings together so that the "/" occupies the last character and NUL terminator of the "/usr//". This is of course not good for dirname because it modifies the "/usr//", thus trashing the "/" it would sometimes return.
After five minutes I gave up trying to make sense out of this one. There's only one literal string in your code ("/usr//"), aside from the printf() format string.

Oh, and what Owen said, of course. The whole thing shouldn't even compile.
The other string (the "/") is buried in the code in my C library.

Look, it's true that the code I wrote is not great. I deliberately wrote it that way to test my C library. I didn't expect the code to work, but the mode of failure surprised me.

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 2:28 am
by Owen
...It's the kind of thing that GCC has a warning for ("-Wconstant-strings" I think?), assuming your dirname() is correctly prototyped

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 2:33 am
by gerryg400
Yes, it's "-Wwrite-strings".

By default, neither the MS nor the GCC compiler enables this warning.

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 2:42 am
by Solar
gerryg400 wrote:No, they are not. Well they are not 'const'.
They are "const" in the meaning that writing to a string literal is undefined behaviour (C99 chapter 6.4.5). Modern compilers actually make them const (if run with appropriate settings).
gerryg400 wrote:The other string (the "/") is buried in the code in my C library.
I still have no idea what exactly your problem is, because I can't make sense of the description.

Guess:

You have a string literal "/usr//" in one place, and "/" in another. The compiler did not create a separate object for "/", but instead pointed at the last character of "/usr//", which it is at liberty to do. (This kind of optimization is part of the reason why you aren't allowed to write to string literals.)

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 2:48 am
by gerryg400
The problem is that the 2 string literals occupy the same piece of memory. The literal "/usr//" is at address P and the "/" is at address P+5. If dirname modifies one, the other changes. Because they are in separate object files, this surprised me.
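
To make the overlap concrete without actually writing to a string literal (which would be undefined behaviour), here is a minimal sketch that fakes the merged storage with an ordinary writable array. The names merged_storage, path_literal, root_literal and strip_trailing_slashes are made up for illustration and are not taken from the C library discussed in this thread; the slash stripping is only an assumption about the kind of in-place edit a dirname() might perform.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the merged storage: a writable array is used
   so the effect can be observed without undefined behaviour. */
static char merged_storage[] = "/usr//";

/* The "/usr//" string starts at address P. */
char *path_literal(void)
{
    return merged_storage;
}

/* The "/" string is just the tail of the same bytes, at address P + 5. */
const char *root_literal(void)
{
    return merged_storage + 5;
}

/* Assumed dirname()-like step: remove trailing slashes in place,
   always keeping at least the first character. */
void strip_trailing_slashes(char *path)
{
    size_t len = strlen(path);

    while (len > 1 && path[len - 1] == '/') {
        path[--len] = '\0';
    }
}
```

Calling strip_trailing_slashes(path_literal()) overwrites the bytes at P+4 and P+5 with NUL, so root_literal() no longer yields "/" but an empty string. That is the same kind of damage described above, just reproduced on memory that is legal to modify.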

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 2:55 am
by Combuster
The linker can optimise too :wink:

Re: GCC optimisation - combining strings

Posted: Tue Feb 28, 2012 2:20 pm
by JamesM
gerryg400 wrote:The problem is that the 2 string literals occupy the same piece of memory. The literal "/usr//" is at address P and the "/" is at address P+5. If dirname modifies one, the other changes. Because they are in separate object files, this surprised me.
As Combuster said, the linker can optimise too. And as everyone has said, and speaking as a compiler developer:

Don't rely on undefined behaviour or velociraptors will devour what my colt .45 leaves untouched.

String literals are constant. End of story. They are placed in ".rodata" (do you see the hint in that section name?) with ELF properties such that they should be readable but not writeable or executable.

Use strdup().
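
A minimal sketch of what that could look like, assuming a POSIX environment that provides strdup() and dirname() from <libgen.h>. The wrapper name dirname_copy is hypothetical and only for illustration; the idea is that dirname() only ever sees a private, writable copy, and the caller gets back its own heap copy of the result.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for strdup() when compiling in strict ISO C mode */
#endif

#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of the directory part of
   path, or NULL if an allocation fails. The caller must free() it. */
char *dirname_copy(const char *path)
{
    char *scratch;
    char *dir;
    char *result;

    /* Work on a writable copy, never on the caller's (possibly literal) string. */
    scratch = strdup(path);
    if (scratch == NULL) {
        return NULL;
    }

    /* dirname() may modify scratch and may return a pointer into it
       or into static storage, so copy the result before freeing. */
    dir = dirname(scratch);
    result = strdup(dir);

    free(scratch);
    return result;
}
```

With this, dirname_copy("/usr//") can be called on a literal safely, because the literal itself is only read. The price is two allocations per call, which seems acceptable for test code like the snippet at the top of the thread.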