GCC optimisation - combining strings

gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Post by gerryg400 »

I have been chasing a strange bug in my libc implementation of dirname().

The function that calls dirname does this:

    orig_name = "/usr//";
    name = dirname(orig_name);
    kprintf("%s\n", name);

According to POSIX, dirname is allowed to modify the string passed to it. I've never been happy about that, but that's another story. The other thing dirname does is that it sometimes returns a pointer to a static string literal "/".

GCC combines my two string literals so that the "/" occupies the last character and terminating NUL of the "/usr//". This is of course not good for dirname, because it modifies the "/usr//" and thus trashes the "/" that it would sometimes return.

These two strings are defined in different files, so I'm not sure how GCC even manages to do this. Has anyone seen this before? Or have I misinterpreted the whole thing?
If a trainstation is where trains stop, what is a workstation ?
CelestialMechanic
Member
Posts: 52
Joined: Mon Oct 11, 2010 11:37 pm
Location: Milwaukee, Wisconsin

Post by CelestialMechanic »

Shouldn't your first slash also be doubled?

orig_name = "//usr//";
Microsoft is over if you want it.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Post by Owen »

C string constants are... constant. If you're passing one to dirname, which modifies it, your code is obviously incorrect (because it's passing a constant string to a function which operates upon it in a non-constant manner).
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Post by Solar »

gerryg400 wrote: GCC combines my two string literals so that the "/" occupies the last character and terminating NUL of the "/usr//". This is of course not good for dirname, because it modifies the "/usr//" and thus trashes the "/" that it would sometimes return.

After five minutes I gave up trying to make sense of this one. There's only one string literal in your code ("/usr//"), aside from the printf() format string.

Oh, and what Owen said, of course. The whole thing shouldn't even compile.
Every good solution is obvious once you've found it.

Post by gerryg400 »

Owen wrote: C string constants are... constant.

No, they are not. Well, they are not 'const'.
Last edited by gerryg400 on Tue Feb 28, 2012 2:26 am, edited 1 time in total.

Post by gerryg400 »

Solar wrote:
gerryg400 wrote: GCC combines my two string literals so that the "/" occupies the last character and terminating NUL of the "/usr//". This is of course not good for dirname, because it modifies the "/usr//" and thus trashes the "/" that it would sometimes return.
After five minutes I gave up trying to make sense of this one. There's only one string literal in your code ("/usr//"), aside from the printf() format string.

Oh, and what Owen said, of course. The whole thing shouldn't even compile.

The other string (the "/") is buried in the code in my C library.

Look, it's true that the code I wrote is not great. I deliberately wrote it that way to test my C library. I didn't think the code would work, but the mode of failure surprised me.

Post by Owen »

...It's the kind of thing that GCC has a warning for ("-Wconstant-strings" I think?), assuming your dirname() is correctly prototyped

Post by gerryg400 »

Yes, it's "-Wwrite-strings".

By default, neither the MS nor the GCC compiler enables this warning.
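
For anyone following along, here is a rough sketch of what the flag changes. As far as the GCC documentation goes, -Wwrite-strings gives string literals the type const char[N] when compiling C, so assigning one to a plain char * draws a warning. The names read_only_path and trim_trailing_slashes are made up for illustration and are not taken from the library discussed in this thread.

```c
#include <stddef.h>

/* With -Wwrite-strings the commented-out line below would be diagnosed,
   because the literal is treated as const char[7]:

       char *risky = "/usr//";
*/

/* Const-correct pointer to the literal: reads are fine, writes are
   rejected at compile time. */
static const char *const read_only_path = "/usr//";

/* Hypothetical helper that edits a caller-owned, writable buffer in place.
   It removes trailing slashes (but keeps a lone leading "/") and returns
   the new length. */
static size_t trim_trailing_slashes(char *path, size_t len)
{
    while (len > 1 && path[len - 1] == '/')
        path[--len] = '\0';
    return len;
}
```

A writable buffer can be obtained with an array initialised from the literal, e.g. char buf[] = "/usr//"; which copies the characters into storage the program owns.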

Post by Solar »

gerryg400 wrote: No, they are not. Well, they are not 'const'.

They are "const" in the sense that writing to a string literal is undefined behaviour (C99 section 6.4.5). Modern compilers actually make them const (if run with appropriate settings).

gerryg400 wrote: The other string (the "/") is buried in the code in my C library.

I still have no idea what exactly your problem is, because I can't follow the description of it.

Guess:

You have a string literal "/usr//" in one place, and "/" in another. The compiler did not create a separate object for "/", but instead pointed at the last character of "/usr//", which it is at liberty to do. (This kind of optimization is part of the reason why you aren't allowed to write to string literals.)

Post by gerryg400 »

The problem is that the two string literals occupy the same piece of memory. The literal "/usr//" is at address P and the "/" is at address P+5. If dirname modifies one, the other changes. Because they are in separate object files, this surprised me.
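
One way to check for this kind of overlap from C is to compare the two addresses as integers. The sketch below is only illustrative: points_inside is a hypothetical helper, not part of any libc, and whether two literals really share storage depends on the compiler, the linker, and the flags used. GCC documents -fmerge-constants as merging identical constants across compilation units; sharing a suffix, as described above, is up to the toolchain.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns true if `inner` points at one of the bytes
   of the string `outer`, including its terminating NUL. The addresses are
   converted to integers first, because relational comparison of pointers
   to unrelated objects is undefined behaviour in C. */
static bool points_inside(const char *outer, const char *inner)
{
    uintptr_t start = (uintptr_t)outer;
    uintptr_t end = start + strlen(outer);   /* address of the NUL */
    uintptr_t p = (uintptr_t)inner;
    return p >= start && p <= end;
}
```

Calling points_inside("/usr//", "/") may give either answer depending on the build, and both are permitted, since the standard does not require string literals to be distinct objects.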
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

The linker can optimise too :wink:
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Post by JamesM »

gerryg400 wrote: The problem is that the two string literals occupy the same piece of memory. The literal "/usr//" is at address P and the "/" is at address P+5. If dirname modifies one, the other changes. Because they are in separate object files, this surprised me.

As Combuster said, the linker can optimise too. And as everyone has said, and speaking as a compiler developer:

Don't rely on undefined behaviour or velociraptors will devour what my colt .45 leaves untouched.

String literals are constant. End of story. They are placed in ".rodata" (do you see the hint in that section name?) with ELF properties such that they should be readable but not writeable or executable.

Use strdup().
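
A minimal sketch of that suggestion, assuming a hosted POSIX environment that provides strdup() and dirname() from <libgen.h>. The helper name dirname_of_copy is hypothetical, and a freestanding libc like the one under test in this thread would need its own equivalents.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strdup() under strict -std modes */
#include <libgen.h>   /* dirname */
#include <stdlib.h>   /* free */
#include <string.h>   /* strdup */

/* Hypothetical helper: returns a heap-allocated copy of dirname(path), or
   NULL if an allocation fails. The caller must free() the result. The
   input string itself is never written to. */
static char *dirname_of_copy(const char *path)
{
    char *scratch = strdup(path);   /* writable copy that dirname may edit */
    if (scratch == NULL)
        return NULL;

    /* dirname may return a pointer into scratch or to static storage, so
       duplicate the result before scratch is released. */
    char *result = strdup(dirname(scratch));
    free(scratch);
    return result;
}
```

With this, a caller can pass a string literal such as "/usr//" safely, because only the private copy is ever modified.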