Page 1 of 2

joining 2 strings

Posted: Fri May 06, 2005 3:27 pm
by rich_m
i need to join two strings using the '+' operator.
i written the following which seems to be infested with bugs.

Code: Select all

Modified:--
class string
{
   private:
      char* str;

   public:
   string()
   {
      str=new char[1];
      str[0]='\0';
   }
   string(const char* s)
   {
      int len=strlen(s);
      str=new char[len+1];
      strcpy(str,s);
   }
   int length()
   {
      return(strlen(str));
   }
   void input()
   {
      char t[100];
      cin>>t;
      int len=strlen(t)+1;
      str=new char[len];
      strcpy(str,t);
   }
   const char* text()
   {
      return(str);
   }
   string operator + (string s2)
   {

      string t;
      int tlen=this->length()+s2.length()+1;
      t.str=new char[tlen];
      strcpy(t.str,this->str);
      strcat(t.str,s2.str);

      return(t);
   }
};
plz help.

Re:joining 2 strings

Posted: Fri May 06, 2005 6:19 pm
by Colonel Kernel
plz? ::)

There are a lot of problems with this code. I've added comments where I see problems:

Code: Select all

#include"iostream.h"
#include"string.h"
#include"conio.h"

class string
{
   private:
      char* str;

   public:
   string()
   {
      str=new char[1];

      // Memory leak. You're making str point to a
      // different string without freeing the one
      // you just allocated.
      str="";
   }

   // The parameter should be const char*, since you
   // do not need to modify the string pointed to by s.
   string(char* s)
   {
      int length=strlen(s);
      str=new char[length+1];
      strcpy(str,s);
   }
   string(int m)
   {
      // This leaves the memory pointed to by str
      // uninitialized. This is a big problem if
      // someone calls text expecting the result
      // to be a null-terminated C-style string.
      str=new char[m+1];
   }
   int length()
   {
      // This will break horribly if str is not
      // null terminated -- see the constructor
      // above.
      return(strlen(str));
   }
   void input()
   {
      // I don't think this does what you intend
      // it to do. I can't test it ATM, but to
      // me it looks like it's waiting for the
      // user to type in an address to put in
      // str.
      cin>>str;
   }
   char* text()
   {
      // str MUST be null-terminated or this
      // becomes a really bad idea.
      return(str);
   }

   // I would probably make this a friend
   // function that takes two string parameters
   // instead of making it a member function.
   // This is more of a style thing though.
   // Also, s2 should be a const string& for
   // the sake of efficiency.
   string operator + (string s2)
   {

      char* t;
      t=new char[length()+1];
      string tmp;
      tmp.str=new char[ length() + s2.length() + 1];
      strcpy(t,this->str);

      // The strcat here is going to break something,
      // since you didn't allocate enough space to
      // hold the contents of both s1 and s2. I
      // think you meant to strcat directly into
      // tmp.str. If you think about it, there's
      // no need for t at all here.
      strcpy(tmp.str,strcat(t,s2.str));
      delete t;
      return(tmp);
   }
   ~string()
   {
      // You allocated with array new, so you must
      // delete with array delete: delete [] str
      delete str;
   }
};
void main()
{
   string a="hello";
   string b=" world";
   string c;
            c=a+b;
   cout<<c.text()<<endl;
   
   getch();
}
In addition to the above, you're also missing a copy constructor and assignment operator, which is the kiss of death for classes that are supposed to have deep-copy semantics.

Imagine what happens in this case:

Code: Select all

int main()
{
    string a = "foo";
    string b = a;
    return 0;
}
After the fields of a are bitwise copied into b (which is what happens when you don't have a copy constructor in this case), a.str and b.str both point to the same buffer. What do you suppose will happen when both of their destructors run? BOOM. Or in language lawyer terms, undefined behaviour.

Re:joining 2 strings

Posted: Sat May 07, 2005 1:07 pm
by rich_m
well it seems to work fine when the destructor is removed and btw how do i create a copy constructor?
and suppose i want to be able to right code like this

Code: Select all

string a;
a="Foo";
will i have to overload the '=' operator?

Re:joining 2 strings

Posted: Sat May 07, 2005 2:40 pm
by Colonel Kernel
rich_m wrote: well it seems to work fine when the destructor is removed and btw how do i create a copy constructor?
and suppose i want to be able to right code like this

Code: Select all

string a;
a="Foo";
will i have to overload the '=' operator?
If you remove the destructor, you will leak memory. This is bad. Just because your program seems to work does not automatically make it correct.

Based on how you've approached this, I'd say you're doing a lot of guessing. IMO you're better off getting a good book on C++.

Yes, you have to overload the = operator. You should do that anyway, otherwise you will experience the problems I've described.

Here is how I would implement a copy constructor and assignment operator for this class:

Code: Select all

class string
{
...
public:

    string( const string& other )
    {
        str = new char[other.length() + 1];
        strcpy( str, other.str );
    }

    string& operator=( const string& other )
    {
        if (&other != this)
        {
            // Copy the contents of the other string into
            // a temporary object. If something goes wrong,
            // at least the contents of this object remain
            // intact.
            string tmpString( other );

            // Swap the fields of this object for those in the
            // temporary copy. This is efficient, exception-safe,
            // and tidy since it ensures that the old contents
            // of this object will be freed by tmpString's
            // destructor when tmpString goes out of scope.
            swap( tmpString );            
        }
        return *this;
    }

private:

    void swap( string& other )
    {
        char* tmpStr = str;
        str = other.str;
        other.str = tmpStr;
    }
...
};
I would actually use std::swap inside of string::swap instead of doing it by hand, but I figured you would get more out of seeing how the technique works in detail.

Actually, I would use std::string instead of writing my own, but it's a good place to start learning. :)

The use of "swap" is a trick I picked up by reading one of Scott Meyers' books (I think it was "Effective STL", but I don't remember).

Re:joining 2 strings

Posted: Sun May 08, 2005 10:37 am
by rich_m
Colonel Kernel wrote: If you remove the destructor, you will leak memory. This is bad. Just because your program seems to work does not automatically make it correct.
any idea why this function works only when the destructor is removed?

Code: Select all

string operator + (string s2)
  {

      string t;
      int tlen=this->length()+s2.length()+1;
      t.str=new char[tlen];
      strcpy(t.str,this->str);
      strcat(t.str,s2.str);

      return(t);
  }

~string()
{
       delete str;
}


Re:joining 2 strings

Posted: Sun May 08, 2005 10:58 am
by Colonel Kernel
rich_m wrote: any idea why this function works only when the destructor is removed?
Have you added the copy constructor and operator= yet? If not, then see my previous post:
After the fields of a are bitwise copied into b (which is what happens when you don't have a copy constructor in this case), a.str and b.str both point to the same buffer. What do you suppose will happen when both of their destructors run? BOOM. Or in language lawyer terms, undefined behaviour.
If you delete the same chunk of memory twice, you get undefined behaviour (i.e. -- all hell breaks loose).

Also, you're still missing the [] in your delete:

Code: Select all

    ~string()
    {
        delete [] str;
    }
Either of these could cause big problems.

Re:joining 2 strings

Posted: Mon May 09, 2005 1:34 pm
by rich_m
Colonel Kernel wrote:
string& operator=( const string& other )
{
if (&other != this)
{
it seems that the string 'other' is being compared with the 'this' string using the '!=' operator, right?

Re:joining 2 strings

Posted: Mon May 09, 2005 2:32 pm
by Colonel Kernel
It's not comparing the objects -- it's comparing their addresses to make sure they aren't the same object. There's no sense in assigning an object to itself.

Re:joining 2 strings

Posted: Mon May 09, 2005 2:54 pm
by Schol-R-LEA
If you don't mind me asking, is there a particular reason why you aren't able to use the standard [tt]string[/tt] class, and if so, is there a reason you aren't you writing it as a subset of the standard class (i.e., some of your classes don't match the standard ones)? Any answer is legitimate, but knowing the purpose might help us with understanding the goals.

One of the more serious problems I see is with memory leaks; not surprising, as those are one of the largest headaches in C++ for anyone. You really have to ensure that any memory you allocate gets deallocated later.

I would seriously consider using a [tt]length[/tt] instance variable and dropping the zero-delimiting. While this would mean you couldn't use the standard c-string library, and that you'd have to do a bit of fiddling about in your [tt]text()[/tt] method (BTW, the name for this function in the standard [tt]string[/tt] class is [tt]c_str()[/tt]), it would save some headaches in other places. Even if you use zero-delimited strings, I would still have a length variable. The size isn't going to change, after all (except perhaps in the assignment operator, though I recommend a way to avoid it even there below), and while it takes up an extra four bytes or so of space, it avoids the repeated [tt]strlen()[/tt] calls, which is a Good Thing. When you do use the standard c-string function, I would recommend using the n variants ([tt]strncpy()[/tt], strncat()[/tt], etc); once again, you already know the sizes of the strings, so you might as well take advantage of that for a bit of extra safety.

I agree with Colonel Kernel about making the concatenate operator a friend function rather than a method, mainly because you are not changing the existing objects, but creating a new one. I would use your existing constructor to handle the actual object creation.

This brings up an important point: any concatenation operator that isn't passed an intermediate buffer to fill must dynamically generate the concatenated string as a new string object. This would have to be returned as a string pointer or a string reference, not a string per se.

Regarding the copy constructor, I disagree with CK here. You don't want to do the copying in the assignment method; rather, you will want to create an actual copy constructor (that is, a constructor that takes an object of the same class and creates a second duplicate object) and then create a friend assignment operator using that:

Code: Select all

// this assumes your current class design
// note that in a copy constructor, the
// argument *must* be a ref variable

string::string(string& src)
{
    if (NULL == src.str)
    {
        str = NULL;
    }
    else 
    {
        int len = src.length() + 1;
        str = new str[len];
        strncpy(str, src.str, len);
    }
}
You'll probably want a comparison operator as well, both for your [tt]string[/tt] objects and for c-strings.

Re:joining 2 strings

Posted: Mon May 09, 2005 3:17 pm
by rich_m
Schol-R-LEA wrote: If you don't mind me asking, is there a particular reason why you aren't able to use the standard [tt]string[/tt] class, and if so, is there a reason you aren't you writing it as a subset of the standard class (i.e., some of your classes don't match the standard ones)? Any answer is legitimate, but knowing the purpose might help us with understanding the goals.
i'm just learning how to program, and so thought it would be a good idea to create that string class of my own, the only goal in doing this was to make me a bit familiar with c++.

Re:joining 2 strings

Posted: Mon May 09, 2005 3:21 pm
by Colonel Kernel
Schol-R-LEA wrote: Regarding the copy constructor, I disagree with CK here. You don't want to do the copying in the assignment method; rather, you will want to create an actual copy constructor (that is, a constructor that takes an object of the same class and creates a second duplicate object) and then create a friend assignment operator using that:
I think you misunderstood what my code does. There is a proper copy constructor, and it does create a copy of the given object. And the assignment operator does not duplicate this code -- it makes use of it via Meyers' so-called "swap trick".

Code: Select all

string::string(string& src)
{
 
<language_lawyer>
If the parameter isn't a const ref, then what you're declaring is not a copy constructor.
</language_lawyer>

<edit>
If you think I'm making an unnecessary copy in operator=(), go back and read the code more carefully. :)
</edit>

Re:joining 2 strings

Posted: Mon May 09, 2005 7:40 pm
by Schol-R-LEA
Ah, yes, I see it now. My bad.

Re:joining 2 strings

Posted: Tue May 10, 2005 11:30 am
by Joel (not logged in)
This brings up an important point: any concatenation operator that isn't passed an intermediate buffer to fill must dynamically generate the concatenated string as a new string object. This would have to be returned as a string pointer or a string reference, not a string per se.
Correct me if I'm wrong, but doesn't returning a string per se invoke temporary object semantics? Also, it should be noted that returning a pointer or reference is a little more complicated than simply returning a temporary string. In particular, the pointer or reference should never be referencing an object that is local to the concatentation operator function. And (IMHO) it's generally a good idea to avoid putting a burden of deleting a dynamically allocated object on a caller unless there is a significant advantage to doing so, and in this case I don't think that applies.

Re:joining 2 strings

Posted: Tue May 10, 2005 12:01 pm
by Colonel Kernel
@Joel: Exactly right.

Binary operators on types with value semantics pretty much universally return objects, not references or pointers to objects, for exactly those reasons. Now in C#, it's a completely different story, but that's for another thread...

IMO, this is the agony of strings in C++ -- way too many copies being made. If strings were immutable, it would be easier to optimize things like concatenation. As it is, there are a number of string optimizations in various implementations of STL including copy-on-write (COW) schemes, and the "small string optimization". COW gets pretty hairy in multithreaded programs, so I would stay away from it, at least in a class as fundamental as std::string.

AFAIK, the latest Dinkumware STL that ships with Visual Studio.NET 2003 has an implementation of std::string that uses the small string optimization. To summarize, each string object contains 16 bytes of data (not sure if this doubles in size for wstring). If the character sequence is small enough to fit in 16 bytes (including room for the terminating null), then it is "inlined" right in the object. If it won't fit, then the string contents are interpreted as a pointer to the dynamic array of chars (or wchar_t's, take your pick). Copying 16 bytes of data around when returning string objects is certainly no big deal compared to doing a deep copy every time.

Re:joining 2 strings

Posted: Mon May 23, 2005 11:40 am
by rich_m
void input()
{
// I don't think this does what you intend
// it to do. I can't test it ATM, but to
// me it looks like it's waiting for the
// user to type in an address to put in
// str.
cin>>str;
}
well is this how i should get in data

Code: Select all

void input()
   {
      char t[100];
      cin>>t;
      int len=strlen(t)+1;
      str=new char[len];
      strcpy(str,t);
   }
and btw i have added one more function in the class which can be used to get strings that are not meant to be viewed on the monitor such as passwords, i have used the 'int getch()' of conio.h so that the key pressed is'nt echoed. it can contain a maximum of seven characters and contain only letters(lower and upper) and numbers. this is how i've done it.

Code: Select all

void string::pwinput(char symbol)//symbol is the char i.e echoed.
   {
      char a[7];
      int enter=13;

      int k=0,i=0;
       while(k!=enter && i!=7)
       {
      if(i!=7)
         k=getch();

      if(valid(k) && k!=enter)
      {
         a[i]=k;
         cout<<symbol;
         i++;
      }
        }
        int len=i;
        str=new char[len+1];
        for(i=0;i<len;i++)
        {
      str[i]=a[i];
        }
        str[i]='\0';
    }

private:

int valid(char n)//CHECKS IF n IS A LETTER OR NUMBER
{
 int in=n;
 if ( (in>=97 && in<=122) || (in>=65 && in>=90) || (in>=48 && in<=57) )
   return 1;
 else
   return 0;
}
well it seems to work fine i just want to know if it 'is' fine?