Paul Maker

char * and stringstreams etc, experts only please )

Hi all

I have been programming C++ and C for a while now and am used to using char * arrays for strings. I can do anything I like with these and I am happy. Nevertheless, I keep reading on this site that this is bad if you are using C++ and that stringstreams etc. are better. Why? I never have any problems with char *. What will using stringstreams offer me, and will I still be able to do all my pointer stuff? For example, I am currently writing a program to parse HTML pages and extract tags etc. Is this possible with stringstreams etc.?

Are there some good tutorials etc. for this? Maybe a small example would be nice.

I am getting more into C++ and moving away from C.

Paul
boneTKE

Paul,

Stroustrup comments on this in Chapter 20 of "The C++ Programming Language".

My take on it is this: the C++ strings and stringstreams are "better" because a lot of the redundant and mundane programming is done for you, i.e. already encapsulated in the libraries. With a lot of effort you can accomplish the same things with C, but you will spend forever writing code that allocates memory, checks buffers for overflow, reallocs memory, mimics the C++ operator overloads, makes C code properly obey const, etc. Furthermore, because this stuff is mundane, it is often forgotten or done sloppily, leading to bugs.

Using C++ strings and stringstreams gives me more time to do bigger and better things.

Good Luck,

boneTKE
The most important improvement (IMHO) is that the C++ method is type safe.  Rather than using %d, you use the object itself.  So the compiler can tell you if you've made a mistake, rather than failing at run time.  I've read nietod opine on this subject a few times, perhaps you could search PAQs?
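To make the type-safety point concrete, here is a minimal sketch. The helper names format_reading and parse_int are made up for illustration and do not come from any library; the only real API used is std::ostringstream and std::istringstream.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: builds text such as "sensor 7 = 21.5".
// operator<< is picked by the compiler from the argument types, so there
// is no "%d" that can disagree with the value actually passed.
std::string format_reading(int sensor_id, double value)
{
    std::ostringstream out;
    out << "sensor " << sensor_id << " = " << value;
    return out.str();
}

// Hypothetical helper: reads an int from text.
// Returns false if the text does not start with an int.
bool parse_int(const std::string &text, int &result)
{
    std::istringstream in(text);
    return !(in >> result).fail();
}
```

With printf, passing a double where "%d" is written compiles and then misbehaves at run time; here the same slip simply selects the double overload of operator<<.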
>> used to using char *arrays for strings, i can
>> do any thing i like with these and i am happy
Ignorance is bliss.  :-0   Sorry about that. :-)

>> i never have any problems with char
Never? I bet you do. I bet that you've had plenty of bugs with them and have fixed all the bugs you've found. I bet you've accidentally overflowed character arrays by storing strings that are too long for them, right? I bet you've mistakenly overwritten a NUL terminator and produced a string that had garbage at the end? I bet you've mistakenly used operator = or ==, <, >, >= etc. on character arrays and character pointers, right? I bet you have written procedures that change the string that they work on, rather than returning a new string, and then forgotten this fact and had a string changed by mistake. I bet you've allocated character arrays dynamically and forgotten to delete them, right?

Have you not done any of those things? Have you not done them all? You can't make those mistakes with string objects. You would never have had to waste the time tracking down those mistakes. Furthermore, can you be sure that there are not occurrences of those problems still lurking in your code, just not yet detected? In C, the misuse of C-style strings is believed to be the leading cause of crashes and bugs. I bet you are not immune.

continues
But beyond bugs, there are other advantages to string objects. They are much more expressive. Isn't

string s1;
string s2;

s1 = s2;

clearer than strcpy()? Isn't

if (s1 > s2)

clearer than strcmp()? Isn't it easier to write and use functions that can return new strings by value rather than altering the string passed to them? Like

string MakeUpper(const string &s);

Isn't

string s3 = s1 + s2;

clearer than strcat()?  (And you never need to worry about the lengths of these strings!)
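Putting those pieces together, a small sketch of what this looks like in practice. MakeUpper here is one possible body for the declaration above, and smaller_then_larger is a made-up helper used only to show =, < and + side by side.

```cpp
#include <cctype>
#include <string>

// One possible body for MakeUpper: the argument is left untouched
// and a new string is returned by value.
std::string MakeUpper(const std::string &s)
{
    std::string result(s);
    for (std::string::size_type i = 0; i < result.size(); ++i)
        result[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(result[i])));
    return result;
}

// Hypothetical helper: returns the two strings joined, smaller one first.
std::string smaller_then_larger(const std::string &a, const std::string &b)
{
    std::string first = a;      // instead of strcpy()
    std::string second = b;
    if (second < first)         // instead of strcmp(second, first) < 0
        first.swap(second);
    return first + second;      // instead of strcat(); no buffer sizing needed
}
```

Note that the caller's string is never modified, which avoids the "forgot that the function changes its argument" bug mentioned earlier.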

continues
Another advantage is that string objects are first-class types. Arrays are not first-class types. They cannot be copied, cannot be assigned, cannot be compared, etc. (They can be using functions, but not using the same natural syntax of the language that applies to built-in types, like int.)

Because they are not first-class types, they often cannot be used in template classes, at least not ones that assume the types they work on are first-class types, and that is a common assumption. So for example an STL list<char *> works poorly at best. It works, but not usually as is desired. But a list<string> works great. This is true of most of the STL templates as well as ones that others, including yourself, might write.
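As a rough illustration of the list<string> point, the sketch below uses a made-up helper, sorted_unique. It relies only on std::list::sort and std::list::unique, which use operator< and operator== of the element type.

```cpp
#include <list>
#include <string>

// Hypothetical helper: sorts a list of words and drops adjacent duplicates.
// std::string compares by content, so sort() and unique() do what you expect.
// With list<char *> the same calls would compare the pointer values,
// not the text they point at.
std::list<std::string> sorted_unique(std::list<std::string> words)
{
    words.sort();      // uses std::string operator<
    words.unique();    // uses std::string operator==
    return words;
}
```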

continues
Paul Maker

ASKER

>> Never? I bet you do. I bet that you've had plenty of bugs with them and have fixed all the bugs you've
>> found. I bet you've accidentally overflowed character arrays by storing strings that are too long for
>> them, right? I bet you've mistakenly overwritten a NUL terminator and produced a string that had garbage
>> at the end? I bet you've mistakenly used operator = or ==, <, >, >= etc. on character arrays and character
>> pointers, right? I bet you have written procedures that change the string that they work on, rather
>> than returning a new string, and then forgotten this fact and had a string changed by mistake. I
>> bet you've allocated character arrays dynamically and forgotten to delete them, right?

Hahaha, okay, I have had these problems. Not recently, because I don't make them any more, but true, I used to, all the time. My favourite was "damage after normal block" when freeing a char *.
So then, I can think of a string as an automatic variable? What I mean is:

void function()
{
string s;
s = "SSSSSSSSSSSSSSSSSSSSSSSSSSSSSS";
s = s + "WWWWWWWWWWWWWWWWWWWWW";
}

and the memory will be freed off the heap when the variable falls out of scope?

Also, how do I frig with the actual chars in the string? I mentioned I do a lot of parsing to extract bits of the string etc. What support is there for this?

You will have to excuse my ignorance on this, but I have just never bothered looking into it.
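On the scope question: the std::string destructor releases whatever the string allocated, so nothing is left on the heap once the variable goes out of scope. One way to see this for yourself is sketched below. CountingAlloc, CountedString, g_live_bytes and bytes_live_inside_scope are all made-up names for the illustration (it assumes a C++11 or later compiler), and the counter is not thread safe.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Bytes currently held through CountingAlloc (illustration only).
static std::size_t g_live_bytes = 0;

// Minimal allocator that forwards to std::allocator and keeps a tally.
template <typename T>
struct CountingAlloc {
    typedef T value_type;
    CountingAlloc() {}
    template <typename U> CountingAlloc(const CountingAlloc<U> &) {}
    T *allocate(std::size_t n) {
        g_live_bytes += n * sizeof(T);
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T *p, std::size_t n) {
        g_live_bytes -= n * sizeof(T);
        std::allocator<T>().deallocate(p, n);
    }
};
template <typename T, typename U>
bool operator==(const CountingAlloc<T> &, const CountingAlloc<U> &) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAlloc<T> &, const CountingAlloc<U> &) { return false; }

// A string type identical to std::string except for the allocator.
typedef std::basic_string<char, std::char_traits<char>, CountingAlloc<char> >
    CountedString;

// Same shape as function() above; returns the tally while s is still alive.
std::size_t bytes_live_inside_scope()
{
    CountedString s;
    s = "SSSSSSSSSSSSSSSSSSSSSSSSSSSSSS";
    s = s + "WWWWWWWWWWWWWWWWWWWWW";
    return g_live_bytes;    // read before s is destroyed
}
```

Calling bytes_live_inside_scope() returns a non-zero count, and g_live_bytes is back to zero once the call has returned, with no delete or free written anywhere.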
Finally, there is efficiency. On average, string objects tend to be faster to use than C-strings, at least for performing the same operations. It's not uncommon to see examples where a C-string is 100 times faster than a string object, simply because the programmer chose the fastest way to do the action with the C-string and the slowest with the C++ string.

String objects store their length, so they can report their length in constant time, unlike strlen(), which must perform a linear search. (This fact makes other things more efficient too, like concatenating strings.)
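A small sketch of the difference; the two helper names are made up for the illustration. Because the length is stored rather than found by scanning for '\0', a std::string can even hold an embedded NUL character, which makes the two approaches disagree.

```cpp
#include <cstring>
#include <string>

// strlen() walks the characters until it meets '\0': linear in the length.
std::size_t length_by_scanning(const std::string &s)
{
    return std::strlen(s.c_str());
}

// size() returns the stored length without looking at the characters.
std::size_t length_stored(const std::string &s)
{
    return s.size();
}
```

For std::string("hello") both return 5. For std::string("ab\0cd", 5) the stored length is 5, while scanning stops at the embedded NUL and reports 2.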

String objects tend to use reference counting or other optimizations (including one I published; I like to say that) to make copy and assignment operations fast. This allows you to pass string objects to functions, return them from functions, and copy them with operator = with great speed, whereas strcpy() depends on the length of the string and can be very slow for long strings.

All in all, string objects provide all the features of C-strings, but with much, much more safety and less programmer effort. They act as first-class types, allowing them to be used more naturally and to be used interchangeably with other types in a template. They tend to be very efficient, possibly more efficient than C strings, if used correctly.

Now an argument for using C strings instead of string objects.




Any questions?
Surely the actual object itself carries more overhead than a C string? But I suppose this is offset by the improved efficiency as you described, i.e. calling strlen() against just looking at a member variable containing the length.

Also, so I can do all the things like strstr for finding substrings, etc.?
Where can I get more info on the methods etc. that operate on strings?
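For the strstr-style searching and tag extraction asked about here, std::string has find() and substr() members. Below is a minimal sketch with made-up helper names (extract_tags, contains). It is not a real HTML parser: it ignores comments, quoted '>' characters inside attributes, and so on.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the text between each '<' and the next '>'.
std::vector<std::string> extract_tags(const std::string &html)
{
    std::vector<std::string> tags;
    std::string::size_type open = html.find('<');           // like strchr()
    while (open != std::string::npos) {
        std::string::size_type close = html.find('>', open + 1);
        if (close == std::string::npos)
            break;                                           // unterminated tag
        tags.push_back(html.substr(open + 1, close - open - 1));
        open = html.find('<', close + 1);
    }
    return tags;
}

// Hypothetical helper: true if needle occurs in haystack,
// the equivalent of strstr(haystack, needle) != NULL.
bool contains(const std::string &haystack, const std::string &needle)
{
    return haystack.find(needle) != std::string::npos;
}
```

Individual characters are still reachable with operator[] or iterators, so the pointer-style character-by-character work carries over.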
ASKER CERTIFIED SOLUTION
nietod

I am sold on it. The next C++ program I have to write will use string and other STL classes that make my life easier. Cool.

Cheers, nietod.
>> surely the actual object itself carries more
>> overhead than a C string?
Yes, because it tends to use dynamic memory allocation. Thus creating a string is a costly operation compared to creating a static or local character array. But it's rare that you just create a string (of any type) and leave it unchanged. When it comes to manipulating strings, the string class (often) has an advantage, and this can sometimes make it faster than the C string. Not always, but often. In either case, the difference is not likely to be noticed except in tight loops. So, for example:

for (int i = 0; i < 100000; ++i)
{
  string s1 = "abc";
  // char s1[] = "abc";
  cout << s1;
}

Here the character array is likely to be much faster. But overall the difference is not likely to be significant, and with good planning you can give the string object an advantage over the C string.
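One possible reading of "good planning" (an assumption, not something spelled out in the post) is to avoid creating strings inside the loop: build one string, reserve its space once, and append in place. The helper name join_repeated is made up for the illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: joins count copies of word, separated by commas.
std::string join_repeated(const std::string &word, std::size_t count)
{
    std::string result;
    if (count == 0)
        return result;
    result.reserve(count * (word.size() + 1));  // one allocation up front
    for (std::size_t i = 0; i < count; ++i) {
        if (i != 0)
            result += ',';
        result += word;     // appends in place, no temporary strings
    }
    return result;
}
```

The C equivalent with strcat() has to rescan the destination for its terminator on every append, which is where the stored length starts to pay off.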

>> where can i get more info on the methods etc
>> that operate on strings
I like "The C++ Standard Library" by Josuttis.

It deals with FAR more than just strings.

If you have VC, it has decent help on strings. Note that you need to look up basic_string, not "string". The STL string class is a template class, which allows it to store many different types of data. The "string" class is a specialization for storing char.
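A short sketch of what the template nature buys you: a function written against basic_string works for std::string (char) and std::wstring (wchar_t) alike. The helper name count_char is made up for the illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts occurrences of ch in any basic_string
// that uses the default traits and allocator.
template <typename CharT>
std::size_t count_char(const std::basic_string<CharT> &text, CharT ch)
{
    std::size_t n = 0;
    for (typename std::basic_string<CharT>::size_type i = 0; i < text.size(); ++i)
        if (text[i] == ch)
            ++n;
    return n;
}
```

Called as count_char(std::string("hello"), 'l') or count_char(std::wstring(L"hello"), L'l'), the same code serves both character types.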


Here's some code you can use to work with std::string.

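#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cwctype>
#include <iostream>
#include <iterator>
#include <string>

// Case conversion functors that handle both char and wchar_t elements.
struct ToUpperFn
{
     char operator()(char c) const { return static_cast<char>(std::toupper(static_cast<unsigned char>(c))); }
     wchar_t operator()(wchar_t c) const { return static_cast<wchar_t>(std::towupper(c)); }
};

struct ToLowerFn
{
     char operator()(char c) const { return static_cast<char>(std::tolower(static_cast<unsigned char>(c))); }
     wchar_t operator()(wchar_t c) const { return static_cast<wchar_t>(std::towlower(c)); }
};

// Copies [First,End) to Dest. The caller must make sure Dest is big enough
// and, for char arrays, already null-terminated past the copied range.
template<typename InIt, typename OutIt>
OutIt MakeCopy(InIt First,InIt End, OutIt Dest)
{
     std::copy(First,End,Dest);
     return Dest;
}

template<typename It>
It MakeReverse(It First,It End)
{
     std::reverse(First,End);
     return First;
}

template<typename It>
It MakeUpper(It First,It End)
{
     std::transform(First,End,First,ToUpperFn());
     return First;
}

template<typename It>
It MakeLower(It First,It End)
{
     std::transform(First,End,First,ToLowerFn());
     return First;
}

// Removes every Target from [First,End) and writes a terminator at the new end.
// This shortens a null-terminated array; for a std::string use erase() instead.
template<typename It, typename T>
It DoRemove(It First,It End, T Target)
{
     It new_end = std::remove(First,End,Target);
     if (new_end != End)
     {
          *new_end = T();
     }
     return First;
}

template<typename It, typename T>
It DoReplace(It First,It End, T Old, T New)
{
     std::replace(First,End,Old,New);
     return First;
}

// Removes every occurrence of [Start2,End2) from the null-terminated array
// [Start1,End1). End1 must point at the terminator.
template<typename T>
T* DoRemove(T *Start1,T *End1,T *Start2,T *End2)
{
     const std::ptrdiff_t len = End2 - Start2;
     if (len == 0) return Start1;        // an empty pattern would match forever
     T* i;
     while((i=std::search(Start1,End1,Start2,End2)) != End1)
     {
          std::copy(i + len,End1+1,i);   // +1 moves the terminator as well
          End1 -= len;                   // the string is now shorter
     }
     return Start1;
}

// Returns the text between the first StartTag and the EndTag that follows it,
// or an empty string if either tag is missing.
template<typename It>
std::string GetDataInBetweenTags(It First,It Last, const std::string &StartTag, const std::string &EndTag)
{
     It i = std::search(First,Last,StartTag.begin(),StartTag.end());
     if (i == Last) return std::string();
     std::advance(i, static_cast<std::ptrdiff_t>(StartTag.size()));
     It j = std::search(i,Last,EndTag.begin(),EndTag.end());
     if (j == Last) return std::string();
     return std::string(i,j);
}

// Length of a null-terminated array, giving up after MaxLen elements.
template<typename T>
std::size_t ArrayLen(T *First, std::size_t MaxLen = 16777216)
{
     std::size_t Qty = 0;
     while (Qty < MaxLen && *First != 0)
     {
          ++Qty;
          ++First;
     }
     return Qty;
}

// Narrows each wide character to char; only meaningful for plain ASCII text.
std::string WideToChar(const std::wstring &src, std::string &dest)
{
     dest.resize(src.size(),' ');
     std::copy(src.begin(),src.end(),dest.begin());
     return dest;
}

int main()
{
     std::string str = "bla bla bla<p>123.567.101.112</p>";
     std::string StartTag = "<p>";
     std::string EndTag = "</p>";
     std::string IP_Num1 = GetDataInBetweenTags(str.begin(),str.end(),StartTag,EndTag);
     char html_stuff[] = "bla bla bla<p>169.189.99.69</p>Hello World";
     std::string IP_Num2 = GetDataInBetweenTags(html_stuff,html_stuff+ArrayLen(html_stuff),StartTag,EndTag);
     std::cout << "GetDataInBetweenTags =>" << IP_Num1 << " and " << IP_Num2 << std::endl;

     std::size_t testlen = ArrayLen(IP_Num2.c_str());
     std::cout << "ArrayLen (string) =>" << testlen << std::endl;
     testlen = ArrayLen(html_stuff);
     std::cout << "ArrayLen (array) =>" << testlen << std::endl;

     char data[] = "This is a test to see how this works";
     char dest[128] = "";
     std::cout << "Original =>" << data << std::endl;
     //Algorithm
     MakeUpper(data,data+ArrayLen(data));
     std::cout << "MakeUpper =>" << data << std::endl;

     std::wstring testwidestr = L"Hello World";
     MakeUpper(testwidestr.begin(),testwidestr.end());
     std::string tmpstr = "";
     std::cout << "MakeUpper (wide)=>" << WideToChar(testwidestr,tmpstr) << std::endl;

     MakeCopy(data,data+ArrayLen(data), dest);
     std::cout << "MakeCopy =>" << dest << std::endl;

     MakeLower(data,data+ArrayLen(data));
     std::cout << "MakeLower =>" << data << std::endl;

     MakeReverse(data,data+ArrayLen(data));
     std::cout << "MakeReverse =>" << data << std::endl;

     DoRemove(dest,dest+ArrayLen(dest),'S');
     std::cout << "DoRemove =>" << dest << std::endl;

     DoRemove(dest,dest+ArrayLen(dest),'Z');
     std::cout << "DoRemove =>" << dest << std::endl;

     DoReplace(dest,dest+ArrayLen(dest),'T','Y');
     std::cout << "DoReplace =>" << dest << std::endl;

     char datax[] = "This is a test to see how this works";
     char search_datax[] = "how";
     DoRemove(datax,datax+ArrayLen(datax),search_datax,search_datax+ArrayLen(search_datax));
     std::cout << "DoRemove =>" << datax << std::endl;

     return 0;
}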
The above template functions will work with string, wstring, and char* strings.
They should get you started in the right direction, and hopefully make your transition from char* to std::string a little smoother.
Well, thanx Axter. Did you pay points to see this question? If so, mighty nice of you :). It's just that you were not already on the thread.

I don't have the time to look through the code right now as I am just about to go out. Nevertheless, I will definitely post some points up for you when I have a minute (tomorrow).

I need to migrate myself over to string etc., so all the help I get is great.

Thanx Axter, and thanx again.

You and nietod are good experts.
>>well thanx axter, did you pay points to see this q, if
>>so mighty nice of you :),
No, I don't have to pay points to see PAQs. I have EE KnowledgePro for free.
Experts can get this for free if they get so many points a month.
Just to comment: I have just started my first commercial app using strings instead of char arrays. Wow, what a time saver.
sweet

thanx guys