Sign in to follow this  

Why would you program your own string class?(C++)

This topic is 4565 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Looking through the source of several open-source projects like Crazy-Eddies-GUI, Irrlicht etc. I have seen that most of these projects have their own string class. Why does they have this? I think it would be very hard to beat the STL one in performance. So why does they do it? Because some compilers(VC6 for example) doesn't have a good STL? or because they think they can do it better?

Share this post


Link to post
Share on other sites
Good question! It seems like the "thing to do" on these boards is to reinvent the wheel and try to rewrite STL containers etc! And not for educational purposes either, that might actually be legitimate. Instead, people feel like they can come up with something better/faster, but what the write is of course non-standard so other people will have to learn how to use their damn code, rather than just look at it and immediately know how to use it.

Yay!

Share this post


Link to post
Share on other sites
Hi

I think a lot of projects use older base code to work with and porting old code to use a new string class is a lot of work and actually of no use in most cases


also a lot of people aren t used to STL although you can consider it standard nowerdays


another thing to mention, you can replace the STL coming with the vc++ compiler
or just use the gcc compiler Mingw for windows


i personally stick with STL because i set on reuseability

Share this post


Link to post
Share on other sites
After learning a bit about standard string, I actually was suprised by its lack of functionality. stringstream allows for more functionality, but has its own series of quirks and limitations.

Add to that the fact that many libraries require [non-const] char arrays, it's a pain at times to use standard strings properly. So I can see where people might be tempted to rewrite a string class.

I imagine the original reasons were not so noble though.

Share this post


Link to post
Share on other sites
The C++ string class lacks a lot of the niceties of strings in other languages, such as Python or Java. So it might be beneficial to write your own to overcome that. I have a project that uses my own string class, but I did implement it in terms of std::string.

Share this post


Link to post
Share on other sites
Quote:
Original post by Telastyn
Add to that the fact that many libraries require [non-const] char arrays, it's a pain at times to use standard strings properly. So I can see where people might be tempted to rewrite a string class.


If its a C++ library and it requires char * as strings instead of taking std::string & I'd consider the library broken.
If its C library or an interface to another language I'd just write a small amount of glue code to get around the problem.

Share this post


Link to post
Share on other sites
Quote:
If its a C++ library and it requires char * as strings instead of taking std::string & I'd consider the library broken.


While I appreciate the point in context, I have to take issue with that. One of the tenets of C++ is 'only pay for what you use'. Why should an interface force someone to use a class which requires dynamic allocations? What if the user wants to pass in a string in a string table? How about a string literal? You may also have to deal with different allocators, multithreading, etc. What someone wants to use your library on a platform without a std::string implementation, or with a broken/different one?

I use std::strings extensively, but like any other piece of code, there are tradeoffs to using them. Lets not let use of std::strings become dogma. There are very legitimate reasons to both use them and to avoid them, particularly in interfaces.

Share this post


Link to post
Share on other sites
It's nice to have any feature you can think up in your system. It's nice to be in total control. String classes are very simple, though.

What about linked lists? Does anyone actually use the STL implementation? From what I understand of it, it can be extremely inefficient when working with big lists. Something about trying to move from node to node without an iterator.

Share this post


Link to post
Share on other sites
Quote:
Original post by Jiia
It's nice to have any feature you can think up in your system. It's nice to be in total control. String classes are very simple, though.

What about linked lists? Does anyone actually use the STL implementation? From what I understand of it, it can be extremely inefficient when working with big lists. Something about trying to move from node to node without an iterator.


Despite the fact that linked lists -are- the main place where I've "re-invented the wheel", I still use std::list in places. It's better at doing things within the context of the data it contains.

And the use of homegrown linked lists is the classic example of people re-inventing the wheel for no good reason. In the majority of cases, a linked list is one of the worst container classes the programmer could've chosen.

Share this post


Link to post
Share on other sites
Quote:
Original post by BrianL
While I appreciate the point in context, I have to take issue with that. One of the tenets of C++ is 'only pay for what you use'. Why should an interface force someone to use a class which requires dynamic allocations? What if the user wants to pass in a string in a string table? How about a string literal? You may also have to deal with different allocators, multithreading, etc. What someone wants to use your library on a platform without a std::string implementation, or with a broken/different one?

I use std::strings extensively, but like any other piece of code, there are tradeoffs to using them. Lets not let use of std::strings become dogma. There are very legitimate reasons to both use them and to avoid them, particularly in interfaces.


True, however the fact its a non-const char * implies that the string is going to be changed (and probably require reallocation of memory 'somewhere'), this brings up a whole host of issues ranging from when and where the memory is allocate to possible ownership issues.
So, while there might well be locations where a char * is required I'd argue that in the majortiy of cases a std::string & will serve you alot better.

Its all very well saying 'only pay for what you need', yet there is very very little difference in cost between a char * and a std::string &, both require dyamnic allocation 'somewhere', the difference between them is that std::string is alot safer.

As for your examples;
- a string table I'd probably impliment in terms of std::strings anyways
- string literals arent related to this really as char * implies memory which can be changed, literals are char const * const (or something like that) and thus shouldnt be reallocated or changed anyways.
- allocators might be a point, in that case you can fall back to std::strings parent class
- MT again might be a point but alot of libraries make no promises about thread safety anyways
- A platform without std::string is probably going to be too old to worry about and not C++, thus this is a non-problem. Plus you can install the std. lib. on old C++ compiles anyways (STLPort for example) and they shouldnt be 'different' as std::string is standard as is its interface, any changes make the platform broken (unless the standard changes) and thus not a library problem as it will be designed to conform with standard C++, not some broken system.

edit:
also, by using a char * the library in question is forcing me to adapt my program to its useage (while being unsafe at the same time) I'd consider that less acceptable than the library having an interface which uses safe and modern C++ constructs.

Also, on the topic of reinventing the wheel, the worse bit of all of these 'string' classes is that they probably arent compatible with each other, so you have to convert left, right and center to work with them if they are required for different libraries, where as if everyone had stuck with std::string all would have been good...

Share this post


Link to post
Share on other sites
Apart from these work efficiency reasons there is one major reason to use "home-grown" implementations of the things that already exist. It's fun. It will take more time. It probably won't be nearly as fast as the standardized implementation. Still, it's a lot of fun writing code that works well. Much more fun that to use what has been written by someone else. That's all there is to it. The fun in programming.

Share this post


Link to post
Share on other sites
true, but then if you didnt spend all that time on the libraries you could have fun making something new and impressive like a game instead of getting burnt out trying to do better than the standard lib. (beyond learning that is)

Share this post


Link to post
Share on other sites
Quote:
Original post by staaf
... That's all there is to it. The fun in programming.


I absolutely agree.
And however.. of course people who worked at the stdlib have *A LOT* of experience, but for some simple string class usage, algorithms just remain the same: reallocation, concatenation, indexing, sub-string search.
For example, if you have to add contents to a string there are two cases:
1) string pre-allocated memory is enough
2) request some other memory
std:string just does theese ones. Coding them by ourselves won't let to worse performances (if we stay on standard memory allocation).
Anyway unfortunately i was asked to code on platforms (mobile) which at that time didn't supported stl library so i had to "reinvent the wheel".
What i'd wanted to say is that, even if stl is a DAMN FAST template library, it dowsn't mean you can't do >= in performances.

Not a polemic reply, just my opinion.

Marco.

Share this post


Link to post
Share on other sites
Quote:
Original post by Telastyn
I still use std::list in places. It's better at doing things within the context of the data it contains.
...
In the majority of cases, a linked list is one of the worst container classes the programmer could've chosen.

If you use STL type lists, I can understand why you feel this way. Otherwise, there are plenty of good reasons to use linked lists. And I can only think of one good reason that they would be a bad choice; random access.

Lets take an item in a store for example. Character buys item, item travels store->character_inventory list. Character equips item, item travels character_inventory->hands list. Character tosses item, item travels hand->map list. All of these actions can be instantaneous with normal linked lists. STL lists requiring iterators do not function properly in these scenarios. You would have to scan each list to find the item. Not a big deal for this scenario, but it can definitely become a big deal.

Share this post


Link to post
Share on other sites
Quote:
All of these actions can be instantaneous with normal linked lists. STL lists requiring iterators do not function properly in these scenarios.


WTF are you talking about? For all intents and purposes, syntax excepted, manipulating node pointers is equivalent to using iterators, without the safety guarantees.

You know you can store iterators in variables, right? The operations that invalidate a list iterator are, unsurprisingly, the same that would code a node pointer to 'go bad'.

Share this post


Link to post
Share on other sites
Quote:
Original post by Fruny
WTF are you talking about? For all intents and purposes, syntax excepted, manipulating node pointers is equivalent to using iterators, without the safety guarantees.

You know you can store iterators in variables, right? The operations that invalidate a list iterator are, unsurprisingly, the same that would code a node pointer to 'go bad'.


Heh, actually I don't. My first post pointed out what I thought was a problem with STL lists, and no one rejected it. So I assumed it was correct. To be honest, I've never actually used them.

So you only have to store the iterator, and not both the iterator and the object pointer? And an iterator can only go bad by deleting the object's memory? I didn't know that. Maybe STL lists are better than I thought.

edit: What about the case where an object may reference a node, and the node changes lists? The iterator is still valid?

Share this post


Link to post
Share on other sites
Ogre::String is just a typedef for std::string (itself a typedef for std::basic_string<char>) and if I'm not mistaken, CEGUI's string class is the same. Typedefing string in like that just makes it easier to type (String instead of std::string) while avoiding namespace pollution.

Share this post


Link to post
Share on other sites
Quote:
Original post by Jiia
So you only have to store the iterator, and not both the iterator and the object pointer?


Some operations require manipulation of the data structure as a whole and thus is done with a list pointer. Sorting is an example (there is a list::sort member function). Most 'common' operations work with element ranges and can be done with iterators (look at the whole <algorithm> header!). Accessing elements is precisely an iterator's job. It would help you understand things if you tried to view an iterator as just a fancy pointer. Just like you can use pointers to manipulate data in arrays, you use iterators to manipulate data in lists, vectors, maps ... (in fact, conceptually, pointers are C-array iterators).

Quote:
And an iterator can only go bad by deleting the object's memory?


Off the top of my head, I don't think there are any operations, beside erasure, that invalidate a list iterator. When you so remove an element from a container, the iterator that pointed to it obviously becomes invalid.

Quote:
Maybe STL lists are better than I thought.


And you were there decrying it out of pure ignorance. [rolleyes] Many people think the STL is bad just because they've heard it was, and carry on spreading such a belief to new programmers... thus the cycle repeats.

Ok, I'll tackle your argument:

Quote:
What about linked lists? Does anyone actually use the STL implementation?


I do. Many other C++ programmers do. Heck, any C++ programmer should - why write a custom list class when you have a standard one? Imagine having to deal with each individual programmer idiosyncratic list code...

Quote:
From what I understand of it, it can be extremely inefficient when working with big lists.


You don't give library writers enough credit. There are commercial standard library implementations, including the one MS licenses for Visual Studio. Do you think they could get away with providing "extremely inefficient" code? They'd go out of business before the year ends.

You've got to realize that a STL list is a perfectly ordinary doubly-linked list. If a high-school student can write one, you can hope that a professional library writer can do a better job of it. It's not like they can really botch it anyway - not and pass their hiring interview (at least I hope).

Quote:
Something about trying to move from node to node without an iterator.


And with your custom implementation, how would you go from node to node without a node pointer? Yes, restarting from the beginning of the list every time under the pretense of providing random access is a common noob error.

Quote:
edit: What about the case where an object may reference a node, and the node changes lists? The iterator is still valid?


Depends on how "the node changes lists". Have a look at std::list::splice(). Check your documentation to see if it invalidates iterators. Personally, I doubt it does, that would be dumb.

Share this post


Link to post
Share on other sites
Quote:
Original post by _the_phantom_
true, but then if you didnt spend all that time on the libraries you could have fun making something new and impressive like a game instead of getting burnt out trying to do better than the standard lib. (beyond learning that is)

Probably, yes. That is if my sole and ultimate goal was to produce some result, which it isn't. I would be a first rank idiot heading into game programming if I didn't love the programming part. It wouldn't be worth it if I thought of programming as something that burns you out if you don't get it done swiftly enough. I do game programming only because it keeps challenging me with all these different programming problems to be solved. I use standard librarys only when the problem at hand is far beyond my limits.

Share this post


Link to post
Share on other sites
First of all, don't view this as an attack on STL lists. I have no place to do so, considering I obviously don't even know how they work.

Quote:
When you so remove an element from a container, the iterator that pointed to it obviously becomes invalid.

If this is true and I understand it correctly, this is not the case with a custom linked list. You can reference a node from outside sources, move that node from list to list to list, and that reference is still valid. Only killing off the actual object invalidates the reference.

Quote:
I do. Many other C++ programmers do. Heck, any C++ programmer should - why write a custom list class when you have a standard one? Imagine having to deal with each individual programmer idiosyncratic list code...

Actually, it's a pretty simple setup. Two links to get from node to node, two list-holder links. Not much to get confused about.

Quote:
You don't give library writers enough credit. ... Do you think they could get away with providing "extremely inefficient" code?

I was looking at it from more of an angle of non-specific. Writing standard code that should work for everything is much more difficult, and sometimes not as efficient, as writing something to work for what you need.

Quote:
And with your custom implementation, how would you go from node to node without a node pointer? Yes, restarting from the beginning of the list every time under the pretense of providing random access is a common noob error.

What I expected was that STL used it's own encapsulation class/structure to hold a pointer to the actual object. The next/previous links would also be stored in the structure. To move through the list, you need a reference to the structure, not the object. What about in a situation where the object itself wants to access the next object in it's list? Just an example.

Share this post


Link to post
Share on other sites
Quote:
Original post by Jiia
First of all, don't view this as an attack on STL lists. I have no place to do so, considering I obviously don't even know how they work.

Quote:
When you so remove an element from a container, the iterator that pointed to it obviously becomes invalid.

If this is true and I understand it correctly, this is not the case with a custom linked list. You can reference a node from outside sources, move that node from list to list to list, and that reference is still valid. Only killing off the actual object invalidates the reference.

You're also exposing the internals of the data structure to the programmer, which is generally a big no no. This is a distinct advantage iterators have over a home grown linked list. It's a generic interface provided by all of the standard containers.
Quote:

Quote:
I do. Many other C++ programmers do. Heck, any C++ programmer should - why write a custom list class when you have a standard one? Imagine having to deal with each individual programmer idiosyncratic list code...

Actually, it's a pretty simple setup. Two links to get from node to node, two list-holder links. Not much to get confused about.

Have two programmers write you a linked list and compare them. Now have them write the set of standard algorithms that can be applied to std::list. Don't forget your adapters, allocators, and traits. Now provide an interface that is as clean as iterators are. There is a reason most C++ programmers use the standard library containers in the enterprise world.
Quote:

Quote:
You don't give library writers enough credit. ... Do you think they could get away with providing "extremely inefficient" code?

I was looking at it from more of an angle of non-specific. Writing standard code that should work for everything is much more difficult, and sometimes not as efficient, as writing something to work for what you need.

Library writers typically know more about the compiler they are targetting than any third party programmer will. They work closely with the compiler vendors to ensure that their code is optimal to the best of their abilities. It would be very hard for someone to do the same without specializing the container to such a degree as to make it useless except for a particular problem. Not to mention, such premature optimization is discouraged.
Quote:

Quote:
And with your custom implementation, how would you go from node to node without a node pointer? Yes, restarting from the beginning of the list every time under the pretense of providing random access is a common noob error.

What I expected was that STL used it's own encapsulation class/structure to hold a pointer to the actual object. The next/previous links would also be stored in the structure. To move through the list, you need a reference to the structure, not the object. What about in a situation where the object itself wants to access the next object in it's list? Just an example.

If the object wants to access the next object in the list, then it will have to do it the same way an object stored in your home grown list will. It will have to iterate over the list to the next object. Unless you're doing something insane like making the linked list node be the object as well.

I would recommend getting a good book on the standard library. It would be very educational, and will give you yet another tool that you can use without duplicating existing code and functionality.

Share this post


Link to post
Share on other sites
Quote:
Original post by Washu
You're also exposing the internals of the data structure to the programmer, which is generally a big no no. This is a distinct advantage iterators have over a home grown linked list. It's a generic interface provided by all of the standard containers.

Not necessarily. The links can be protected like any other member of a class. List::Insert, List::Remove, List::GetFirst, List::GetLast, Node::GetNext, Node::GetPrevious. Even the object which exists as a node can be protected from modifying it's own list attributes.

Quote:
Library writers typically know more about the compiler they are targetting than any third party programmer will. They work closely with the compiler vendors to ensure that their code is optimal to the best of their abilities. It would be very hard for someone to do the same without specializing the container to such a degree as to make it useless except for a particular problem. Not to mention, such premature optimization is discouraged.

I don't really understand what this has to do with the compiler or optimization. It's not about being optimal, it's about the algorithm not fitting the purpose. I'll try to give another example. If I have nodes which exist in several lists, what if I want to iterate through one list by using a node from another? I have no clue where that node may exist in the other lists. A custom list node can hold enough information to navigate on it's own. From what I understand, STL cannot do this without searching all of the other lists. It's not unoptimzied code, it's a huge waste of processing.

Quote:
If the object wants to access the next object in the list, then it will have to do it the same way an object stored in your home grown list will. It will have to iterate over the list to the next object. Unless you're doing something insane like making the linked list node be the object as well.

Of course I'm doing something insane. I'm not completely rational. Many of my objects are born and raised to be list nodes. Of course you could always do a trick of deriving a templated class from the object so that any type of object can be used, but I've never found myself doing this.

Quote:
I would recommend getting a good book on the standard library. It would be very educational, and will give you yet another tool that you can use without duplicating existing code and functionality.

I have the MSDN library. Could probably find just about all I need to find in there. There are also good reference documents on the internet. I wouldn't recommend spending money on a book.

Share this post


Link to post
Share on other sites
Quote:
Original post by Jiia
You can reference a node from outside sources, move that node from list to list to list, and that reference is still valid.


What is the significance of a node that does not belong to a list anymore?
If you intend it do be treated as a one-element list, then what you'll do is create an empty list and splice the element into it. If you mean that you can just keep your node around, with invalid previous/next pointers then I'm afraid I'll just have to stab you in the arm with a rusty pitchfork, for that is an abomination: extract the data, don't keep the node. Inefficient? Doesn't have to be - depends on how the data was put into the node in the first place (value? pointer? smart-pointer?)

Quote:
Actually, it's a pretty simple setup. Two links to get from node to node, two list-holder links. Not much to get confused about.


Yes, I know that a linked list is simple internally, it's dealing with the interfaces that is hell.

Quote:
I was looking at it from more of an angle of non-specific. Writing standard code that should work for everything is much more difficult, and sometimes not as efficient, as writing something to work for what you need.


This is correct, though for the standard containers, the only place where you could really do things differently in your custom implementation is for memory allocation, and the STL gives you a customization hook in the form of the Allocator parameter. Otherwise, as we've mentioned before, there aren't zillions of ways a linked list or a dynamic array (vector) can be implemented internally (at least, in structurally significant ways).

Quote:
What I expected was that STL used it's own encapsulation class/structure to hold a pointer to the actual object. The next/previous links would also be stored in the structure.


Whether a list iterator is the actual node type that makes up the linked list, or is a wrapper around a node pointer is an implementation detail which shouldn't affect how they work. As Washu mentioned, you can expect implementers to have picked whatever works best for the compiler they have in mind.

Quote:
To move through the list, you need a reference to the structure, not the object.


Yes, that's what the iterator is.

Quote:
What about in a situation where the object itself wants to access the next object in it's list? Just an example.


Unless you assume the object knows it is in a list and thus is part of a node object (i.e. you have an "intrusive list"), you can't, without storing the iterator, or node pointer, or whatever.

Share this post


Link to post
Share on other sites

This topic is 4565 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this