Getting Rid of "char*"

Started by
38 comments, last by Brain 9 years, 3 months ago

Let's pretend it is an issue. How do you think you would solve it?

Actually, let's not pretend it's an issue; that's exactly what causes premature optimisation.

Here is the procedure to "solve it".

First prove it is an issue, via profiling. You've either done that and not posted results, or you haven't done that. Either way there is more work required at this step.

After that, your first attempt at a resolution should be to see how you can use strings less. E.g. if profiling showed that you had a bottleneck when comparing your string object IDs, then perhaps you could switch to using enumerations instead of strings. Or, if you're getting too many string copies, perhaps there are some critical places where the string is passed by value instead of const-reference.
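As a rough illustration of those two suggestions, here is a minimal sketch. The names ObjectId, lengthByValue, lengthByRef and countMatching are made up for this example and do not come from anyone's actual code in this thread.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical object IDs. Comparing two enumerators is a single integer
// compare, whereas comparing two string IDs generally has to look at the
// character data.
enum class ObjectId { Player, Enemy, Pickup };

// Pass by value: calling this with an existing string copies it,
// which may allocate.
std::size_t lengthByValue(std::string s) { return s.size(); }

// Pass by const reference: no copy of the string is made.
std::size_t lengthByRef(const std::string& s) { return s.size(); }

// Counts how many entries match the wanted ID using integer comparisons only.
std::size_t countMatching(const std::vector<ObjectId>& ids, ObjectId wanted) {
    std::size_t n = 0;
    for (ObjectId id : ids) {
        if (id == wanted) ++n;
    }
    return n;
}
```

Whether either change matters in a given program is still something only the profiler can tell you.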

The next thing to do is to look for a replacement that already exists. That may be in the form of a more-specialised part of the same library, or a whole other library. There is plenty that you can research.

Next, in considering whether to write your own replacement, you need to understand that in proceeding down the path of making your own, you are assuming that you can do a better job than the expert standard library writers. You are then assuming that you are in the top few percent of all C++ coders. Most people like to think that they are in the top few percent of C++ coders, but obviously most aren't. The experts are those who have first learnt to use existing tools very effectively (e.g. strings). One must learn why all the parts of the string class work as they do, in order to not fall into traps that others have already solved.

The final thing to understand is that even when there is a bottleneck, and even when you've done your best to optimise all related code as much as you can, there is always going to be a bottleneck somewhere. It's even possible that no matter what changes you make to a program, string handling could be such an unavoidable bottleneck (although that is generally unlikely, depending on what the program does).

You should probably read the thread, or at least my posts, before writing such a long post telling me string is fine as is.

It was a proxy problem for a different problem I am having whose solution was similar.

EDIT: Good advice re: optimization however.

I've read the thread and posted in it twice already, I believe, and here, a few days later, I'm still going to tell you that you're wasting time trying to optimize something that you clearly have no idea how to optimize. There are few options that actually optimize it without costing you something else, such as wasted memory.

If you're actually having this problem, I'm going to go ahead and bluntly say that your code is wrong: you are doing something else wrong. You could be using ten thousand strings and it would still be almost completely unrealistic to assume that the string class is to blame for performance problems. Plenty of AAA games/engines get away with using dynamic strings and don't have a problem. In fact, I've yet to see an engine that switched to making all its strings templated, or to using C-style char array strings, for the sake of performance; it would make the code extremely unmaintainable.

String is fine as is.

This has nothing to do with string or the STL in general, I have said that more than once in my recent posts.

Am I being trolled? lol

Either way I learned some stuff from this thread and worked my issue out.

Thanks to everyone that posted with advice!

Now you downvote my posts? After you have been so incredibly unhelpful?

I hope there is an ignore feature so I don't have to accidentally read one of your horrible posts again.

I think you need to pause for a minute and just accept that the answer given doesn't match your own opinion. I agree that in 99% of cases rewriting string is not what you need. Maybe your use case is in the 1% where you do need to, but even in this case I personally cannot see it. Let's not get hung up on calling each other trolls and getting into upvote/downvote wars over who is correct and who isn't, because this can only end one way: with the heavy hand of the mods descending upon us and setting it right their way. :)

Agreed.

I chose to present the problem as if I was rewriting the STL string class, because everyone is familiar with string. I never intended to actually rewrite the string class. In retrospect, I probably should never have chosen string as a proxy for my issue.

In C++ generally, wrapping a fixed amount of memory, the amount of which is determined at compile time, into a class is a fairly new concept. Direct support for this was added to the standard library in C++11, I believe, through std::array (as was mentioned in this thread). (Although you could always have written a fixed allocator before that.) Unfortunately there still isn't a perfect way to do it, as you have to pay the man one way or another. This thread has made it clearer what trade-offs you have to make when representing a fixed array as a class, and shown a few different ways of going about it.

Classically, a fixed array would be manipulated in one of four ways: using an external editor class which you attach to the memory; using functions which act like the external class but to which you must supply the source memory on each call; using methods contained in the class which allocated it; or, more recently, using a template to wrap the memory itself into a class (which is the most object-oriented way of going about it).

float textureMemory[ Size ];

MemoryEditor editor;

editor.Attach( textureMemory );

editor.Edit( ... );

vs.

float textureMemory[ Size ];

EditFunction( textureMemory, ... );

vs.

class Texture {

public:
... (memory editing methods go here)

private:

float textureMemory[ Size ];

};

vs.

TextureMemory< Size > textureMemory;

textureMemory.Edit( ... );

There really doesn't seem to be a right answer here that is always right, as with many things in programming. Several codebases that I am familiar with each take a different route, or mix them a bit. Each solution has tradeoffs, but if we understand them properly we can at least try and select the best tool for the problem. I think I understand them better now thanks to this thread.
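To make the comparison a little more concrete, here is a minimal sketch of the free-function style next to the template-wrapper style. The names fillBuffer and FixedBuffer, and the fill operation itself, are hypothetical stand-ins for the elided editing methods above, and the wrapper assumes std::array is available (C++11).

```cpp
#include <array>
#include <cstddef>

// Free-function style: the caller must supply the memory and its size
// on every call, and nothing stops them passing the wrong count.
void fillBuffer(float* data, std::size_t count, float value) {
    for (std::size_t i = 0; i < count; ++i) data[i] = value;
}

// Template-wrapper style: the size is part of the type, so the editing
// methods always know it.
template <std::size_t Size>
class FixedBuffer {
public:
    void fill(float value) { storage_.fill(value); }
    float& operator[](std::size_t i) { return storage_[i]; }
    const float& operator[](std::size_t i) const { return storage_[i]; }
    constexpr std::size_t size() const { return Size; }

private:
    std::array<float, Size> storage_{};  // stored in-line, zero-initialised
};
```

The price of the wrapper is that FixedBuffer<8> and FixedBuffer<16> are unrelated types, so any code that accepts one must either be a template itself or fall back to a pointer-and-size interface.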

Ok - I think I can see where you're coming from, to which I'll offer a short (ha) little explanation as to why I don't think what you're asking for is a good question:

You want to allocate a fixed amount of memory. Well, at some point that memory needs to be freed. This means that A) someone needs to know how much memory to free, and B) someone needs to know where to free the memory.

You can also either have a pointer to memory, or directly include the memory in the containing object or stack.

std::vector is an array type that stores a pointer to its element memory, and is the replacement for a heap-allocated T* from C. With the default allocator it always allocates its elements off the heap, and std::vector remembers its size so it can deallocate in the destructor. Because the element count is stored at runtime, the size of a std::vector object is the same no matter how many elements it has. So users of std::vector don't have to care about the size.

std::array is an array that stores X elements of a type in-line with where it is defined, and is a replacement for T[size] from C. In order to hold those elements, however, its own size must grow with the element count. Therefore everyone that uses it has to know the size of the std::array, and therefore needs to know how many elements that array contains. If the array is on the stack, then the compiler needs to know how much stack space to allocate/deallocate, and if it is an object member then the compiler needs to know the size of the array so it can calculate the size of the containing object.
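The size difference described above can be checked directly. This is only a sketch: Sprite and vectorObjectSize are made-up names, and the static_asserts use >= because the standard does not promise that std::array adds no padding.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// The element count is part of std::array's type, so it shows up in sizeof.
static_assert(sizeof(std::array<int, 16>) >= 16 * sizeof(int),
              "std::array stores its elements in-line");
static_assert(sizeof(std::array<int, 256>) >= 256 * sizeof(int),
              "a larger array type is a larger object");

// A containing object grows with the in-line array it holds.
struct Sprite {
    std::array<float, 64> pixels;  // hypothetical member, stored inside Sprite
};
static_assert(sizeof(Sprite) >= 64 * sizeof(float), "Sprite embeds the array");

// The size of a std::vector object is fixed at compile time; only the
// separately allocated element storage grows with the element count.
std::size_t vectorObjectSize(std::size_t elementCount) {
    std::vector<int> v(elementCount);
    return sizeof(v);
}
```

Calling vectorObjectSize(1) and vectorObjectSize(100000) returns the same value, which is why code that takes a std::vector does not need to know the element count at compile time.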

The only way around this limitation is to use a pointer to heap-allocated memory, which basically means you're back to using std::vector if you don't want to care about size. Sure, you could use an array-like interface to a dynamically allocated std::array, but why? Now you have another resource to manage and you've lost the advantage of using std::array in the first place.

As a side note - using std::array/std::vector is much more than just "more OO". It adds a bunch of very nice safety nets that C simply cannot provide. For example, it is much harder to pass the wrong size to a function, since you can simply ask the array for its size or use iterators instead (not to mention the very large number of standard algorithms that work with it already!). Also, the array can detect out-of-bounds access with ease, protecting you from the dreaded "undefined behavior" that comes from stomping on random memory locations.

And to pre-empt another common and invalid complaint, std::array's and std::vector's automatic out-of-bounds checking is optional. They only do bounds checking when you do array.at(x). When using the subscript operator instead ( array[x] ), no bounds checking takes place, and it should be just as fast as regular arrays.
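A small sketch of the difference between the two access styles, using std::array (std::vector behaves the same way here). The helper names atThrowsFor and uncheckedGet are invented for this example.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Returns true if at() rejected the index by throwing std::out_of_range.
bool atThrowsFor(const std::array<int, 4>& values, std::size_t index) {
    try {
        (void)values.at(index);  // checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Unchecked access: the caller must guarantee index < values.size(),
// otherwise the behaviour is undefined.
int uncheckedGet(const std::array<int, 4>& values, std::size_t index) {
    return values[index];
}
```

For a four-element array, atThrowsFor reports false for index 3 and true for index 4, while uncheckedGet must never be called with index 4 at all.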

Depending on your implementation, "array[x]" may do bounds checking in debug builds via asserts. But yes, the standard does not require "array[x]" to check bounds (an out-of-range index is undefined behaviour), so in release builds it typically costs the same as indexing a standard C array - which means no bounds checking in release.

The Visual C++ 2012 runtimes do this; I know that for a fact. It is a useful safeguard when you are debugging.

This topic is closed to new replies.
