Sign in to follow this  

Memory Allocator for string class

This topic is 3848 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hello everybody . I'm actually trying to find a way reduce the amount of call to new/delete in my simple string class. Actually, every time a string is constucted it allocate memory space in order to store the char array, which in final drive to a lot of small memory allocation... I wonder if anybody of you know a kind of memory allocator, that can help me to make less memory allocation ( pre allocating a string buffer ... ?? ..., using a small object memory allocator ... ??? ...) If you have any idea... Thanks !

Share this post


Link to post
Share on other sites
The first problem is about calling new and delete every time i need to use a string ...

I would like to use something faster ...

And the second reason, is about memory fragmentation with all that small allocation ...

I'm making a 3d engine, and i use this class pretty intensivly ...

You think it's not a problem at all to use this class without efficient memory management ??


Share this post


Link to post
Share on other sites
What language? The new/delete part seems to suggest C++, but if you really use C++ then you should just use std::string unless you have a good reason not to (you can't write anything faster).

There are plenty of things you can do. For small strings you could have a regular character array and when a string's length exceed some fixed number you start using dynamic memory allocation (done by several std::string implementations).

You might also allocate large chunks at a time. For example you might have a string with 200 elements and you don't have room for any more. Then you add a character to the end, but instead of freeing the 200 character elements and allocating 201 character elements you could simply allocate 400 characters to avoid many small allocations. Then when you exceed 400 characters you could allocate 800 characters. A flexible string class would allow the user to specify an optional policy both for when to allocate more memory and how much more memory.

If you delete a string of 100 characters then you might choose to throw it on a list. Then the next time you create a string less than 101 characters you just take that object and use it. Of course when multiple elements are on the list you should choose the smallest one with a larger size than is required.

Copy-on-write might also reduce some memory allocations, but in practice it can often be too slow. Especially with multithreading.

For some applications using ropes for strings might even boost your performance.

Quote:
using a small object memory allocator

How would that help? I guess your string objects are likely to differ in size and you want the content to be laid out in a sequential fashion.

Share this post


Link to post
Share on other sites
Quote:
Original post by ClementLuminy
You think it's not a problem at all to use this class without efficient memory management ??


Fragmentation is always a problem and unless you intend to defragment your memory (might actually be possible while paused or on a loading screen) then you're going to run into trouble when someone plays your game for a long time.

Share this post


Link to post
Share on other sites
Quote:
Original post by CTar
Quote:
using a small object memory allocator

How would that help? I guess your string objects are likely to differ in size and you want the content to be laid out in a sequential fashion.


I have had some success using small object allocators for strings. I have a few allocators for different sizes, 32, 128, 512 and 2048 bytes, and allocate from the smallest pool possible or new them if they are larger than the biggest pool.

However, as CTar is getting at, is this *really* a problem? Have you profiled and found string alloation to be taking up an unreasonable amount of your time?

You talk about fragmentation, but on Windows at least you can usually rely on virtual memory to deal with it for you. In the cases where you can't, design a system that solved your specific problem (ie, why are you burning through so many temporary string allocations?).

Alan

Share this post


Link to post
Share on other sites
in fact i'm developping this engine as a hobby, so i do not have any other platform than windows to test my system ( i know ... it sound funny :) )

BUT, i want to develop my system has if was able to run it on consoles ...
Which simply mean that i try to take in account every problem that can occur on those platforms .... ( i.e fragmentation )

And also for those reasons ( and also mainly for learning purpose ), i do not use any kind of stl stuff including their string class ....

Can you just tell me a little more about your small object memory allocators ??
Do you have a link to a page that explain how does it works ???
Is it efficient ???
And an other thing :
By using this allocator, does the problem of fragmentation just move from system memory to custom allocator memory ???

Thanks a lot...

And again, it's for performances AND for learning purposes that i'm doing that ...


:)

Thanks !

Share this post


Link to post
Share on other sites
Quote:
Original post by ClementLuminy
And also for those reasons ( and also mainly for learning purpose ), i do not use any kind of stl stuff including their string class ....

That seems like a bad idea. You aren't likely to encounter significant fragmentation if you use the standard library well.

Doing something for learning purpose is a bad idea in a big project. If you want to learn how to create data structures do it in a seperate project because if you do it as a part of a bigger project you will not learn as much and the project as a whole will suffer from this.

Quote:
Can you just tell me a little more about your small object memory allocators ??
Do you have a link to a page that explain how does it works ???
Is it efficient ???

He explained it in his post. For a 42 character string you'd just take a 128 byte block and use because it's the smallest available which can contain 42 characters. When you stop using strings you just let go of the block and let someone take it.

This can result in a little internal fragmentation because you might have some blocks, but not the kind you'd like to use, but there are ways to get around that (the buddy memory allocation algorithm is a well-known efficient one, but still suffers from some internal fragmentation).

Share this post


Link to post
Share on other sites
Quote:

BUT, i want to develop my system has if was able to run it on consoles ...
Which simply mean that i try to take in account every problem that can occur on those platforms .... ( i.e fragmentation )

This is an excellent way to waste your time. Until you've deployed the system on a target platform, you don't know what the systems problems will be on that target platform. You should not try to optimize for a situation you have not encountered yet; this is premature optimization. Furthermore, as others have suggested, it might be micro-optimization as well -- if a profiler told me I was allocating a lot of strings, I'd spend my time trying to reduce the need to allocate strings, not trying to make the allocations themselves faster. The fastest code is code that never runs.

Quote:

And also for those reasons ( and also mainly for learning purpose ), i do not use any kind of stl stuff including their string class .... And again, it's for performances AND for learning purposes that i'm doing that ...

Those two are at odds. If you want to write a string class to learn, fine (it's not among the best ways to learn, because you won't really know if you've made stupid design or implementation choices, but I digress). However, what you write to learn should not be intended for production code. It is highly unlikely that you will be capable of outperforming the SC++L string class, in general, at this moment in time. std::basic_string<char,T> where T is a custom allocator will probably alleviate the worst of the memory pressure.

Your responses suggest that you're not particularly well-informed regarding the SC++L's performance or methods to increase or tweak its performance, nor do you seem to be particularly familiar with some of the relevant optimization methods. This suggests that writing the class for educational purposes is a halfway decent plan, but thinking that you're writing it for "performance reasons" and to support theoretical console deployment is folly.

Quote:

Can you just tell me a little more about your small object memory allocators ??
Do you have a link to a page that explain how does it works ???

The book "Modern C++ Design" has some prose on small object allocators. Google turns up some relevant results as well.

Share this post


Link to post
Share on other sites
Quote:
By using this allocator, does the problem of fragmentation just move from system memory to custom allocator memory ???


Yes.

It'll be even worse, since your allocator will not be scalable or thread-safe, unless you go to incredible lengths to ensure that. And at that point, you'll be back at square one.

The first solution to allocations is to avoid them. May sound stupid, but rather than trying to solve a non-existant problem, reduce the number of allocations first.

Pass strings by reference. Allocate them once. When concatenating, use pre-allocated buffers.

And this should eliminate the bulk of your problem. This way you also pinpoint where allocations are necessary. There are very few places where you really need to allocate strings - concatenation is not one of them, neither is passing them around.

As always - the fastest operation and best allocation is the one not performed.

And until you start profiling and testing your console on actual console, unless you have heaps of experience with such systems, you *cannot* determine what the problems will be.

Will fragmentation be a problem? Possibly. But will your game really run for that long, or will it purge entire state every 5 minutes, releasing all of the allocated memory?

The only benefits from custom memory allocation come from optimizing for usage patterns, not some abstract general cases. And even then that only happens when, and only when, you have determined that your application will have problems with that, after you have eliminated all the redundant allocations.

Share this post


Link to post
Share on other sites
Quote:
Original post by ClementLuminy
And also for those reasons ( and also mainly for learning purpose ), i do not use any kind of stl stuff including their string class ....


I made this mistake as well.

Rather than learning how to reinvent the wheel (which usually turns out to be some giant monstrosity with jagged edges, holes and usually ends up running away and killing small innocent animals), learn how to use STL well.

STL isn't just a straight forward shortcut. Sure any moron can USE it, but you need to know how to use it CORRECTLY to make the most of its power.

Learn STL. It is your friend. It might even help you wheel those poor little animals to a vet.

Share this post


Link to post
Share on other sites
Quote:
Original post by instinKt
Learn STL. It is your friend.

I see the STL more as my enemy, to be honest. The implementation of std::string isn't very good at all in most commonly used standard library implementations, especially the memory handling part (and that includes msvcp80).

Now, I fully agree with everybody who said that you should profile first. Do not simply assume a bottleneck, because there's a 99% chance you will be wrong. It takes a lot of experience to be able to assume an a-priori bottleneck, and it's not uncommon even for pros to be completely off with such an assumption.

However, once you have determined that a particular implementation of std:string becomes a bottleneck (which isn't that uncommon, depending on how liberally you use strings in your application), then implementing your own is useful. But as others have mentioned, it takes quite some experience to be able to design a well behaved and efficient string class. And making it MT-safe is really hard.

As Alan Kemp above, I had very good experiences using a small object allocator. Very small strings are common as temporary objects, for example during concatenation operations or by-value function calls/returns, so a lot of strings are very short lived. A small object allocator is perfect for these. I use it up to sizes of 512 byte, which reduces waste to a very acceptable level, especially since reuse is high. Above that, I use custom allocation pools with on the fly usage histograms, that will try to foresee the next allocations, and preallocate appropriate memory blocks. For my specific application, that makes heavy use of strings, this implementation is around three times faster than std::string (from VC8).

Quote:

Pass strings by reference. Allocate them once. When concatenating, use pre-allocated buffers.

And this should eliminate the bulk of your problem. This way you also pinpoint where allocations are necessary. There are very few places where you really need to allocate strings - concatenation is not one of them, neither is passing them around.

[rant]
Why ? Why should I, as a programmer, have to bend to the imperfections of a third party library, that imposes me certain usage patterns and arbitrary restrictions ? For example, why should I have to use clumsy preallocated buffers, when doing a simple concatenation chain is so much more elegant ? Well, there is no reason why. A standard library has to adapt to the usage pattern of the programmer, not the other way round. Especially if there is no technical reason not to do so.

Memory allocation management (or the lack thereof) is one of the worst parts of the STL (or SC++L, or whatever is its fancy name-of-the-day). Rewriting custom memory allocators for all STL container classes can speed you up an order of magnitude, even for general cases. Unfortunately, STLs current custom allocator interfaces are as bad as the allocators themselves.
[/rant]

Anyway, ClementLuminy, profile before you go into redesigning a custom string class or allocator. Because it will be a rough ride...

Share this post


Link to post
Share on other sites
Quote:
Original post by Yann L
Quote:
Original post by instinKt
Learn STL. It is your friend.

I see the STL more as my enemy, to be honest. The implementation of std::string isn't very good at all in most commonly used standard library implementations, especially the memory handling part (and that includes msvcp80).


I would say that to know when to not use STL one would need to have a firm understanding of STL to begin with.

You could spend a lot of time writing your own custom templates/classes/functions to do what STL can already do, and a lot of the time you'll end up performing worse than STL and that has been documented in many individual dev. journals.

Like you say, profile. Use the libraries already supplied and then profile. If you find that the libraries you've used are a bottleneck then spend the extra time trying to do it better.

Share this post


Link to post
Share on other sites

This topic is 3848 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this