Back to General and Gameplay Programming

String performance in C++?

General and Gameplay Programming Programming

Started by 1hod0afop July 10, 2008 01:56 AM

25 comments, last by chairthrower 15 years, 9 months ago

100

July 11, 2008 01:43 AM

Quote:Original post by 1hod0afop
Quote:Original post by EasilyConfused
Are you sure that you are not profiling in debug mode? That seems like a huge time difference.

OK here's a simple program
*** Source Snippet Removed ***

In Visual Studio 2005, it outputs
Debug build : 3015 ns
Release build : 860 ns

However, by linking CRT library static it gets a little faster
Release build /MT : 657 ns

GetTickCount is in milliseconds, not nanoseconds btw.

1hod0afop

100

Author

July 11, 2008 01:47 AM

Quote:Original post by thre3dee
GetTickCount is in milliseconds, not nanoseconds btw.

I know. That's why the program loops 1,000,000 times.

1hod0afop

100

Author

July 11, 2008 02:06 AM

Quote:Original post by EasilyConfused
Quote:Original post by 1hod0afop
Quote:Original post by Ra
Give an example of your supposed usage. It's probably abusive.

I'm highly negative of using std::string. On my computer just concatenating 2 std::string's each 10 characters-long takes 0.9 miliseconds! (don't think I'm using 10-year-old or such antique machine) If I instead use strcpy/strcat it takes only 50 nanoseconds. That's why I never use std::string. Now the reason why I'm doing this is to allow such abuse of strings. Actually the guys behind me are using std::string now also. They're making me crazy. Analyzing their code, find such abuses, optimizing ... It's time to end this crazy course.

Are you sure that you are not profiling in debug mode? That seems like a huge time difference.

Sorry for my mistake, microseconds. That's still too slow for me. ^^

iMalc

2,466

July 11, 2008 02:14 AM

Quote:Original post by 1hod0afop
OK here's a simple program
*** Source Snippet Removed ***

That's not valid C++. main must be declared as returning int, NOT void.

I personally see 860ns for 1 million string concatenations as being blazingly fast. That's over 1.1 billion per second!
If you ever find such std::string usage to be a bottleneck in a real-life program then I'd be happy to hear from you. (Assuming GDev.net is still around then)

"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms

1hod0afop

100

Author

July 11, 2008 02:22 AM

Quote:Original post by iMalc
Quote:Original post by 1hod0afop
OK here's a simple program
*** Source Snippet Removed ***
That's not valid C++. main must be declared as returning int, NOT void.

I personally see 860ns for 1 million string concatenations as being blazingly fast. That's over 1.1 billion per second!
If you ever find such std::string usage to be a bottleneck in a real-life program then I'd be happy to hear from you. (Assuming GDev.net is still around then)

No, the output 860ns is for 1 string concatenation, meaning 1.1M per second.

Antheus

2,410

July 11, 2008 06:57 AM

Quote:Original post by 1hod0afop

Sorry for my mistake, microseconds. That's still too slow for me. ^^

What you're doing is:
- allocate new string
- two strings are concatenated into temporary copy, causing heap allocation
- temporary string is larger than internal buffer of result, another heap allocation
- temporary is copied into result
- temporary is deleted, along with its heap allocation
- result goes out of scope, its heap allocation is deleted

Of course it's slower than strcat. You're doing 10 times more work, not to mention doing 2 heap allocations more.

But before you are able to analyze the problem, there really is no point in blaming the tools.

BTW: how many concatenations do you perform per second, how many per application life-time?

Bregma

9,461

July 11, 2008 07:10 AM

Quote:Original post by 1hod0afop
OK here's a simple program

Ah, good old hand-tuned benchmarks to give the desired performance characteristics.

I'll bet you anything the C orange you were comparing this C++ apple to was faster because it was not at all the same. My guess is that this would be a better C++ comparison. Post your C code and I could tell for certain.

#include <iostream>#include <string>int main(){	unsigned int uStart = GetTickCount();	std::string str1 = "0123456789";	std::string str2 = "abcdefghij";	std::string concat;	for( int i = 0; i < 1000000; i ++ )		concat = str1 + str2;	unsigned int uElapsed = GetTickCount() - uStart;	std::cerr << uElapsed << " ns\n";}

I don't have access to Visual Studio, so I'd ask you to run that through your tests and report on it.

I would argue that ignorance of how to use a tool is not a valid reason to criticize a tool as inadequate for the job.

Stephen M. Webb
Professional Free Software Developer

Trienco

2,555

July 11, 2008 04:02 PM

Most of all I would check the compiler output, because I'm naive enough to expect any semi-decent compiler to detect the pointlesness of the entire loop and just optimize it away.

f@dzhttp://festini.device-zero.de

1hod0afop

100

Author

July 12, 2008 02:28 AM

Quote:Original post by Bregma
Post your C code and I could tell for certain.

Here

#include <windows.h>#include <stdio.h>#include <string.h>void concat( char* pDest, const char* pStr1, const char* pStr2 ){	strcpy( pDest, pStr1 );	strcat( pDest, pStr2 );}void main(){	unsigned int uStart = GetTickCount();	const char* pstr1 = "0123456789";	const char* pstr2 = "abcdefghij";	char buf[21];	for( int i = 0; i < 1000000; i ++ )	{		_asm		{			mov ebx, pstr2			mov eax, pstr1			lea edx, buf			push ebx			push eax			push edx			call concat			add esp, 0ch		}	}	unsigned int uElapsed = GetTickCount() - uStart;	printf( "%d ns\n", uElapsed );}

My compiler often optimizes away strcpy/strcat function calls so I had to use tricks to fool the optimization. Here's the assembly code generated by VC2005.

... omitted ...;	const char* pstr1 = "0123456789";0040106E  mov         dword ptr [ebp-28h],offset string "0123456789" (4020F4h) ;	const char* pstr2 = "abcdefghij";00401075  mov         dword ptr [ebp-20h],offset string "abcdefghij" (402100h) ;	for( int i = 0; i < 1000000; i ++ )	{0040107C  mov         esi,0F4240h ;		_asm;		{;                       mov ebx, pstr200401081  mov         ebx,dword ptr [ebp-20h] <------+;                       mov eax, pstr1               |00401084  mov         eax,dword ptr [ebp-28h]        |;                       lea edx, buf                 |00401087  lea         edx,[ebp-1Ch]                  |;                       push ebx                     |0040108A  push        ebx                            |;                       push eax                     |0040108B  push        eax                            |;                       push edx                     |0040108C  push        edx                            |;                       call concat                  |0040108D  call        concat (401000h)               |;                       add esp, 0ch                 |00401092  add         esp,0Ch                        |00401095  sub         esi,1                          |00401098  jne         main+31h (401081h) ------------+;		};	}... omitted...;void concat( char* pDest, const char* pStr1, const char* pStr2 );{;	strcpy( pDest, pStr1 );00401000  mov         eax,dword ptr [esp+8] ; pStr100401004  push        esi  00401005  push        edi  00401006  mov         edi,dword ptr [esp+0Ch] ; pStr20040100A  mov         edx,edi 0040100C  sub         edx,eax 0040100E  mov         edi,edi 00401010  mov         cl,byte ptr [eax] ; pDest  <--+00401012  mov         byte ptr [edx+eax],cl         |00401015  add         eax,1                         |00401018  test        cl,cl                         |0040101A  jne         concat+10h (401010h) ---------+;	strcat( pDest, pStr2 );0040101C  mov         eax,dword ptr [esp+14h] 00401020  mov         edx,eax 00401022  mov         cl,byte ptr [eax] 00401024  add         eax,1 00401027  test        cl,cl 00401029  jne         concat+22h (401022h) 0040102B  sub         eax,edx 0040102D  add         edi,0FFFFFFFFh 00401030  mov         cl,byte ptr [edi+1] <---------+00401033  add         edi,1                         |00401036  test        cl,cl                         |00401038  jne         concat+30h (401030h) ---------+0040103A  mov         ecx,eax 0040103C  shr         ecx,2 0040103F  mov         esi,edx 00401041  rep movs    dword ptr es:[edi],dword ptr [esi] 00401043  mov         ecx,eax 00401045  and         ecx,3 00401048  rep movs    byte ptr es:[edi],byte ptr [esi] 0040104A  pop         edi  0040104B  pop         esi  ;}0040104C  ret

On my machine the program gives me the result "94 ns". But wait! If I make my own string class, I can optimize further. (if heap allocation cost is low enough) How? I can store length of string in the string class, so I can eliminate 2 loops (401010 - 40101A, 401030 - 401038), which allows me double the speed.

Quote:Original post by Bregma
I would argue that ignorance of how to use a tool is not a valid reason to criticize a tool as inadequate for the job.

Quote:Original post by Antheus
But before you are able to analyze the problem, there really is no point in blaming the tools.

Certainly. But I'm talking about tradeoffs between performace and simplicity. One can get similar performance with std::string by writing tricky code. What's the point of doing that? If you have to seriously consider using "+" overloaded operator, why don't you just sweep away it? And you won't be able to get same performance as strXXX functions since std::string always involves heap operation.

If you can obtain high performance in convenient way, (for example, let's say SOMETHING::string concat = str1 + str2; takes 100 ns) will you still adhere to std::string? (Don't try to bother me with porting problems. Let's imagine SOMETHING::string is fully compatible with std::string)

Regards

KulSeran

3,272

July 12, 2008 03:12 AM

1) I think you mean miliseconds (GetTickCount isn't nanoseconds)
2) you can always call std::string::reserve if you know you are doing a lot of
concatanations (equivelent to your wonderful buf[21] trick)
3) If string are you bottleneck, and you don't NEED strings, you may as well replace them with something like CRC hashes of the strings in question.
4) If you do NEED string, you may want to rethink your algorithm anyway. Lots of temporary copies and concatenations
can be a sign of you doing it wrong to start with. Changing how you make you pass over the dataset can mean that you
don't even need to be making copies of anything.
5) get a custom allocator for the string, especially if you know you are going to be allocating many small strings.

You have to remember that std::blah is a GENERAL PURPOSE tool. You can always speed it up by giving it knowledge about the
current situation you are using it in. And you can probably write a faster SINGLE PURPOSE tool, but you have to weigh in the cost
of time spent on your tool vs. just using a proven library.

My Blog

String performance in C++?

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

String performance in C++?

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines