# Optimizing char translation



Is it possible to optimize the following piece of code? There are many unnecessary string copies...

std::wstring ascii_cast(const char *Ascii)
{
    // Get the number of chars in the string
    size_t Length = std::strlen(Ascii);

    // Create a temporary buffer (+1 for the terminating null)
    std::vector<wchar_t> Wide(Length + 1);

    // Convert 'Ascii' to Unicode
    MultiByteToWideChar(CP_ACP, 0, Ascii, -1, &Wide[0], (int)Length + 1);

    // Create a temporary wstring
    std::wstring WideStr(&Wide[0]);

    // Return the Unicode string
    return WideStr;
}



##### Share on other sites
Doesn't std::wstring inherit from std::vector anyway, so shouldn't it be possible to convert directly onto the result string? Let's say by calling the reserve() method to ensure that the string is large enough and letting MultiByteToWideChar write directly into it.
Also, it should be more efficient to pass the wstring as reference instead of returning a copy.
If you really need to maximize the performance, you could make an initial guess about the maximum string size and gradually convert more as needed. Calling strlen() on memory-mapped multi-megabyte documents is far from free.
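A minimal sketch of the "convert directly into the result string" idea, under two assumptions that are not from the thread: std::mbstowcs stands in for MultiByteToWideChar so the example compiles outside Windows, and the compiler is C++11 or later, where std::basic_string storage is guaranteed to be contiguous. The name ascii_cast_direct is hypothetical. Note that the string is sized with resize() rather than reserve(): reserve() only allocates capacity, and writing into elements that do not exist yet is undefined behavior.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Portable stand-in for the thread's ascii_cast. std::mbstowcs replaces
// MultiByteToWideChar here (an assumption, so the sketch builds anywhere);
// the buffer handling is the point of the example.
std::wstring ascii_cast_direct(const char* ascii)
{
    assert(ascii != nullptr);
    const std::size_t length = std::strlen(ascii);

    // resize-style construction: the elements exist before they are written.
    std::wstring wide(length, L'\0');
    if (length == 0)
        return wide;

    // Convert straight into the string's own storage, no intermediate vector.
    const std::size_t written = std::mbstowcs(&wide[0], ascii, length);
    if (written == static_cast<std::size_t>(-1))
        return std::wstring();  // input is not valid in the current locale

    wide.resize(written);       // multibyte input may yield fewer wide chars
    return wide;
}
```

With MultiByteToWideChar the same shape should apply: size the string first, pass &wide[0] and the element count as the output buffer, then shrink to the count the call reports.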

##### Share on other sites
> Doesn't std::wstring inherit from std::vector anyway

Imho not.

> so shouldn't it be possible to convert directly onto the result string

I don't think there is a way to write directly to a std::string...

> Also, it should be more efficient to pass the wstring as reference instead of returning a copy.

Wouldn't the local wstring be destroyed before then?

##### Share on other sites
> Original post by TrueTom
> > Original post by doynax
> > Doesn't std::wstring inherit from std::vector anyway
>
> Imho not.

Maybe not, and it's definitely not guaranteed. I just wanted to point out that it could be, since it's nothing more than a special case of a vector in most implementations anyway.

> From SGI's documentation on basic_string:
> Note that the C++ standard does not specify the complexity of basic_string operations. In this implementation, basic_string has performance characteristics very similar to those of vector: access to a single character is O(1), while copy and concatenation are O(N).

> Original post by TrueTom
> > Original post by doynax
> > so shouldn't it be possible to convert directly onto the result string
>
> I don't think there is a way to write directly to a std::string...

Maybe not a safe way. But I seriously doubt that you'll find a non-vector implementation, so a fast hack should be possible (maybe coupled with an #ifdef just in case you find one where it doesn't work).

> Original post by TrueTom
> > Original post by doynax
> > Also, it should be more efficient to pass the wstring as reference instead of returning a copy.
>
> Wouldn't the local wstring be destroyed before then?

Yes, and that's the problem. It creates a local copy of the string, which is then copied again into the return value, which is probably copied once more when the caller later processes it.
A reference parameter would avoid this.
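How many copies a by-value return really costs depends on the compiler. The sketch below assumes C++11 or later, where returning a local object is either elided or turned into a move; older compilers relied on the optional return value optimization instead. CountedString, make_by_value and make_by_reference are hypothetical names used only to count deep copies in both styles.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical wrapper that counts how often its payload is deep-copied.
struct CountedString {
    std::wstring text;
    static int copies;

    explicit CountedString(const wchar_t* s) : text(s) { assert(s != nullptr); }
    CountedString(const CountedString& other) : text(other.text) { ++copies; }
    CountedString(CountedString&& other) noexcept : text(std::move(other.text)) {}
};
int CountedString::copies = 0;

// Return by value: the local is elided (NRVO) or moved, not deep-copied.
CountedString make_by_value()
{
    CountedString result(L"converted text");
    return result;
}

// Out-parameter style suggested in the thread: the caller owns the storage.
void make_by_reference(CountedString& out)
{
    out.text = L"converted text";
}
```

Under that assumption both styles finish with CountedString::copies still at zero, so the choice can be made on readability. The reference parameter mainly pays off when the caller reuses one string across many calls and keeps its allocation.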

If performance becomes a problem and you want to avoid any ugly hacks, you should consider using a custom string class instead. That way you gain a lot of flexibility. Knowing the input string's size in advance would also help to speed things up a bit (that's my main problem with C strings).

##### Share on other sites
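#include <malloc.h>   // _alloca
#include <string>
#include <windows.h>  // MultiByteToWideChar

std::wstring ascii_cast(const char* ascii, int length)
{
    // Temporary buffer on the stack; it is released automatically on return
    wchar_t* buffer = reinterpret_cast<wchar_t*>(_alloca(length * sizeof(wchar_t)));
    MultiByteToWideChar(CP_ACP, 0, ascii, length, buffer, length);
    return std::wstring(buffer, length);
}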

Commentary: caller probably has the length of the ASCII text, so make them pass it in. Allocates a temporary buffer *on the stack* (which is extremely fast), converts the text (if you have the length, no need to make MultiByteToWideChar compute it all over again!), and then tries to facilitate the return value optimization.

##### Share on other sites
> A reference parameter would avoid this.

You are right, it can be avoided.

But returning the vector is shorter:

return &Wide[0];

Thanks for your help. Performance isn't such a problem; I just wanted to do it without wasting too many resources.
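The one-line return works because a const wchar_t* converts implicitly to the std::wstring return type. A small sketch of just that step, assuming the buffer has already been filled and null-terminated by the conversion; to_wstring_short is a hypothetical helper name:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: 'wide' stands for the buffer filled by the conversion
// and must end with a terminating L'\0'.
std::wstring to_wstring_short(const std::vector<wchar_t>& wide)
{
    assert(!wide.empty() && wide.back() == L'\0');
    // The pointer converts implicitly to the std::wstring return value, which
    // scans for the terminator and copies the characters once.
    return &wide[0];
}
```

This still copies the characters out of the vector once; it only saves the named temporary.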

##### Share on other sites
> Original post by TrueTom
> Thanks for your help. Performance isn't such a problem; I just wanted to do it without wasting too many resources.

Yeah, just do whatever you feel is easiest until you run into performance problems.

I'll post this anyway since I had fun writing it; maybe you'll need it someday.
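typedef std::vector<char> aString;
typedef std::vector<wchar_t> uString;

bool asciiToUnicode(const aString &input, uString &output) {
    size_t iSize = input.size();

    // resize(), not reserve(): the API writes into existing elements
    output.resize(iSize);
    if (iSize == 0)
        return true;

    // Pass the explicit length; the vector is not null-terminated
    int oSize = MultiByteToWideChar(
        CP_ACP,
        0,
        &input[0],
        (int)iSize,
        &output[0],
        (int)iSize
    );

    // Shrink to the number of wide chars actually written (0 on failure)
    output.resize(oSize);
    return oSize != 0;
}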

It's untested but apart from optimizing the vector allocations this should be about as fast as a MultiByteToWideChar wrapper will get.
