Jump to content
  • Advertisement
Sign in to follow this  
Alundra

Safe variadic string format but efficient

This topic is 650 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hi,
I did some test of a safe variadic string format to format a string like : "Date {0} and time {1}".
All works nice but it's far from the perf of the sprintf even doing a reserve of capacity to avoid lot of allocations, it's 2 times slower than sprintf.
#include <iostream>
#include <string>
#include <vector>
#include <chrono>

template<class T>
void FormatterOutput(std::string& out, const T& v);

template<>
void FormatterOutput(std::string& out, const char& v)
{
	out += v;
}

template<>
void FormatterOutput(std::string& out, const char* const& v)
{
	out += v;
}

template<>
void FormatterOutput(std::string& out, const int8_t& v)
{
	out += std::to_string(v);
}

template<>
void FormatterOutput(std::string& out, const uint8_t& v)
{
	out += std::to_string(v);
}

template<>
void FormatterOutput(std::string& out, const int16_t& v)
{
	out += std::to_string(v);
}

template<>
void FormatterOutput(std::string& out, const uint16_t& v)
{
	out += std::to_string(v);
}

template<>
void FormatterOutput(std::string& out, const int32_t& v)
{
	out += std::to_string(v);
}

template<>
void FormatterOutput(std::string& out, const uint32_t& v)
{
	out += std::to_string(v);
}

template<>
void FormatterOutput(std::string& out, const int64_t& v)
{
	out += std::to_string(v);
}

template<>
void FormatterOutput(std::string& out, const uint64_t& v)
{
	out += std::to_string(v);
}

template<>
void FormatterOutput(std::string& out, const float& v)
{
	out += std::to_string(v);
}

class Formatter
{
public:

	void Format(const std::string& format)
	{
		mString.clear();
		mString.reserve(format.size());
		std::string::const_iterator i = format.begin();
		while (i != format.end())
		{
			if (*i == '{')
			{
				if (*(i + 1) == '{')
					++i;
			}
			mString += *i++;
		}
	}

	template<class... Args>
	void Format(const std::string& format, Args... args)
	{
		mString.clear();
		std::vector<std::string> formattedArgs;
		BuildValues(formattedArgs, args...);
		size_t reserveSize = format.size();
		for(int i=0; i<formattedArgs.size(); ++i)
			reserveSize += formattedArgs.size();
		mString.reserve(reserveSize);
		WriteValues(formattedArgs, format);
	}

	const std::string& GetString() const
	{
		return mString;
	}

private:

	template<class T>
	void BuildValues(std::vector<std::string>& formattedArgs, T value)
	{
		std::string s;
		FormatterOutput<T>(s, value);
		formattedArgs.push_back(std::move(s));
	}

	template<class T, class... Args>
	void BuildValues(std::vector<std::string>& formattedArgs, T value, Args... args)
	{
		std::string s;
		FormatterOutput<T>(s, value);
		formattedArgs.push_back(std::move(s));
		BuildValues(formattedArgs, args...);
	}

	void WriteValues(const std::vector<std::string>& args, const std::string& format)
	{
		std::string::const_iterator i = format.begin();
		std::string::const_iterator endIterator = format.end();
		while (i != endIterator)
		{
			if (*i == '{')
			{
				if (*(i + 1) == '{')
				{
					++i;
				}
				else
				{
					++i;
					std::string s;
					while (i != format.end() && *i != '}')
						s += *i++;
					++i;
					const std::size_t idx = std::stoi(s);
					if (idx < args.size())
					{
						mString += args[idx];
						continue;
					}
				}
			}
			mString += *i++;
		}
	}

private:

	std::string mString;
};

int main()
{
	std::chrono::high_resolution_clock high_resolution_clock;
	auto startTime = high_resolution_clock.now();
	char buffer[2048];
	for(int i=0; i<100000; ++i)
		sprintf_s(buffer, 2048, "test of a float converter : %f", 3.14f);
	auto endTime = high_resolution_clock.now();
	std::chrono::duration<double> diff1 = endTime - startTime;
	std::cout << "sprintf: " << diff1.count() << " seconds" << std::endl;

	startTime = high_resolution_clock.now();
	Formatter f;
	for(int i=0; i<100000; ++i)
		f.Format("test of a float converter : {0}", 3.14f);
	endTime = high_resolution_clock.now();
	std::chrono::duration<double> diff2 = endTime - startTime;
	std::cout << "formater: " << diff2.count() << " seconds" << std::endl;
	std::cout << "formater is " << diff2.count() / diff1.count() << " times slower" << std::endl;

	system("PAUSE");
	return 0;
}
This code use std::to_string and std::stoi, surely a custom converter for float and others can improve the speed but surely it's not the only one point which cause this bad perf.
How to improve the performance ?
Thanks
Edited by Alundra

Share this post


Link to post
Share on other sites
Advertisement

1) Are you testing in release/optimized mode? sprintf will probably have a better performance in debug mode, since everything regarding STL and iterators is usually really slow in debug mode.

 

2) You are not using reserve on

std::vector<std::string> formattedArgs;

You can use

constexpr auto numArgs = sizeof ... (Args)
formattedArgs.reserve(numArgs );

To reserve based on the number of variadic parameters.

 

Actually, since you know the amount of args at compile-time, you could use a compile-time array instead:

constexpr auto numArgs = sizeof ... (Args)
std::array<std::string, numArgs> formattedArgs;

Which might slighly improve performance due to only using the stack. Obviously now you cannot use push_back anymore, but you could count up an index yourself:

template<class T>
    void BuildValues(std::array<std::string, numArgs>& formattedArgs, size_t& index, T value)
    {
        std::string s;
        FormatterOutput<T>(s, value);
        formattedArgs[index++] = std::move(s);
    }

or if you want to be even faster, you can make "index" a template argument as well:

template<size_t Index, class T, class... Args>
    void BuildValues(std::array<std::string, NumArgs>& formattedArgs, T value, Args... args)
    {
        std::string s;
        FormatterOutput<T>(s, value);
        formattedArgs.push_back(std::move(s));
        BuildValues<Index+1>(formattedArgs, args...);
    }

But you need to decide whether this small performance improvement is worth the additional complexity.

 

 

 

3)

for(int i=0; i<formattedArgs.size(); ++i)
     reserveSize += formattedArgs.size();
mString.reserve(reserveSize);

Aside from the peephole-optimization of storing formattedArgs.size() in a variable outside of the loop (this can be done for the .end() calls in the two for-loops as well), this code is incorrect. You are essentially calculating formattedArgs.size() ^ 2, the correct code should be:

for(int i=0; i<formattedArgs.size(); ++i)
     reserveSize += formattedArgs[i].size();
mString.reserve(reserveSize);

Or since you are already using C++11, better yet:

for(const auto& arg : formattedArgs)
{
    reserveSize += arg.size();
}
mString.reserve(reserveSize);

Thats as far as I can see, aside from std::to_string also making a dynamic allocation, which you already mentioned.

Edited by Juliean

Share this post


Link to post
Share on other sites
Good points there, the fixed array instead of a dynamic array to avoid allocation is a good found since the size is known, also the mistake on the reserve of the final string.
I modified all the points and tried a custom double to string found there : https://github.com/miloyip/dtoa-benchmark
The actual time is 1.5 time slower than sprintf which is already better than 2 times slower previously.
Edited by Alundra

Share this post


Link to post
Share on other sites

Some more things I noticted:

 

1)

template<>
void FormatterOutput(std::string& out, const int8_t& v)
{
    out += std::to_string(v);
}

From what I can see "out" will always be empty in this call, so += will perform an allocation in "out" and throw away the memory returned by "std::to_string", since there is not operator+=(string&&) overload (and it wouldn't really make sense to have one eigther). So eigether change the method to assign to the string:

template<>
void FormatterOutput(std::string& out, const int8_t& v)
{
    out = std::to_string(v);
}

Or, better yet, change it to:

template<>
std::string FormatterOutput(const int8_t& v)
{
    return std::to_string(v);
}

and use it like this:

template<class T>
    void BuildValues(std::array<std::string, numArgs>& formattedArgs, size_t& index, T value)
    {
        formattedArgs[index++] = FormatterOutput<T>(value);
    }

Unless your compiler sucks at RVOs (it shouldn't), this can be faster than your variant (and will be at least equally fast), since it doesn't involve handling the "out" reference, and will save on some move-operations.

 

2) Actually, seeing how your FormatOutput-method seems to only take primitive-types, passing the value by const & is not necessary, and can result in an overhead, at least if the compiler doesn't properly inline the calls. I'd suggest changing it to

template<>
std::string FormatterOutput(int8_t v)
{
    return std::to_string(v);
}

and, if necessary, provide separate overloads for larger structures.

 

3) A pretty minor point too, but the BuildValues-method doesn't seem to use any member variable, so you should make it static, to save pushing the this-pointer on the stack (and because a method that doesn't mess with internals should be static anyways by good design :D)

 

First point might give a considerable gain, depending on if it still applies to your custom double to string, and 2/3 might only give a minescule benefit if at all, but I think its worth a try, since the changes are pretty minor anys. 1.5 from 2 is already good, but I'm curious to see if it goes even better.

Edited by Juliean

Share this post


Link to post
Share on other sites

Hey it's not so bad, with all the modifications I got 1.37 which is even better, I modified all += to avoid operator on the string too.

Edited by Alundra

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

GameDev.net is your game development community. Create an account for your GameDev Portfolio and participate in the largest developer community in the games industry.

Sign me up!