Using variable argument lists and standard C I/O to construct C++ strings

This topic is 3629 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

Recommended Posts

I have a C++ function that looks like this.
std::string LogManager::Message(const char* message, ...)
{
char buffer[200];
va_list arg_list;
va_start(arg_list, message);
if (vsprintf(buffer, message, arg_list) < 0)
{
va_end(arg_list);
Warning(__FILE__, __LINE__, "invalid formatted string: %s\n", message);
return;
}
va_end(arg_list);

string msg_string(buffer);
return msg_string;


Basically all it does is take a printf-like formatted string and a variable argument list and uses that to construct and return a C++ string with all that data embedded. This is all fine and good, except for the temporary char array used in the vsprintf call. I arbitrarily have its size set to 200, but if I get a message argument that is greater than this size, it won't work correctly. That's easy enough to fix though -- I can go through the parameter message and find out exactly how large it is. But that size still isn't large enough, because the user might make a call like: "LogManager::Messages("%s\n", "and this is a really long string that the function is unprepared to handle"); Just looking at the size of the message string, the function would see that it is a size of 5. But we'll need a much, much larger buffer to because of the size of the additional string argument. I could arbitrarily bump up the size of the char array to be the size of the message string plus some arbitrary large number to increase the size, but again this wouldn't be guaranteed to always work. My question is this. Is there a good means in which to ensure that this function will always construct the string correctly, regardless of the size of or number of additional arguments? The only solution I can think of is to open a temporary file, use vfprintf to print the string to that file, and then read the string back from the file and close it. But this is very inefficient for something as basic as what this function is intended to do. Note that I'm -not- looking for an answer that involves using C++ iostreams or the like (I can't use them for this particular application), so please don't just tell me to use C++ libs instead of C ones. Thanks for any advice you can give![smile]

Share on other sites
You can use vsnprintf() to specify the number of characters to write. If the return value indicates that the buffer was too small resize the buffer accordingly and try it again.

Share on other sites
Quote:
 Original post by RootsI have a C++ function that looks like this....Note that I'm -not- looking for an answer that involves using C++ iostreams or the like (I can't use them for this particular application),

Why not? Especially considering you apparently can use std::string?

Quote:
 so please don't just tell me to use C++ libs instead of C ones.

Then why are you claiming to write C++?

Share on other sites
Thanks SiCrane, that should solve my problem. I wasn't aware of the existence of those functions.

Zahlman: My project requirements dictate that I'm not allowed to use any file or other streams. I don't know the exact reason, but I think it had something to do with streams not being supported in all embedded environments or not being supported in kernel mode on some operating systems. However using C++ strings is allowed.

Share on other sites
I use boost::format for this. It uses string streams internally so it's type-safe and can insert anything with operator<< into a string.

You're going to have a lot of trouble using vsprintf. Type safety is a big issue, there's just no way to know the type of the variables on the stack. You can easily overflow that buffer array. It also can't print any types other than the native types. The boost::format library is far, far superior.

Edit: Missed the thing about not being able to use streams. If there's no streams, there's no type safety, so things get boring and tricky really fast. You should at least be using vsnprintf. That way you can tell if you filled the buffer before all characters were written (compare its return value to the length of your buffer).

On a side note, why can't you just fprintf to the log file? Why do you even need to form the string in the first place? It just seems like an unnecessary step.

Share on other sites
Quote:
 Original post by RootsZahlman: My project requirements dictate that I'm not allowed to use any file or other streams. I don't know the exact reason, but I think it had something to do with streams not being supported in all embedded environments or not being supported in kernel mode on some operating systems. However using C++ strings is allowed.

That's a bit odd.

But still, nothing prevents you from learning from the design of the stream interface. It's typesafe and extensible. Given std::string, it shouldn't even be too hard to bang out a cheap imitation of std::stringstream, either.

Share on other sites
Quote:
 Original post by RootsZahlman: My project requirements dictate that I'm not allowed to use any file or other streams. I don't know the exact reason, but I think it had something to do with streams not being supported in all embedded environments or not being supported in kernel mode on some operating systems. However using C++ strings is allowed.

std::ostringstream != std::cout.

Unless you're lacking actual definition or implementation, std::ostringstream shouldn't do anything funkier than std::string does. They might even share some common code.

And mixing kernel mode and such arbitrary memory allocation gives me goosebumps. In such dangerous parts, fixed-sized buffers are usually quite excusable. But I digress...

Share on other sites
You'll probably want to use alloca if you're going to be making the buffer a variable size. Even then it's a bit tricky or dangerous to be resizing your buffer.

A very very high percentage of professional console game studios use C-style I/O like this for debug prints, logging, and localization. I'm betting most of them just allocate an arbitrary size that "should be large enough" and use vsnprintf to make sure they don't overflow it. If you do it right the resulting string will just be capped.

Then again you're using std::string so you could do multiple vsnprintfs in a loop and concat, but like Antheus stated/implied, the ramifications of that might not be great depending on how and where your routine is called.

Share on other sites
FYI: this program is something internal to my company that won't be used or known outside of a group of less than 10 people, so I'm not concerned about security issues like buffer overflowing and such. I'm also against introducing boost into it just for this particular problem, and I'm sure that if I did include it as a requirement to this application my tech lead would make me remove it. And the reason why I am converting it into a string instead of just outputting it to the logfile is because there's actually more to this function that parses and modifies the message string. The string class makes this task much easier than having to parse and modify a char*.

Anyway, here is my solution to the problem. Haven't tested it yet, but hopefully it does the trick.

std::string LogManager::Message(const char* message, ...){   va_list arg_list;   va_start(arg_list, message);   // Use vsnprintf to get the size of the string as it would be rendered   int string_length = vsnprintf(NULL, 0, message, arg_list);   if (string_length <= 0)   {      Warning(__FILE__, __LINE__, "invalid formatted string: %s\n", message);   }      // Now use vsnprintf to construct the string   char buffer[string_length];   if (vsnprintf(buffer, string_length, message, arg_list) < 0)   {      va_end(arg_list);      Warning(__FILE__, __LINE__, "invalid formatted string: %s\n", message);      return;   }   va_end(arg_list);   string msg_string(buffer);   return msg_string;}

Share on other sites
Using a constant that isn't known at compile time as an array size isn't allowed by the C++ standard. Some compilers however will support this. If you want the code to be portable, you could use a std::vector<char> as the buffer.

Share on other sites
exwonder's suggestion already handles the not-known-at-compile-time problem. alloca solves that exact problem, and is a simpler solution in this case.

Share on other sites
The way you're using vsnprintf() is already non-portable, so I wouldn't worry about the runtime buffer portability problem unless you plan to rewrite the whole thing.

Share on other sites
Quote:
 Original post by RootsThanks SiCrane, that should solve my problem. I wasn't aware of the existence of those functions.Zahlman: My project requirements dictate that I'm not allowed to use any file or other streams. I don't know the exact reason, but I think it had something to do with streams not being supported in all embedded environments or not being supported in kernel mode on some operating systems. However using C++ strings is allowed.

Ugh. The last time I saw flaky requirements like that was at a defense contractor. At a previous job, for us it was: no dynamic allocations, no non-certified libraries, no exceptions, etc... and thus it removed almost all of the C++ standard library. It is quite strange doing C++ work with basically a subset of the C standard library, so I feel your pain.

One workaround we used, was we modeled the style of STL-containers and streams, but made versions that did no dynamic allocation, which was based off of a template parameter. Somewhat ugly, risked buffer overruns if not careful, but workable. It was the only way to make the terrible requirement workable.

Also, for any OS that doesn't have some STL pieces (for some reason), there's always STLport.

Share on other sites
Quote:
 Original post by Rydinarea defense contractor... and thus it removed almost all of the C++ standard library.

Er, did I read that right?

If so, my anti-military paranoia just became even more profound.

Share on other sites
Quote:
Original post by Zahlman
Quote:
 Original post by Rydinarea defense contractor... and thus it removed almost all of the C++ standard library.

Er, did I read that right?

If so, my anti-military paranoia just became even more profound.

You read it right. The STL containers use the 'new' and 'delete' keywords. These keywords were banned by a department of defense standard, and so all of the STL containers, streams, etc... were all taken out. For the few pieces that were left, I believe my company chose not to get them certified. So, we did C++ without the C++ standard library. Thankfully, I was only on that project for a year.

Ultimately, the story goes that in the past, they had gotten burned by C projects not being able to limit dynamic memory appropriately, and so there were systems where they eventually crashed from memory leaks that they couldn't track down. Stories like this were fairly common. Unfortunately, instead of taking a proactive approach and realizing that C++ offered ways to solve these issues (e.g.: cleaning up during destruction), they took the wrong approach and banned dynamic memory altogether. I will say that from my experience at defense companies, before I moved onwards and upwards, that the majority of programmers at those companies had very little understanding of C++, but were pretty good at C. The majority of managers had no understanding of C++. Most of them thought that this defense standard was the safe way to do things and supported it, believe it or not. Actual important concepts like type safety were understood by a small minority.

As the saying goes, join a defense company if you want to be on the bleeding edge... of 20 year old technology. [wink]

Share on other sites
alloca isn't standard ANSI C though, so I'm not sure about using it. In the future its likely that this code will have to run on VxWorks and Pharlap, operating systems that I'm sure you're all familiar with [wink] (I never heard of them until starting my current job).

Quote:
 Original post by rip-offUsing a constant that isn't known at compile time as an array size isn't allowed by the C++ standard. Some compilers however will support this. If you want the code to be portable, you could use a std::vector as the buffer.

Will the call to vsnprintf still be happy though, since it expects a pointer to a character? I mean, I would figure that a vector lays out the string in memory the same way an array would...I'm just a little cautious about mixing C++ and C in that manner.

Share on other sites
Quote:
 Original post by Rootsalloca isn't standard ANSI C though, so I'm not sure about using it. In the future its likely that this code will have to run on VxWorks and Pharlap, operating systems that I'm sure you're all familiar with [wink] (I never heard of them until starting my current job).

VxWorks!!!! UGH. NOW I feel your pain. The operating system from hell.

Share on other sites
OK, I haven't tested this at all extensively, or even double checked for typos, but consider it a template for the safest C++ standards-compliant implementation I can think of once streams are disallowed:
#include <cassert>#include <cstdio>#include <limits>#include <string>class LogManager{	private:		class LogLine;	public:		LogLine Message();};template < std::size_t value >struct log_10{	static std::size_t const result = log_10< value / 10 >::result + 1;};template <>struct log_10< 0 >{	static std::size_t const result = 0;};template < typename Type, bool is_integer, bool is_iec559 >struct size_of_type_helper{};template < typename Type >struct size_of_type_helper< Type, true, false >{	static std::size_t const size = std::numeric_limits< Type >::digits10 + std::numeric_limits< Type >::is_signed;};template < typename Type >struct size_of_type_helper< Type, false, true >{	// sign digit decimal-point (6 * digit) 'e' sign exponent	static std::size_t const size = 11 + log_10< std::numeric_limits< Type >::max_exponent10 >::result;};template < typename Type >struct size_of_type{	static std::size_t const size = size_of_type_helper< Type, std::numeric_limits< Type >::is_integer, std::numeric_limits< Type >::is_iec559 >::size;};class LogManager::LogLine{	public:		LogLine & operator<<(bool b)		{			if (b)			{				buffer_ += "true";			}			else			{				buffer_ += "false";			}			return *this;		}		LogLine & operator<<(int i)		{			print("%i", i);			return *this;		}		LogLine & operator<<(unsigned int i)		{			print("%u", i);			return *this;		}		LogLine & operator<<(long l)		{			print("%li", l);			return *this;		}		LogLine & operator<<(unsigned long l)		{			print("%lu", l);			return *this;		}		// not strictly necessary but Borland 5.82 fails to recognise that		// float to double promotion is a better match that float to long		// double conversion, and issues an ambiguity error		LogLine & operator<<(float f)		{			print("%e", f);			return *this;		}		LogLine & operator<<(double d)		{			print("%e", d);			return *this;		}		LogLine & operator<<(long double d)		{			print("%Le", d);			return *this;		}		LogLine & operator<<(char c)		{			buffer_ += c;			return *this;		}		LogLine & operator<<(char const * s)		{			buffer_ += s;			return *this;		}		LogLine & operator<<(std::string const & s)		{			buffer_ += s;			return *this;		}		operator std::string() const		{			return buffer_;		}	private:		template < typename Type >		void print(char const * format_string, Type value)		{			char buffer[size_of_type< Type >::size + 1];			int count = std::sprintf(buffer, format_string, value);			assert(count < sizeof(buffer));			buffer_ += buffer;		}		std::string buffer_;};LogManager::LogLine LogManager::Message(){	return LogLine();}void print(std::string const & string){	std::puts(string.c_str());}int main(){	LogManager l;	print(l.Message() << 32 << 1.0f);}

%e for floating-point isn't ideal, but you get the idea.

Σnigma

EDIT: The return value of sprintf excludes the terminating null character, so my assert was off-by one.

Share on other sites
Quote:
 Original post by RootsWill the call to vsnprintf still be happy though, since it expects a pointer to a character? I mean, I would figure that a vector lays out the string in memory the same way an array would...I'm just a little cautious about mixing C++ and C in that manner.

std::vector guarantees that it's storage will be contiguous. You can pass &vec.front(), &vec[0] or even &*vec.begin() to a function that accepts char *.

Share on other sites
Quote:
 Original post by EnigmaOK, I haven't tested this at all extensively, or even double checked for typos, but consider it a template for the safest C++ standards-compliant implementation I can think of once streams are disallowed:[...]
One problem in your code: std::numeric_limits<Type>::digits10 is the maximum number of base-10 digits the number can hold, not the number of base-10 digits needed to display the longest valid value. In other words, it will hold 9 for an unsigned 32-bit integer because it can hold 109 but not 1010, but 232-1 (which is a valid value for such an integer) takes 10 digits to display in base-10 : 4294967295.

Share on other sites
Quote:
 Original post by Rootsalloca isn't standard ANSI C though, so I'm not sure about using it. In the future its likely that this code will have to run on VxWorks and Pharlap, operating systems that I'm sure you're all familiar with [wink] (I never heard of them until starting my current job).

VLAs are standard C99 and you'll probably find them even less supported than alloca(). Honestly, your best solution is to cap to a fixed size. Warn if the string exceeded that size. Then go enlarge it manually.

alloca() (or VLAs, if you by some miracle have them) is the next best solution.

If you really really need it to print the whole string always and forever and you really really don't want to use alloca, the next best solution is to allocate it on the heap. This is less than ideal for a lot of reasons, but you're guaranteed no fragments since the scope of your allocation is this function alone. (Edit: this is totally false. You *would* be guaranteed no fragments if you weren't copying it to a std::string. Is there some reason you can't do whatever you're doing with that std::string inside of this function? The entire string usage is a bit suspect and this tends to indicate that there's a better way to accomplish what you're trying to do at a higher level, but I'm willing to give you the benefit of the doubt.)