Detecting numbers in a string that are higher than the max possible Sint32

Started by
16 comments, last by Fred304 18 years, 6 months ago
Uhhhm, hi guys, stupid problem. I have a std::string which contains a number that may have any possible value (e.g. 0, 128 or 9999999999999). Now, I'd like to know whether I can safely convert that string to a 32-bit signed integer (Sint32 as in SDL) and store it there. I'd like to use atoi() for the conversion. If the number lies within [-2147483648, 2147483647], then we know the operation is safe and nothing will go wrong. However, if the number is outside that range, then doing a normal atoi conversion like this...


Sint32 number = atoi( my_string.c_str() );

 if ( (number > 2147483647) || ( number < -2147483648) )
  return; // couldn't convert


... for obvious reasons won't work :-) I think I could loop over all the chars of that string in reverse and construct the number manually, bailing out if adding another part would risk an overflow, i.e.


Sint32 my_number = 0;
Sint32 multiplier = 1; // 1, 10, 100, ... for each digit, counted from the right
for (int i = my_string.size(); i > 0; --i)
 {
  Sint32 temp = multiplier * (my_string[i - 1] - '0');
  if (my_number + temp > 2147483647 || my_number + temp < -2147483648)
   return ERROR;
  my_number += temp;
  multiplier *= 10;
 }


Ekhm, at this point I realized that even the simple "my_number + temp" may overflow [crying], rendering this algorithm useless. And no, I don't want to use any external big-number libraries, nor code my own version. Any thoughts?
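For what it's worth, the manual approach can be rescued by testing against the limit before doing the multiply-and-add, so the overflowing expression is never evaluated. The following is only a minimal sketch under some assumptions: the name fitsInSint32 is hypothetical, std::int32_t stands in for Sint32, and only plain decimal digits with an optional sign are accepted.

```cpp
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical helper (not from the thread): returns true if `text` is an
// optionally signed decimal integer that fits in a 32-bit signed integer.
bool fitsInSint32(const std::string& text)
{
    std::size_t pos = 0;
    bool negative = false;
    if (pos < text.size() && (text[pos] == '+' || text[pos] == '-'))
    {
        negative = (text[pos] == '-');
        ++pos;
    }
    if (pos == text.size())
        return false; // no digits at all

    // Accumulate as a non-positive value: the negative range is one larger
    // than the positive one, so -2147483648 can be built without overflow.
    const std::int32_t minValue = std::numeric_limits<std::int32_t>::min();
    std::int32_t value = 0;
    for (; pos < text.size(); ++pos)
    {
        if (text[pos] < '0' || text[pos] > '9')
            return false; // not a digit
        const std::int32_t digit = text[pos] - '0';

        // Check BEFORE computing value * 10 - digit, so nothing overflows.
        if (value < (minValue + digit) / 10)
            return false;
        value = value * 10 - digit;
    }
    // A non-negative input must be negatable, and -minValue does not exist.
    return negative || value != minValue;
}
```

With this sketch, fitsInSint32("2147483647") and fitsInSint32("-2147483648") are accepted, while "2147483648", "-2147483649" and "9999999999999" are rejected.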
#include <iostream>
#include <sstream>
#include <string>

void parseInt(std::string num)
{
	std::istringstream sstr(num);
	int integer;
	if (sstr >> integer)
	{
		std::cout << "parsed integer " << integer << " successfully\n";
	}
	else
	{
		std::cout << "invalid integer: " << num << '\n';
	}
}

int main()
{
	parseInt("123456789123456789");
	parseInt("2147483648");
	parseInt("2147483647");
	parseInt("-2147483648");
	parseInt("-2147483649");
	parseInt("-123456789123456789");
}

Or you could use boost::lexical_cast or the C99 function strtol.

Interestingly enough this code demonstrated a flaw in the SC++L implementation with my Borland compiler, which incorrectly parsed "2147483648" as -2147483648. gcc and MSVC++ both worked fine though.

Enigma
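If only a yes/no answer is needed, the stream approach above can be wrapped roughly as follows. This is a sketch, not a definitive implementation: streamFitsInSint32 is a made-up name, std::int32_t is assumed to be equivalent to Sint32, and it relies on the library setting failbit for out-of-range input (which, as noted above, at least one implementation got wrong).

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: true if `text` parses completely as a 32-bit signed
// integer using stream extraction.
bool streamFitsInSint32(const std::string& text)
{
    std::istringstream stream(text);
    std::int32_t value = 0;

    // Extraction fails for non-numeric input and for out-of-range numbers.
    if (!(stream >> value))
        return false;

    // Reject trailing characters such as the "abc" in "123abc".
    char extra;
    return !(stream >> extra);
}
```

The extra read at the end is a design choice: without it, "123abc" would count as a valid integer because extraction stops at the first non-digit.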
Hmmmmm, once again I forgot about STL [rolleyes].

Though it's possibly not the best solution (I'm going to use it in the SDL_Config library, which should include as few headers as possible for optimal use), it's very close to it, and I'm going to use it. Enigma, IIRC I rated you max up a few months ago, so unfortunately I can't do it now... anyway, thanks for the help :-)
Couldn't you read it into a float or double and check the size thing?
Errrm, I was thinking about more sophisticated solutions, and as I see now, this completely undermined my ability to come up with simple ones. You are probably right; tomorrow (now it's too late) I'm going to check whether atof() will help me. Thanks :-)
Rip-off - you were right, atof trick works:

#include <cstdlib>   // atof
#include <string>
#include <iostream>
using namespace std;

void parseInt(string num)
{
	double integer = atof(num.c_str());
	if ( (integer <= (double) 2147483647.0) && (integer >= (double) -2147483648.0) )
	{
		std::cout << "parsed integer " << num << " successfully\n";
	}
	else
	{
		std::cout << "invalid integer: " << num << '\n';
	}
}

int main()
{
	parseInt("123456789123456789");
	parseInt("-123456789123456789");
	parseInt("2147483648");
	parseInt("2147483647");
	parseInt("-2147483648");
	parseInt("-2147483649");
	parseInt("123456789123456789123456789123456789123456789123456");
}


Output:

invalid integer: 123456789123456789
invalid integer: -123456789123456789
invalid integer: 2147483648
parsed integer 2147483647 successfully
parsed integer -2147483648 successfully
invalid integer: -2147483649
invalid integer: 123456789123456789123456789123456789123456789123456


Thanks :-)
Quote:Original post by Enigma
or the C99 function strtol.
Both strtol and strtoul are actually C89 functions. So you can use them freely in both C and C++.

The standard library doesn't provide any way to do this for int, however, so you'll have to clip the values yourself, or rely on the fact that sizeof(long) == sizeof(int) on most platforms.
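#include <limits.h>  /* INT_MIN, INT_MAX */
#include <stdlib.h>  /* strtol */

/* Like strtol, but clamps the result to the range of int. */
int strtoi(const char *p)
{
	long result = strtol(p, NULL, 10);
	if (result < INT_MIN) { return INT_MIN; }
	if (result > INT_MAX) { return INT_MAX; }
	return (int) result;
}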
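When the question is only whether the value fits, rather than what the clamped value is, strtol's own error reporting can be used as well. The sketch below makes a few assumptions: strtolFitsInInt is a hypothetical name, int is assumed to be 32 bits wide (as Sint32 is), and only base-10 input is considered.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: true if `text` is a decimal integer that fits in int.
bool strtolFitsInInt(const char* text)
{
    char* end = 0;
    errno = 0;
    const long result = std::strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return false; // no digits were read, or trailing characters remain
    if (errno == ERANGE)
        return false; // the value did not even fit in a long
    return result >= INT_MIN && result <= INT_MAX;
}
```

The ERANGE check covers platforms where long is as narrow as int, and the explicit comparison covers those where long is wider.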
Hey, thanks, but this problem is already fixed :-)


Btw, I don't need to know the exact value of that integer, only whether it fits in Sint32.
Be aware that using atof only "works" in this case because you are testing for a number which is a power of two or one less than a power of two. Generally double-precision floating point numbers do not have enough precision to guarantee correct results. Try testing for a range of +-2000000000 to see what I mean. Furthermore that "works" is in inverted commas for a reason. In the event that the input number exceeds the valid range for a double-precision floating point number the result is technically undefined, even though all my implementations actually return +/-infinity.

In general there is no reason to prefer atoi and atof over C++ streams except for the very limited case where you can guarantee the range of input, need to convert strings to integer/floating point values frequently in a tight loop, and a profiler has demonstrated the conversion to be a bottleneck.

Enigma
Quote:
Be aware that using atof only "works" in this case because you are testing for a number which is a power of two or one less than a power of two. Generally double-precision floating point numbers do not have enough precision to guarantee correct results. Try testing for a range of +-2000000000 to see what I mean. Furthermore that "works" is in inverted commas for a reason. In the event that the input number exceeds the valid range for a double-precision floating point number the result is technically undefined, even though all my implementations actually return +/-infinity.


Hmmm, I executed this:

#include <cstdlib>   // atof
#include <string>
#include <iostream>
#include <sstream>
using namespace std;

typedef signed int Sint32;

string temp;
stringstream sstr;

void checkint(Sint32 num)
{
	// Convert the number to a string first, then parse it back with atof
	sstr.clear();
	sstr << num;
	temp.clear();
	sstr >> temp;

	double integer = atof(temp.c_str());
	if ( !( (integer <= (double) 2147483647.0) && (integer >= (double) -2147483648.0f)) )
	{
		cout << "! Error ! Parsed integer " << num << " and got in result " << temp <<
		        " which appears to be outside range\n";
	}

	if (num % 1000)
		cout << num << endl;
}

int main()
{
	for (Sint32 i = -7483648; i < 7483647; i += 10)
		checkint(i);
}


No errors occurred, also when I changed
for (Sint32 i = -7483648; i < 7483647; i += 10)

to

for (Sint32 i = -2000000000; i < 2000000000; i += 1000)
// btw, it produced 40 MB txt file :-)

Also, I've tried going over the whole range from -2147483648 to 2147483647 in steps of 1... but after a few minutes, when the cache usage history skyrocketed to a level not seen even when playing Doom 3 on my old computer, I had to kill it. But even then, there were no errors in the txt file.

------

Uhm, this works too:

void checkint(Sint32 num)
{
	sstr.clear();
	sstr << num;
	temp.clear();
	sstr >> temp;

	double integer = atof(temp.c_str());
	if ( !( (integer <= (double) 2147483647.0) && (integer >= (double) -2147483648.0f)) )
		cout << "! Error ! Parsed integer " << num << " and got in result " << temp <<
		        " which appears to be outside range\n";

	if (num != static_cast<Sint32> (integer))
		cout << "! Error ! Parsed integer " << num << " and got in result " << static_cast<Sint32> (integer) <<
		        " which appears to be different!\n";
}

int main()
{
	for (Sint32 i = -2147483648; i < 2147483647; ++i)
		checkint(i);
}


However, I couldn't wait until it finished, so after a few minutes I once again had to kill it. But there were no errors reported...


Btw, when my MinGW sees -2147483648, it complains about "[Warning] this decimal constant is unsigned only in ISO C90 " :-?
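A likely explanation for that warning: the compiler reads -2147483648 as unary minus applied to the literal 2147483648, and 2147483648 on its own does not fit in a 32-bit int (or a 32-bit long), so under C90 rules it gets an unsigned type. The usual workarounds are sketched below. The constant names are invented for illustration, and a 32-bit int is assumed.

```cpp
#include <climits>
#include <limits>

// Each of these yields the smallest int without ever writing the
// out-of-range literal 2147483648.
const int kMinFromArithmetic = -2147483647 - 1;
const int kMinFromHeader     = INT_MIN;
const int kMinFromLimits     = std::numeric_limits<int>::min();

// Hypothetical check that the three spellings agree.
bool minimumsAgree()
{
    return kMinFromArithmetic == kMinFromHeader
        && kMinFromHeader == kMinFromLimits;
}
```

Using INT_MIN or std::numeric_limits also avoids hard-coding the width of int in the loop bounds.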

Quote:
In general there is no reason to prefer atoi and atof over C++ streams except for the very limited case where you can guarantee the range of input, need to convert strings to integer/floating point values frequently in a tight loop, and a profiler has demonstrated the conversion to be a bottleneck.


For me there is one little, not-very-important, but real reason: the need to include <sstream>, which may slightly increase the size of the produced lib. Ok ok, I know that's going to add only a few kB at most. Anyway, I've already implemented the atof version, and I'm going to change it only if it causes any errors.

Btw, now I've also remembered that atof will be called only for strings whose size is not bigger than 10 (or 11 if a minus sign comes first). So in the extreme case I'll get 9999999999 - AFAIK atof should work fine with such things.
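One way to back up that expectation: an IEEE 754 double has a 53-bit significand, so every integer up to 2^53 in magnitude (which covers all 10-digit numbers) is exactly representable. The sketch below checks that assumption at compile time and wraps the range test. Both function names are hypothetical, and it assumes the library's atof rounds such inputs correctly, which mainstream implementations do.

```cpp
#include <cstdlib>
#include <limits>

// 9999999999 needs 34 bits, so a significand of at least 34 bits is enough.
static_assert(std::numeric_limits<double>::digits >= 34,
              "double cannot hold every 10-digit integer exactly");

// Hypothetical check: does `value` survive a round trip through double?
bool doubleHoldsExactly(long long value)
{
    const double asDouble = static_cast<double>(value);
    return static_cast<long long>(asDouble) == value;
}

// Hypothetical helper mirroring the atof-based range test from the thread.
bool atofFitsInSint32(const char* text)
{
    const double parsed = std::atof(text);
    return parsed >= -2147483648.0 && parsed <= 2147483647.0;
}
```

Note that atof returns 0.0 for non-numeric input, so this test alone cannot tell "abc" from "0"; the string would need to be validated separately.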

This topic is closed to new replies.
