atoi & co

Started by
18 comments, last by joe_bubamara 14 years, 8 months ago
I am writing a little .obj parser in C (not C++) and have to write some code to parse text data. The main parsing task is reading the actual numbers for coordinates, colors, etc. I am using scanf & friends for the main part of the job, but I am considering using atoi and atof instead. However, those functions return 0 if no conversion is performed??? How are you then supposed to know whether an error occurred or a value of 0 was found? If I didn't care about syntax and file validity checking, this would be fine, but I do. The documentation does not mention anything about an error number being set.
You could use strtol(..). It can return (via the second parameter) a pointer to wherever it stopped parsing. You can use that to check that the entire input string was converted.
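
For what it's worth, a minimal sketch of that end-pointer check could look like the following. The helper name parse_long_strict and its 1/0 return convention are assumptions for illustration only. Range checking via errno is left out to keep the focus on the end pointer.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper (name and return convention are assumptions):
   returns 1 only if the whole string is one base-10 integer,
   optionally followed by whitespace; returns 0 otherwise. */
int parse_long_strict(const char *s, long *out)
{
    char *end;
    long value = strtol(s, &end, 10);

    if (end == s)
        return 0;               /* no digits were converted at all */
    while (isspace((unsigned char) *end))
        end++;                  /* tolerate trailing whitespace */
    if (*end != '\0')
        return 0;               /* trailing junk, e.g. "12abc" */
    *out = value;
    return 1;
}
```

With this shape, "0" is reported as a successful conversion of zero, while "abc" and "12abc" are reported as failures, which is the distinction atoi cannot make.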

Alan
"There will come a time when you believe everything is finished. That will be the beginning." -Louis L'Amour
Or something obvious like:
if((result = atoi(str)) == 0 && !(*str >= '0' && *str <= '9'))
handle_error();

That's not pretty, but it should work just fine. If there is at least one numeric digit, atoi() will parse some value, it won't fail. So, zero means zero, not failure.
Quote:Original post by Alan Kemp
You could use strtol(..). It can return (via the second parameter) a pointer to wherever it stopped parsing. You can use that to check that the entire input string was converted.
You may also want to check whether the input was in range, e.g. strtol clips very small and very large values to LONG_MIN/LONG_MAX and sets errno to ERANGE to indicate the error.

E.g. something along these lines:
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

bool safe_atoi(const char *in, long *out) {
    char *end;
    errno = 0;
    *out = strtol(in, &end, 10);
    return end != in && !errno;
}
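
Since the original question also mentions atof, the same pattern should carry over to strtod. This is only a sketch: the name safe_atof is an assumption, and it assumes the default "C" locale, where '.' is the decimal point.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical floating-point counterpart (the name is an assumption). */
bool safe_atof(const char *in, double *out)
{
    char *end;

    errno = 0;
    *out = strtod(in, &end);
    /* end == in: nothing was converted; ERANGE: value out of range. */
    return end != in && errno != ERANGE;
}
```

Like the integer version, this accepts a valid prefix and ignores whatever follows it, so "1.5xyz" yields 1.5. Check *end as well if trailing characters should count as an error.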


Quote:Original post by samoth
Or something obvious like:
if((result = atoi(str)) == 0 && !(*str >= '0' && *str <= '9'))
handle_error();

That's not pretty, but it should work just fine. If there is at least one numeric digit, atoi() will parse some value, it won't fail. So, zero means zero, not failure.
That won't cover negative zero or cases with leading whitespace.
If you can use C++ code, then a more reliable approach IMHO is to use a string stream:

std::string s = "1234";
std::istringstream stream(s);
int out = 0;
stream >> out;
if (stream.eof() && !stream.fail())
{
  // parsed ok, 'out' is set to value
}
else
{
  // some kind of error
}


This should also work for long/float/double.
Quote:Original post by samoth
Or something obvious like:
if((result = atoi(str)) == 0 && !(*str >= '0' && *str <= '9'))
handle_error();

That's not pretty, but it should work just fine. If there is at least one numeric digit, atoi() will parse some value, it won't fail. So, zero means zero, not failure.


yeah - it was obvious :-) thanks for opening my eyes!

I would just swap the clauses in the if and add an extra test for a + or - sign before the number:

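const char *start = str;  /* remember where the sign, if any, was */
if (*str == '-' || *str == '+')
    str++;
if (*str >= '0' && *str <= '9')
    result = atoi(start);  /* convert from the sign so it is not lost */
else
    str = start;  /* back for - or + char */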


Feels like it should work. I will test it, and will also test whether it is any more efficient than sscanf. I am now done with the .obj parser, so if I am going to change all that crap, it has to have some benefits; otherwise I don't care to do it.

Thanks to both of you guys for answering and helping me out.
@orangy - yeah, I am quite used to streams in C++, but this project has to be in pure ANSI C. No C++ allowed, but thanks for the interest.
Why not write the atoi yourself? It's not that difficult, and saves you from having to call a stdlib function (which may depend on a couple of system settings, like locales etc.).
I just tried to post a little code snippet myself, but it seems like the forum software swallowed all the plus signs, which turned it all into complete guff.
Quote:Original post by joe_bubamara
Feels like it should work. I will test it, and will also test whether it is any more efficient than sscanf. I am now done with the .obj parser, so if I am going to change all that crap, it has to have some benefits; otherwise I don't care to do it.
You're still not accounting for whitespace. Whether or not this is a problem depends on how your parser splits up the strings, I suppose.
At any rate I'd strongly suggest making use of strtol instead if you want robust error detection, plus you won't have to manually tokenize the string to find the end of the number. Plus you get hexadecimal and octal literals for free by specifying base zero.
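
To illustrate both points (no manual tokenizing, and base zero), here is a rough sketch that walks a line of whitespace-separated integers by following the end pointer. The function name parse_int_list, its signature and the -1 error convention are all assumptions made up for this example.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: reads up to max whitespace-separated integers from
   line into out. Returns the count read, or -1 on a malformed or
   out-of-range token. Base 0 accepts decimal, 0x.. hex and 0.. octal. */
int parse_int_list(const char *line, long *out, size_t max)
{
    size_t count = 0;
    const char *p = line;

    for (;;) {
        char *end;
        long value;

        while (isspace((unsigned char) *p))
            p++;
        if (*p == '\0')
            break;                  /* clean end of line */
        if (count == max)
            return -1;              /* more tokens than expected */
        errno = 0;
        value = strtol(p, &end, 0);
        if (end == p || errno == ERANGE)
            return -1;              /* not a number, or out of range */
        if (*end != '\0' && !isspace((unsigned char) *end))
            return -1;              /* junk glued to the number */
        out[count++] = value;
        p = end;                    /* strtol already found the token end */
    }
    return (int) count;
}
```

One caveat with base zero: a field such as "010" is read as octal 8, which is probably not wanted for indices in an .obj file, so passing 10 as the base may be the safer choice there.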

The other option is to do it manually of course: (untested code!)
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>

bool safe_atoi(const char *in, long *out) {
    char ch;
    char prefix;
    unsigned long value;

    do
        ch = *in++;  /* skip leading whitespace */
    while (isspace((unsigned char) ch));
    prefix = ch;
    if (prefix == '+' || prefix == '-')
        ch = *in++;
    if (!isdigit((unsigned char) ch))
        return false;
    value = 0UL;
    do {
        unsigned long digit = (unsigned long) (ch - '0');
        /* reject anything above LONG_MAX (LONG_MIN itself is rejected too) */
        if (value > ((unsigned long) LONG_MAX - digit) / 10UL)
            return false;
        value = value * 10UL + digit;
        ch = *in++;
    } while (isdigit((unsigned char) ch));
    *out = (prefix == '-') ? -(long) value : (long) value;
    return true;
}
Quote:Original post by plastique
Why not write the atoi yourself? It's not that difficult, and saves you from having to call a stdlib function (which may depend on a couple of system settings, like locales etc.).
I just tried to post a little code snippet myself, but it seems like the forum software swallowed all the plus signs, which turned it all into complete guff.


Put your snippets into '[' code ']' and '[' /code ']' tags. I have to add the ' because of the forum software; of course you use [ without the '. The code tag will preserve tabs, spaces and other things. If you wish to have colorized syntax, read the forum FAQ to see how it is used. I dislike that the text area of the colorized widget is so small that one sees only a few lines of code at a time, so I almost never use it.

Yeah, it is a good question: where should we stop rolling our own things instead of using libraries? :-) I believe that those conversion functions are written in optimized assembly, which is why I am trying to use them rather than my own thing.

cheers

Quote:You're still not accounting for whitespace.
Yeah, you are right, but I do eat whitespace elsewhere in my code. I use whitespace as delimiters, and it is all eaten before tokens are parsed. Anyway, here is a loop to skip whitespace. I also forgot to check for the end of the string in the snippet above, so here is a check for that too ('\0' == 0):

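const char *start;
while (*str && isspace((unsigned char) *str))
    str++;
start = str;  /* remember where the sign, if any, was */
if (*str == '-' || *str == '+')
    str++;
if (*str >= '0' && *str <= '9')
    result = atoi(start);  /* convert from the sign so it is not lost */
else
    str = start;  /* back for - or + char */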


You don't have to find the end of the token or remove leading whitespace. atoi will eat whitespace itself up to the first - or + sign or the first digit, and then convert any number of digits until it finds the first non-digit character. So we don't have to remove trailing characters, whatever they are; we just have to be sure that we pass in a string that starts with whitespace or a number (including a + or - before the number).
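
Put together, the check described above could be wrapped up roughly like this. It is a sketch only: checked_atoi is a made-up name, and the 1/0 return value is an assumption about how the caller wants errors reported.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical wrapper: checks the shape atoi() expects (optional whitespace,
   optional sign, at least one digit) before calling it. Returns 1 on success,
   0 if the string does not start with a number. */
int checked_atoi(const char *str, int *result)
{
    const char *p = str;

    while (isspace((unsigned char) *p))
        p++;
    if (*p == '-' || *p == '+')
        p++;
    if (!isdigit((unsigned char) *p))
        return 0;                 /* no digit: atoi() would just return 0 */
    *result = atoi(str);          /* atoi() skips the whitespace and sign itself */
    return 1;
}
```

This separates "0" (success, value 0) from "abc" (failure), but it still does not catch out-of-range input, because atoi has no defined behavior when the value does not fit in an int. strtol with errno covers that case.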

docs: http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/

[Edited by - joe_bubamara on July 23, 2009 8:32:49 AM]
