If a string contains non-numeric characters and you call atoi on it [I'm assuming wtoi will do the same], how will atoi treat the string?

Let's say, for example, I have the following strings:

  1. "20234543"
  2. "232B"
  3. "B"

I'm sure that 1 will return the integer 20234543. What I'm curious about is whether 2 will return 232 [that's what I need to solve my problem]. Also, 3 should not return a value. Are these beliefs false? And if 2 does act as I believe, how does atoi handle an e character at the end of the string? [That's typically used in exponential notation.]

A: 

Writing simple code and looking to see what it does is magical and illuminating.

On point #3, it won't return "nothing." It can't. It'll return something, but that something won't be useful to you.

http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/

On success, the function returns the converted integral number as an int value.

If no valid conversion could be performed, a zero value is returned.

If the correct value is out of the range of representable values, INT_MAX or INT_MIN is returned.

dash-tom-bang
I knew it would return either 0 [or a set value] or null, but I wasn't sure which. My question, though, was: does it convert up to the next non-digit character, or what?
monksy
+4  A: 

You can test this sort of thing yourself. I copied the code from the Cplusplus reference site. It looks like your intuition about the first two examples is correct, but the third example returns 0. 'E' and 'e' are also treated just like 'B' is in the second example.

So the rules are:

On success, the function returns the converted integral number as an int value. If no valid conversion could be performed, a zero value is returned. If the correct value is out of the range of representable values, INT_MAX or INT_MIN is returned.
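
The test program itself is not reproduced in this answer, so here is a minimal sketch of that kind of experiment rather than the code from the reference site. The function names `show_atoi` and `run_atoi_samples` are made up for illustration, and the first sample assumes an `int` of at least 32 bits.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative helper (not from the original answer):
   print what atoi makes of one input string. */
void show_atoi(const char *s)
{
    printf("atoi(\"%s\") = %d\n", s, atoi(s));
}

/* Run the three strings from the question, plus one ending in 'e'. */
void run_atoi_samples(void)
{
    show_atoi("20234543"); /* assumes int is at least 32 bits wide */
    show_atoi("232B");     /* conversion stops at 'B' */
    show_atoi("B");        /* no digits at all, so the result is 0 */
    show_atoi("232e");     /* 'e' is just another non-digit to atoi */
}
```

Calling `run_atoi_samples()` from `main` on a platform with a 32-bit `int` should print 20234543, 232, 0 and 232 in that order.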

Mike
+2  A: 

atoi first skips any leading whitespace, then accepts an optional '+' or '-' (which it uses to select the appropriate sign for the result), and then reads digits from the string until it can't any more. It stops at the first character that isn't a digit. It returns 0 if it saw no digits.

So to answer your specific questions: 1 returns 20234543. 2 returns 232. 3 returns 0. The character 'e' is not whitespace, a digit, '+' or '-' so atoi stops and returns if it encounters that character.

See also here.
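
To make the scanning rule concrete, here is a rough sketch of a function that follows the same steps. It is not how any particular C library implements atoi; `scan_int_prefix` is a hypothetical name, and overflow is deliberately left unhandled.

```c
#include <ctype.h>

/* Hypothetical helper: mimics the scanning rule described above.
   Overflow is NOT handled; this is for illustration only. */
int scan_int_prefix(const char *s)
{
    int sign = 1;
    int result = 0;

    /* Skip leading whitespace. */
    while (isspace((unsigned char)*s))
        s++;

    /* Accept at most one sign, and only before any digit. */
    if (*s == '+' || *s == '-') {
        if (*s == '-')
            sign = -1;
        s++;
    }

    /* Consume digits until the first non-digit ('B', 'e', '\0', ...). */
    while (isdigit((unsigned char)*s)) {
        result = result * 10 + (*s - '0');
        s++;
    }

    /* If no digits were seen, result is still 0. */
    return sign * result;
}
```

With this sketch, `scan_int_prefix("232B")` gives 232, `scan_int_prefix("B")` gives 0, and `scan_int_prefix("  -42e3")` gives -42.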

moonshadow
+1  A: 

If atoi encounters a non-number character, it returns the number formed up until that point.

pcent
+3  A: 

According to the standard, "The functions atof, atoi, atol, and atoll need not affect the value of the integer expression errno on an error. If the value of the result cannot be represented, the behavior is undefined." (7.20.1, Numeric conversion functions in C99).

So, technically, anything could happen. That applies even to the first case: INT_MAX is only guaranteed to be at least 32767, and 20234543 is greater than that, so it could fail as well.

For better error checking, use strtol:

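#include <errno.h>   /* errno, ERANGE */
#include <stdlib.h>  /* strtol */

int main(void)
{
    const char *s = "232B";
    char *eptr;
    long value = strtol(s, &eptr, 10); /* 10 is the base */
    /* now, value is 232, eptr points to "B" */

    s = "20234543";
    value = strtol(s, &eptr, 10);
    /* value is 20234543, eptr points to the terminating '\0' */

    s = "123456789012345";
    errno = 0; /* clear errno first so that ERANGE can be trusted */
    value = strtol(s, &eptr, 10);
    /* If there was no overflow, value will contain 123456789012345,
       otherwise, value will contain LONG_MAX and errno will be ERANGE */

    (void)value;
    return 0;
}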
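
One way to package those checks, including the "B" case where no digits are present at all, is a small wrapper like the sketch below. The name `parse_long_prefix` and its return convention are assumptions made for illustration; only `strtol`, `errno` and `ERANGE` come from the standard library.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse a leading decimal integer from s.
   Returns false if s contains no digits or the value does not fit
   in a long. On success, *out holds the value and *rest points at
   the first unparsed character (e.g. "B" for "232B"). */
bool parse_long_prefix(const char *s, long *out, const char **rest)
{
    char *end;
    long value;

    errno = 0;                 /* strtol only sets errno on failure */
    value = strtol(s, &end, 10);

    if (end == s)              /* nothing was consumed, as with "B" */
        return false;
    if (errno == ERANGE)       /* result clamped to LONG_MIN/LONG_MAX */
        return false;

    *out = value;
    *rest = end;
    return true;
}
```

Unlike atoi, this lets the caller tell "the string was 0" apart from "there was no number", which is the distinction the question asks about for the third string.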

If you need to parse numbers with "e" in them (exponential notation), then you should use strtod. Of course, such numbers are floating-point, and strtod returns double. If you want to make an integer out of it, you can do a conversion after checking for the correct range.
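
As a rough sketch of that last step, the function below parses with strtod and only converts to int after a range check. The name `parse_exp_to_int` is hypothetical, and truncating toward zero is just one possible choice for the conversion.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse s as a floating-point number (so that
   exponents such as "232e2" are honoured), then convert it to int
   if it is in range. Returns 1 on success, 0 on failure. */
int parse_exp_to_int(const char *s, int *out)
{
    char *end;
    double d;

    errno = 0;
    d = strtod(s, &end);

    if (end == s || errno == ERANGE)      /* no number, or out of range for double */
        return 0;
    if (!(d >= INT_MIN && d <= INT_MAX))  /* out of int range; also rejects NaN */
        return 0;

    *out = (int)d;                        /* conversion truncates toward zero */
    return 1;
}
```

For example, `parse_exp_to_int("232e2", &n)` succeeds with `n` set to 23200, while `parse_exp_to_int("1e30", &n)` and `parse_exp_to_int("B", &n)` both fail.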

Alok
Fair enough, but according to MSDN, integers are 32-bit. http://msdn.microsoft.com/en-us/library/296az74e.aspx
monksy
@steven: it also says "Microsoft Specific" at the top. So if you only care about Microsoft specific code, you are right that you don't need to worry about overflow in the first case. But if you want portability, you need to. Your question wasn't tagged with any platform-specific tag, so I assumed you wanted portability :-).
Alok
Fair enough. Most systems I've written for are 32-bit, so that's what I'm used to seeing. [Well, the 16-bit was a long time ago.]
monksy
POSIX requires `sizeof(int)>=4` too.
R..
Just to add a bit to your mention of strtol: I find the special value `0` for the base parameter the most convenient. It picks the base automatically from the usual prefixes: base 10 for *normal* decimal numbers, hexadecimal if the number starts with `0x`, and octal if it starts with `0`.
Jens Gustedt
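
A minimal sketch of the base-`0` behaviour mentioned in the last comment follows. The wrapper name `parse_auto_base` is made up for illustration, and it skips the error checking discussed above to keep the example short.

```c
#include <stdlib.h>

/* Hypothetical helper: let strtol choose the base from the prefix.
   "0x"/"0X" selects hexadecimal, a leading "0" selects octal,
   anything else is read as decimal. No error checking is done here. */
long parse_auto_base(const char *s)
{
    return strtol(s, NULL, 0);
}
```

With this, `parse_auto_base("232")` gives 232, `parse_auto_base("0x1F")` gives 31, and `parse_auto_base("010")` gives 8. The octal case is worth keeping in mind if the input may contain zero-padded decimal numbers.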