ansaurus

Question

Please explain what this code is doing (someChar - 48)

Answer 1

+25 A:

The code basically sums the digits of a number represented as a string. It makes two important assumptions to work properly:

The string contains only chars in the '0'..'9' range
The character encoding used is ASCII

In ASCII, '0' == 48, '1' == 49, and so on. Thus, '0' - 48 == 0, '1' - 48 == 1, and so on. That is, subtracting 48 translates the char values '0'..'9' to the int values 0..9.

Thus, precisely because '0' == 48, the code will also work with:

sum += s[i] - '0';

The intention is perhaps slightly clearer in this version.

You can of course do the "reverse" mapping by addition, e.g. 5 + '0' == '5'. Similarly, if you have a char containing a letter in 'A'..'Z' range, you can "subtract" 'A' from it to get the index of that letter in the 0..25 range.

On alternative encodings

As mentioned, the original - 48 code assumes that the character encoding used is ASCII. Writing - '0' instead not only improves readability but also drops the ASCII assumption: it works with any encoding a conforming C implementation can use, because the C standard requires the digit characters '0'..'9' to have consecutive, increasing values.

On the other hand, no such stipulation is made about letters. Thus, in the rare situation where you're using EBCDIC encoding, for example, mapping 'A'..'Z' to 0..25 is no longer as simple as subtracting 'A', due to the fact that letters are NOT encoded sequentially in a contiguous block in EBCDIC.

Some programming languages simplify matters by mandating that one particular encoding be used to represent the source code (e.g. Java uses Unicode: JLS §3.1).

Related questions

Are digits represented in sequence in all text encodings?

polygenelubricants 2010-07-07 13:20:59

+1 for pointing out that `'0'` would be a much clearer way of writing `48` in this context. FWIW, using `'0'` would make it work on EBCDIC crap too (C guarantees successive char values for the numerals); however, your `'A'` example will not work with EBCDIC.

R.. 2010-07-07 13:56:20

The reason these tricks work is that ASCII codes are ordered. That is, '0' to '9' use consecutive ASCII codes (as do 'a' to 'z' and 'A' to 'Z'). See http://www.cdrummond.qc.ca/cegep/informat/Professeurs/Alain/files/ascii.htm for a decent ASCII chart.

Brian 2010-07-07 14:30:21

@R.. how can C guarantee successive char values for numerals?! It just happens that ASCII, like EBCDIC, has this property. If a system used another encoding without this property, the code could not work. The fact that such systems do not exist does not invalidate the logic.

ShinTakezou 2010-07-07 19:16:43

@ShinTakezou, any character encoding without this property is not an allowable encoding for a C implementation. Or said differently, a self-purported "C implementation" using such an encoding would not be "C". Read the standard. Anyway wait a decade or two and ISO C will finally just go ahead and specify ASCII or (with some luck) UTF-8.

R.. 2010-07-07 20:28:06

@R.. interesting, thanks. (I once found the link to the standard here, but I no longer have it, and anyway I will hardly read it, since I believe it only needs to be read by people who have to make a compiler compliant, while I, as a hobbyist programmer, trust `-std=c99` or the like...) I am not aware of any encoding lacking this property... however, if a system using such an encoding did exist, it is interesting to note that such a system couldn't have a fully standard-compliant C compiler!

ShinTakezou 2010-07-09 15:37:40

Answer 2

+4 A:

It finds the sum of the digits in the string s.

The statement sum += s[i] - 48; converts each ASCII digit character to its numeric value and adds it to sum.

Adam Shiemke 2010-07-07 13:21:25

Answer 3

+2 A:

Hi, it's adding up 3 + 5 + 7, and then printing

Sum is 15

The -48 part subtracts the character '0', that is, the ASCII value of the character zero.

So what it does is

'3' - '0' => 51 - 48 => 3
'5' - '0' => 53 - 48 => 5
'7' - '0' => 55 - 48 => 7

As you can see, in C, '0' (character zero) is different from 0 (number 0). They have different values (amongst other things)

Tom 2010-07-07 13:21:31

Answer 4

+1 A:

I suggest writing a test program that prints the values stored in s[]. You might also print out the value of each character in "0123456789".

I think you'll quickly realize what it's doing, although this code is relying on ASCII encoding.

Have fun!

holtavolt 2010-07-07 13:23:55

You said "relying" -- is this code not guaranteed to work in some C/C++ implementation? In Java, it's guaranteed that `'0' == 48`.

polygenelubricants 2010-07-07 13:37:22

C still allows arbitrary character encoding (the only restriction being that the numerals must have consecutive values, i.e. `'1' == '0' + 1` etc.) because of grandfathered-in EBCDIC crap. Any modern real-world system will be using ASCII or an ASCII superset (ideally UTF-8 but possibly ISO-8859-*, KOI-*, etc.)

R.. 2010-07-07 13:54:01
