+16  Q:

Language showdown: Convert string of digits to array of integers?

I was trying to convert a string containing only base 10 digits (e.g. "124890") to an array of corresponding integers (for given example: [1, 2, 4, 8, 9, 0]), in Ruby.

I'm curious about how easily this can be accomplished in Ruby and in other languages.

+54  A:

Python:

``````[int(c) for c in s]
``````

or

``````map(int, s)
``````

Note: In Python 3, `map()` returns a lazy iterator rather than a list, so the result is not subscriptable. If you need indexing, convert it with `list(map(int, s))` or `tuple(map(int, s))`.
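
To make the Python 3 note concrete, here is a small sketch (the variable names are just for illustration) showing that the bare `map()` result cannot be indexed, while the `list`/`tuple` conversions can:

```python
s = "124890"

# In Python 3, map() returns a lazy iterator, so indexing it raises TypeError.
lazy = map(int, s)
try:
    lazy[0]
    subscriptable = True
except TypeError:
    subscriptable = False

# Materialise the iterator when indexing or len() is needed.
digits = list(map(int, s))
as_tuple = tuple(map(int, s))
```

After this runs, `digits[2]` and `as_tuple[2]` are both the integer `4`.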

definitely the most readable solution given.
Hard to beat, indeed.
[for c in s -> int(c)] // in F# - both F# and Python are inspired by the same idea in Haskell, so that's probably not a surprise.
nailed em :) :) :)
I note that this is the most upvoted by far; probably because it seems short, but it's worth noting that, as mentioned below, in other languages (c, c#, c++, probably more), it's already done. The other posters have posted longer ways of *retrieving* the values (which this doesn't show).
Python is good for brevity, but my upvote is for the readability.
really python is ordinary language
+6  A:

Here's my attempt in Ruby:

```"124890".split(//).map {|chr| chr.to_i}
=> [1, 2, 4, 8, 9, 0]```

This splits the string using a regex that matches a zero-length string, so each character becomes an element of the array, then maps each (one-character string) element to its integer value.

+2  A:
``````>> "124890".split("").map{ |i| i.to_i}
=> [1, 2, 4, 8, 9, 0]
``````
A:

In C#

``````string nums = "124890";
int[] vals = new int[nums.Length];
int i = 0;
int offset = Convert.ToInt32('0');
foreach (char d in nums)
    vals[i++] = Convert.ToInt32(d) - offset;
``````

Updated: Fixed algorithm, as noted in comments.

Obviously, this is not quite correct... as it gives us an array of ascii codes...
A little slower next time.
+6  A:

PHP: `$string = preg_split("//", "1234567890");`

That should work. Although I'm not sure if you need the regex delimiters for an empty string in preg_split. I do know that explode will NOT work as if you try to explode on the empty string, explode() returns false.

Because Mr Wiseass has judged that his solution was better. I think there is a conflict of interest when he can vote down answers in a question he's answered himself.
This will give you an array of strings, not an array of integers.
+6  A:
``````<?php
$array = str_split("124890");
?>
``````

It's hard to beat a built-in method :)

Not even needed, strings ARE arrays. See my code below.
Why was this voted down? Like my code, it indeed works. Just because there is more than one way does not mean this deserves a downvote.
I think I can guess pretty easily who voted this down.
It could be getting downvotes because the question asks for an array of integers and this answer gives an array of strings. Try adding two of the items in the resulting array together and you end up with string concatenation.
I beg to differ: `<?php $array = str_split("124890"); echo $array[1] + $array[2]; ?>` The above outputs 6. Welcome to loosely-typed languages.
@Gilles I *totally* forgot that PHP has a separate operator for concatenation. The scars don't go that deep, after all :)
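
For comparison, here is how the strings-versus-integers distinction discussed in these comments plays out in a language where `+` is overloaded for both types. This is only a Python sketch with made-up variable names, not a statement about PHP's behaviour:

```python
as_strings = list("124890")             # ['1', '2', '4', '8', '9', '0']
as_ints = [int(c) for c in as_strings]  # [1, 2, 4, 8, 9, 0]

# With one-character strings, + concatenates.
concatenated = as_strings[1] + as_strings[2]

# With integers, + adds.
summed = as_ints[1] + as_ints[2]
```

Here `concatenated` is the string `"24"` while `summed` is the integer `6`, which is why the element type matters in such languages.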
+2  A:

Here's a one-liner in C#, without the fancy => stuff.

``````Array.ConvertAll<char, int>(s.ToCharArray(), new Converter<char, int>(delegate(char c) { return int.Parse(c.ToString()); }));
``````

That's a crappy solution, as noted. Now I've just tried to make it worse (without adding too many meaningless function calls). Here's a one-liner with type safety!

``````Array.ConvertAll<char, int>(Enumerable.TakeWhile<char>(s.ToCharArray().AsEnumerable<char>(), new Func<char,bool>(delegate(char c) { int i; return int.TryParse(c.ToString(), out i); })).ToArray<char>(), new Converter<char, int>(delegate(char c) { return int.Parse(c.ToString()); }));
``````
You can't int.Parse(c) because c is a char and int.Parse takes a string only.
K, it works now.
Wow. Even more verbose than the Java version. That's quite an accomplishment.
Noted. I've improved upon its badness.
I don't really see the point of making it a "one-liner" since it becomes very hard to read.
Damn near impossible to read I'd say. I'm just doing it for fun now.
omg - have you ever heard of Linq?
+10  A:

In Ruby:

`````` s.scan(/\d/).map { |c| c.to_i }
``````
A:

No code needed in PHP! You can access a string as an array and if you use the number in a numeric way then an implicit typecast will be done for you. Example:

``````$foo = "01234";

echo 4 - $foo[2];  // echoes 2
``````
The square brackets are a shortcut to access PHP strings by character. It does not make them an array automatically, it is not an implicit typecast. Look up "PHP string access by character".
Thanks for clarifying that Gilles.
A little piece of proof: `array_diff("1234", "12345");` will give you an error because the parameters aren't arrays. No implicit typecast here.
does 'echo $foo[2] + 4' print 6?
+5  A:

In Tcl:

``````% split "124890" ""
1 2 4 8 9 0
``````
+1 beat me to it.
+2  A:

In C:

``````void convert (int * array, char * string)
{
    int i = 0;

    while (*string)
        array[i++] = *(string++) - '0';
}
``````
If you're not going to allocate the array yourself, then you'd better accept a maximum length parameter. Google "Buffer Overrun Attacks".
Man, showing a C sample without the allocations just isn't fair :)
how about *(array++) = *(string++) - '0'; :)
@EvilTeach - the parens are unnecessary (which I'm only pointing out since you seem to be going for the most concise code possible)
Hey Adam, isn't *string++ equal to *(string)++, which would be the wrong thing then?
True - the parens are not required, but I consider them as good code-style in this case (e.g. dereferencing and incrementing a pointer in one go).
+2  A:

Just like James Curran's in C# but with one less line

``````string nums = "124890";
int[] vals = new int[nums.Length];
for (int i = 0; i < vals.Length; i++)
    // Convert.ToInt32(char) returns the character code (49 for '1'), so subtract '0' instead
    vals[i] = nums[i] - '0';
``````
+3  A:

In Java:

``````String s = "12345";
int[] nums = new int[s.length()];
for(int i = 0; i < nums.length; i++)
nums[i] = s.charAt(i) - '0';
``````
+1 cause I actually needed this to convert phone numbers and was too lazy to write it myself :-D
+7  A:

Aaah, the sweetness of Linq. Compared to the other .NET implementations, you can see why I love it so...

``````int[] foo = (from x in "1234567890" select int.Parse(x.ToString())).ToArray();
``````

For S&G's here's the fluent-style extension method version of this:

``````"1234567890"
.ToCharArray()
.Select(x => int.Parse(x.ToString()))
.ToArray();
``````
I think you can skip .ToCharArray() since string implements IEnumerable<char>.
compare to the python and you'll see why I'm glad I'm not writing .NET code anymore...
A:

in Lua:

``````a = {}
string.gsub("1234","%d",function (d) a[#a+1] = tonumber(d) end)
``````
+4  A:

In Delphi:

``````type
TIntArray = array of Integer;

function ConvertStringToIntArray(const aString: String): TIntArray;
var i: Integer;
begin
SetLength(Result, Length(aString));
for i := 1 to Length(aString) do
Result[i-1] := ord(aString[i]) -48;
end;
``````
In the line before last, I would have gone with StrToInt(aString[i]), just for clarity (the -48 business is handy and no doubt faster but not quite as readable).
You should consider saving the `Length(aString)` into a separate integer variable. Also write the loop from `0` to `Length(aString) - 1`. Comparison to 0 is slightly faster. Also addition is faster than subtraction.
+11  A:

``````map digitToInt
``````
That works on a list of strings, not a string of digits as the poster asked.
As Hugh says, it's not quite there... "map $ read . (:[]) :: String -> [Int]" does the right thing.
Pretty funny that this oh-so-easy Haskell solution turns out to look very messy :) no offense, I like Haskell!
fixed. and no, it's not messy at all, when done correctly.
I <3 point free programming! +1
+5  A:

F#

``````"123456789" |> Seq.map (string >> int)
``````
+12  A:

C#:

Linq really makes this task easy and Will's version can even be cut further.

``````from c in str select (int)(c - '0');
``````

Or, using the extension style:

``````str.Select(c => (int)(c - '0'));
``````

Or multithread it (I'll bet this is less code than any other language):

``````str.AsParallel().Select(c => (int)(c - '0'));
``````

KISS!
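
For readers outside .NET, the same "map across workers" idea can be sketched in Python with the standard library's `concurrent.futures`. The function name `parallel_digits` and the `workers` parameter are made up for this example, and for a task this small the thread overhead will almost certainly outweigh any gain, so treat it purely as an illustration:

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_digits(s, workers=4):
    """Convert each character of s to an int using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so the digits stay in sequence.
        return list(pool.map(int, s))
```

Calling `parallel_digits("124890")` gives the same list as the sequential `[int(c) for c in s]`.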

Hey, C# is explicit. While such things as "ToString" are, at an initial glance, overly wordy, at least they're self-descriptive. More than "to_s", or similar.
I've got nothing against methods like `ToString`, where applicable (I actually use it quite a lot, even in cases of string concatenation because I don't like the overloaded `+` operator for strings and numbers). Unnecessary in the above code, however.
You still need .ToArray() at the end to meet the requirements :)
@Lucas: not really. Most (interesting) examples shown here don't return an actual array but rather any array-like structure. And this is a *good* thing because we code to an interface, not an implementation, right?
Konrad - You could have at least bragged about the multicore functionality in plinq!
Thanks for the PLinq addition. But you underestimate other languages’ ability to multithread higher order functions. Actually, purely functional languages like Haskell should be able to multithread list mapping code automatically.
+5  A:

Here's a C++ version.

``````std::vector<int> result;
for (const char *digit = "124890";  *digit;  ++digit)
result.push_back(*digit - '0');
``````
Mark, shouldn't `result` be a `vector<int>`?
Oh, something else: This code has quadratic runtime because `strlen` is called in each iteration.
I realized my mistake on vector<int> and fixed it before I saw your comment. And no, strlen is not called for each iteration - the initializer copies it to a local called "count", only once.
Yes, it only calls strlen once. But why bother calling it at all? `for (size_t i = 0; digits[i]; ++i) result.push_back(digits[i] - '0');`
You should use const char* with string literals.
I have made it more compact using pointer arithmetics.
Could these modifications be one reason why people are moving away from C++? (It seems too complex!)
+3  A:

Here's another C++ version in a more modern style. I'm not sure it's an improvement.

``````int ConvertDigit(char ch)
{
return ch - '0';
}

std::string digits("124890");
std::vector<int> result;
result.resize(digits.size());
std::transform(digits.begin(), digits.end(), result.begin(), ConvertDigit);
``````
A:

In VB.NET:

```  Dim source As String = "123458796"
Dim data() As Char = source.ToCharArray
Dim res(source.length - 1) As Integer

for i as integer = 0 to Source.length - 1
res(i) = cint(microsoft.visualbasic.val(data(i)))
Next
```
Uuh. No offence, but this needs refactoring. Have a look at the C# solutions using one line.
.net 2.0, but you are right.
A:

One way to do it in Icon:

``````thelist := [] ; every put(thelist, !"124890" + 0)
# thelist is now [1, 2, 4, 8, 9, 0]
``````

Although if you don't really need a list of ints you can just index (from 1) the chars of the string as if they were ints - Icon is dynamically typed:

``````write(4 - "014890"[2]) # prints 3
``````
A:

It's not very succinct in VB6, but more so than VB.NET :)

``````foo$ = "124890"
ReDim thelist&(Len(foo$))
For i& = 1 To Len(foo$)
    thelist(i&) = Mid(foo$, i&, 1)
Next
``````
+10  A:

In C I wouldn't bother. :-) The "string" is already an "array" of characters, so you can iterate and manipulate it like any other array. To get the int value of a single character, just subtract '0' from the char value:

``````const char *myString = "12345";
int i;
for (i = 0; myString[i] != '\0'; i++) {
int myIntVal = myString[i] - '0';
printf("Integer: %d\n", myIntVal);
}
``````
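
The subtract-`'0'` trick carries over to languages where characters are not numbers, as long as you ask for the code point explicitly. A minimal Python sketch (the function name `digits_by_offset` is invented here, and it assumes the input contains only the characters 0 to 9):

```python
def digits_by_offset(s):
    """Map each digit character to its value by subtracting the code of '0'."""
    zero = ord("0")  # look the offset up rather than hard-coding 48
    return [ord(c) - zero for c in s]
```

`digits_by_offset("12345")` yields `[1, 2, 3, 4, 5]`. Like the C loop, it does no validation, so a non-digit character simply produces a meaningless number.
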
It's more readable if you use a character literal '0' instead of 48.
Another reason for using '0': On an EBCDIC machine, the value of '0' is 240, not 48.
Of course! Thanks for the pointer. ;-)
Nobody, and I mean NOBODY uses EBCDIC anymore. It has annoying things like 'i'+1 not equaling 'j'.
There is still a lot of EBCDIC code running out there! And even new code being written on EBCDIC machines ...
I took the liberty of actually editing the code to reflect that Toony seemed to agree that a literal '0' is better than the hardcoded ASCII value.
+3  A:

Ruby:

``````"123".bytes.collect { |d| d - 48 }  # 48 is the byte value of "0"
``````

C++, works for std::string and null terminated strings:

``````#include <vector>
#include <boost/foreach.hpp>

template<typename T>
std::vector<int> doit(const T& s)
{
std::vector<int> retval;
BOOST_FOREACH (char c, s) retval.push_back(c - '0');
return retval;
}
``````
Note that String#bytes is not available in Ruby versions prior to 1.9...
It is present in 1.8:irb(main):001:0> RUBY_VERSION=> "1.8.7"irb(main):002:0> "123".bytes.collect { |d| d - "0" }=> [1, 2, 3]
A:

Fortran 90 allows you to use strings instead of file handles in read/write operations, which lets you do things like this:

``````subroutine convert(string, intArray)
character(len=*),      intent(in)  :: string
integer, dimension(:), intent(out) :: intArray
integer :: ii
do ii = 1, len(string)
    read(string(ii:ii), '(I1)') intArray(ii)  ! internal read: parse one character as an integer
end do
end subroutine convert
``````

Note that to get around the allocation/buffer over-run problem, I am assuming that I'm compiling this with array bounds checking enabled.

A:

T-SQL doesn't have arrays, but you could represent it as a table

Here's a looping routine:

``````declare @s varchar(1000); set @s='124890';
declare @t table(i int)
declare @i int; set @i=0
while @i<len(@s) begin
set @i=@i+1;
insert @t(i) values(convert(int,substring(@s,@i,1)))
end
select * from @t
``````

Here's a direct set-based version, but it relies on having a numbers set:

``````select convert(int,substring('124890',i,1)) as i from (
select 1 as i union select 2 union select 3 union select 4 union select 5 union select 6
) j
``````

you could also use my SQL range function to do it like this:

``````select convert(int,substring('124890',n,1)) as i from dbo.Range(1,len('124890'),1)
``````
+5  A:

In Ruby 1.9, or on 1.8 if you are in Rails/have the symbol_to_proc gem, it becomes

``````"124890".split('').map(&:to_i) #=> [1, 2, 4, 8, 9, 0]
``````
+3  A:

Here's a Standard ML version using function composition (with output from SML/NJ):

``````- val digitsToString = (map (valOf o Int.fromString o Char.toString)) o explode;
val digitsToString = fn : string -> int list
- digitsToString "124890";
val it = [1,2,4,8,9,0] : int list
``````

How does it work? The types tell almost the whole story:

``````- explode;
val it = fn : string -> char list
- map;
val it = fn : ('a -> 'b) -> 'a list -> 'b list
- Char.toString;
val it = fn : char -> string
- Int.fromString;
val it = fn : string -> int option
- valOf;
val it = fn : 'a option -> 'a
- valOf o Int.fromString o Char.toString;
val it = fn : char -> int
``````
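
The same pipeline can be mimicked in Python to show how the pieces line up. Everything below (`compose`, `explode`, `from_string`, `value_of`) is a hypothetical stand-in written for this sketch, not a standard-library API; `Optional[int]` plays the role of `int option`:

```python
from functools import reduce
from typing import Callable, List, Optional


def compose(*fns: Callable) -> Callable:
    """Right-to-left composition: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), fns)


def explode(s: str) -> List[str]:
    """string -> char list"""
    return list(s)


def from_string(s: str) -> Optional[int]:
    """string -> int option (None when s is not made of ASCII digits)"""
    return int(s) if s.isascii() and s.isdigit() else None


def value_of(opt: Optional[int]) -> int:
    """'a option -> 'a (raises when there is no value)"""
    if opt is None:
        raise ValueError("no value")
    return opt


# Python characters are already one-character strings, so Char.toString has no counterpart.
char_to_int = compose(value_of, from_string)
digits_to_ints = compose(lambda cs: [char_to_int(c) for c in cs], explode)
```

`digits_to_ints("124890")` returns `[1, 2, 4, 8, 9, 0]`, and a non-digit character raises `ValueError`, loosely mirroring what `valOf` does on `NONE`.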
+5  A:

Lisp:

``````(loop for i across "12345" collect (digit-char-p i))
``````
+10  A:

In x86 assembly, if we have a null-terminated input and a correctly-sized, pre-allocated output, we can do:

`````` xor eax,eax
mov esi,dword ptr [source]
mov edi,dword ptr [dest]
_loop:
mov al,[esi]
test eax,0
jz _done
and al,0fh
mov [edi],eax
jmp _loop
_done:
``````

Sure it is a lot of lines, but the whole thing is 29 bytes, so that is pretty small.

+1  A:

Groovy:

``````def s = '124890'
def list = []

s.each{ list << Integer.parseInt(it) }
``````
Didn't see your solution when posting mine; I do however think that the 'collect' solution I used is a bit more concise in matters of scoping and number of characters.
I agree p3t0r that yours is cooler... very nice
+3  A:

Fast C implementation that does in-place replacement (since char is both a character and a number (byte)):

``````void toNumbers(char *digits) {
int len = strlen(digits);
char *end = digits + len;
char *next;

// Process 4 characters at a time
while ((next = digits + 4) <= end) {
*((long *)digits) -= 0x30303030;
digits = next;
}

// Handle remaining characters
switch (len & 3) {
case 3:
*((short *)digits) -= 0x3030;
digits += 2;
case 1:
*digits -= 0x30;
break;
case 2:
*((short *)digits) -= 0x3030;
}
}
``````
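
The in-place idea (overwrite each character byte with its numeric value) can also be sketched in Python, with the caveat that `str` is immutable there, so this example assumes the digits arrive as a mutable `bytearray`. The function name is made up, and it works one byte at a time rather than four:

```python
def to_numbers_in_place(buf: bytearray) -> None:
    """Replace each ASCII digit byte in buf with its numeric value, in place."""
    for i in range(len(buf)):
        # 0x30 is the ASCII code of '0'; a byte below 0x30 would raise ValueError here.
        buf[i] -= 0x30
```

Usage: after `buf = bytearray(b"124890")` and `to_numbers_in_place(buf)`, `list(buf)` is `[1, 2, 4, 8, 9, 0]`.
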
A:

JavaScript 1.6:

``````str.split("").map(function (d) { return parseInt(d); })
``````

Update:

Or more concisely:

``````str.split("").map(Number)
``````
A:

In Groovy:

``````"12345678".collect{it.toInteger()}
``````
A:

Scheme:

``````(map
(lambda (c)
(- (char->integer c) (char->integer #\0)))
(string->list "12345"))
``````
A:

In XQuery:

``````for $c in string-to-codepoints("124890")
return xs:integer(codepoints-to-string($c))
``````
A:

In Icon:

``````
a:=[]
every put(a,integer(!s))
return a
``````
A:

In Oz:

``````{Browse {Map "123456789" fun{$ X} X - &0 end}}
``````
+4  A:

Two versions in JavaScript:

``````myString.split("").map(Number)
``````

A second version which is more concise when you amortise it over the entire length of your code base:

``````// During initialisation of your app
String.prototype.map = Array.prototype.map
...
// Later
myString.map(Number)
``````
Passing the Number to the map() function is a nice trick! Thumbs up!
So brutal though -- Ignoring entirely that JS will do a toNumber conversion on each character in math -- i decided not to suggest relying on that though because that does not work for addition. Eg. 2*"2"==4, but 2+"2"=="22" which could result in.. interesting behaviour :D
A:

In Erlang (since strings are lists of integers) you only need to go from ASCII-value to integer value by subtracting the ASCII value of '0'. In its shortest form:

```lists:map( fun(X) -> X-$0 end, "1234").
```

If you want a check for validity preventing conversion of non-base-10 characters and make it convenient to use:

`````` Convert = fun(L) -> lists:map( fun(X) when is_integer(X), X>=$0, X=<$9 -> X-$0 end, L) end.
``````

Use as:

```74> Convert("10823472").
[1,0,8,2,3,4,7,2]
```

Now for something invalid:

```75> Convert("aap").

=ERROR REPORT==== 8-Oct-2008::12:12:36 ===
Error in process  with exit value: {function_clause,[{erl_eval,'-inside-a-shell-fun-',"a"},{erl_eval,expr,3}]}

** exited: {function_clause,[{erl_eval,'-inside-a-shell-fun-',"a"},
{erl_eval,expr,3}]} **
```

(if it fails, it fails big ;))
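
The same fail-loudly validation can be sketched in Python. The function name `convert` just echoes the Erlang example, and raising `ValueError` on the first bad character is this sketch's choice, not something the Erlang code dictates:

```python
def convert(s):
    """Convert a string of base-10 digits to a list of ints, rejecting anything else."""
    digits = []
    for ch in s:
        # Only the ASCII characters '0'..'9' are accepted.
        if not "0" <= ch <= "9":
            raise ValueError(f"not a base-10 digit: {ch!r}")
        digits.append(ord(ch) - ord("0"))
    return digits
```

`convert("10823472")` returns `[1, 0, 8, 2, 3, 4, 7, 2]`, while `convert("aap")` raises `ValueError` at the first character.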

A:

In idiomatic REBOL it's

``````nums: copy []
forall c "124890" [
append nums to-integer to-string c
]
``````

However, one can easily create a map function in REBOL and use it for this purpose:

``````map: func [f [any-function!] s [series!] /local result] [
result: copy []
forall s [
append/only result f first s
]
result
]

map func [c] [to-integer to-string c] "124890"
``````

I have to admit that this is not one of REBOL's strong points, although more advanced Rebollers than I may know an even more succinct way.

+1  A:

In Clojure:

``````(map #(new Integer (str %1)) "124890")
``````

Oh, and while we're at it, twk asked for a multi-threaded version above. Here's one:

``````(pmap #(new Integer (str %1)) "124890")
``````
A:

Yeah I think Ruby's is pretty:

'124890'.scan(/\d/).map(&:to_i)

A:

In Scheme:

``````(map (compose string->number string) (string->list "12345"))
``````
+7  A:

In Perl, assuming valid input in `$str`:

``````@digits = split //, $str;
``````

This works because Perl, like PHP, unifies string and numeric representation.

PHP actually has separate integer and string types.
+1  A:

I bet I can multi-thread mine in fewer lines than your language does....

``````var nums = str.Select(c => (int)(c - '0'));
``````

And here it is taking advantage of all cores:

``````var nums = str.AsParallel().Select(c => (int)(c - '0'));
``````

pwned.

:)

+8  A:

Do you really need such high-falutin' tools for such a simple task?

Commodore-64's BASIC.

``````10 REM CONVERT STRING, S$, INTO ARRAY OF INTEGERS, A%
20 S$ = "02489"
30 L = LEN(S$)
40 DIM A%(L)
50 FOR X = 1 TO L
60 A%(X) = VAL(MID$(S$, X, 1))
70 NEXT X
``````

Untested and from memory.

(Stack Overflow's Syntax Highlighting doesn't have a C-64 BASIC mode? For shame!)

You forgot 80 PRINT "BOOBIES" 90 GOTO 80
no, I would add 80 sys 64738
+1 for the tears in my eyes: Poke53280,0:Poke53281,0:Poke646,1
A:

I'll bite...

Here it is in Scala:

``````digits.map(_.toInt)
``````

And the concurrent form using Debasish's `pmap` implementation plus a little implicit conversion magic:

``````digits.pmap(_.toInt)
``````
A:

Smalltalk is still alive ;-)

``````'124890' asArray collect:[:c | c digitValue]
``````

alternative:

``````'124890' asArray map:#digitValue
``````
+1  A:

Mathematica:

``````ToCharacterCode["01234"] - 48
``````

(48 is the character code for "0")

This works because `ToCharacterCode` returns a list of the character codes for each character in the string and then subtracting a number from a list subtracts the number from each element of the list.

+3  A:

PowerShell:

"0123456789".ToCharArray() | %{$_-48}

As with other examples, 48 is the character code for '0'.

A:

Several ways to do it in FORTRAN. One of the easiest to understand is EQUIVALENCE.

``````INTEGER * 4 strTest(20) ! This is the string as an array of 4 byte integers
BYTE bTest(80) ! This is an array of bytes
EQUIVALENCE (strTest(1), bTest(1))
``````

All character strings in FORTRAN may be represented as an array of values. The easiest ones to deal with are INTEGER*2, INTEGER*4 AND LOGICAL*1. Almost all of the FORTRAN compilers will deal with all of these. There are some exceptions for the LOGICAL. The code displayed is for a DEC machine. This code will work on everything from the PDP machines to the DEC Alpha. BYTE is a DEC specific data type. LOGICAL*1 will work on most machines, with the notable exception being IBM since it's not byte addressable.

+1  A:

4 characters in J:

``````"."0
``````

i.e.

``````   "."0 '0123874'
0 1 2 3 8 7 4
``````

J will never get points for readability, however.

A:

Common Lisp (`string` holding the string of digit characters):

``````(map 'vector #'digit-char-p string)
``````
+1  A:

Bash:

``````s="124890"
for i in $(seq 0 1 $((${#s}-1)))
do
    arr[$i]=${s:$i:1}
done
``````
A:

ActionScript 3.0:

``````var string:String = "01234";
var array:Vector.<int> = new Vector.<int>()
var i:int = 0;
while (i < string.length)
{
array.push(string.slice(i, ++i));
}
``````
A:

Another one In c#

``````    string str = "124890";
var res = str.ToCharArray().Select(x => Convert.ToInt32(((char)x).ToString())).ToArray();
``````

Cheers

Ramesh Vel

A:

IA32

Using the same assumptions about the input that DocMax made, the code can be reduced to:

``````    mov esi,source
mov edi,dest
l1:
movzx eax,byte ptr [esi]
inc esi
sub al,'0'
jc done
stosd
jmp l1
done:
``````

which is 17 bytes.

Skizz

+1  A:

C# & VB.net using linq...

``````var result = "123456789".ToCharArray().Select(
c => int.Parse(c.ToString())).ToArray();
``````

or

``````Dim result = "123456789".ToCharArray().Select(
Function(c) Integer.Parse(c.ToString())).ToArray()
``````