txt = "1aaa5"

Then

txt.split("a") produces ["1", "", "", "5"] in Ruby 1.9. Can anyone explain why? In particular, why not ["1", "5"]? Thanks.

A: 

Your delimiter "a" is present 3 times. Try splitting on "aaa" instead.

Beanish
+10  A: 

Because it's split on every instance of "a":

  1. 1aaa5 splits into 1 and aa5
  2. aa5 splits into "" and a5
  3. a5 splits into "" and 5

So the result is "1", "", "", "5".

Use /a+/ or "aaa" instead:

irb(main):002:0> txt.split(/a+/)
=> ["1", "5"]
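For comparison, here is a small sketch of the options side by side. The variable names are only for illustration, and the reject(&:empty?) variant is an extra alternative that this answer does not mention.

```ruby
txt = "1aaa5"

# Each single "a" is a separator, so the two gaps between adjacent
# separators come back as empty strings.
each_a = txt.split("a")                      # ["1", "", "", "5"]

# A regexp matching a run of one or more "a" treats "aaa" as one separator.
run_of_a = txt.split(/a+/)                   # ["1", "5"]

# Splitting on the literal "aaa" only helps when the run is exactly three long.
literal = txt.split("aaa")                   # ["1", "5"]

# Keep the single-character delimiter and discard the empty fields afterwards.
no_blanks = txt.split("a").reject(&:empty?)  # ["1", "5"]

p each_a, run_of_a, literal, no_blanks
```

The regexp form is usually the safer choice when the length of the run is not known in advance.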
S.Mark
+3  A: 

Because your delimiter is a, and Ruby can't guess that you don't want empty entries. Consider your example, but replace the a character with a comma.

txt = "1,,,5"

In my world, I might consider that 4 columns with zero values implied for the middle two. I certainly wouldn't want it to remove the empty entries, because then if there weren't 4, I wouldn't know which ones should be zeros.

Because it can't guess, it returns both the empty "fields" and non-empty "fields" in the array. Use @S.Mark's solution if you need it to omit the empty "fields."
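A minimal sketch of that column reading, assuming every non-empty field is an integer and that an empty field should count as zero (both are assumptions made for the example, not something split does for you):

```ruby
row = "1,,,5"

# split keeps the empty fields, so column positions are preserved.
fields = row.split(",")                                 # ["1", "", "", "5"]

# Assumption for this sketch: an empty field means zero.
values = fields.map { |f| f.empty? ? 0 : Integer(f) }   # [1, 0, 0, 5]

# Collapsing runs of commas loses the information about which columns were empty.
collapsed = row.split(/,+/)                             # ["1", "5"]

p fields, values, collapsed
```

With the collapsed form there is no way to tell afterwards that "5" was the fourth column.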

tvanfosson
+2  A: 

The behaviour you're seeing makes sense. When you call string.split("a") you're saying "use 'a' as the delimiter and give me an array of the values between the delimiters". Between the first 'a' and the second 'a' in txt the value is an empty string; the same goes for the value between the second and third 'a'. That's why you see ["1", "", "", "5"].

It's as if txt were 1,,,5 and you chose ',' as the delimiter. If someone asked what values are in the list it'd be:

  1. 1
  2. empty
  3. empty
  4. 5
Allen George
+2  A: 

When you call split, you're passing a delimiter which, by nature, is removed from the string when it's split.

Take, for example, the following:

s = ",foo"

When you call s.split(","), you're saying "Take everything on the left side of the comma and put it in its own array entry, then take everything on the right side of the comma and put it in the next entry, ignoring the comma itself". The function sees "everything on the left of the comma" as "", not as nothing.

So your string follows the following pattern:

1aaa5
1, aa5
1, '', a5
1, '', '', 5

Which explains why there are two empty strings, and not just ["1", "5"].
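The ",foo" case can be checked directly. The sketch below also shows a related detail not covered in this answer: without a limit argument, Ruby drops trailing empty strings, while a negative limit keeps them. The "foo," input is made up for the illustration.

```ruby
s = ",foo"

# The text to the left of the comma is the empty string, and it is kept.
leading = s.split(",")          # ["", "foo"]

# Trailing empty fields behave differently: they are removed by default.
trailing = "foo,".split(",")    # ["foo"]

# A negative limit tells split not to suppress trailing empty fields.
kept = "foo,".split(",", -1)    # ["foo", ""]

p leading, trailing, kept
```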

Jeriko
A: 

If you don't specify a pattern to split on, split splits on whitespace. So, in addition to the other solutions, you could do

txt = "1aaa5"
txt.gsub('a',' ').split
=> ["1", "5"]

(if the text doesn't contain relevant whitespace).
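To see why the whitespace caveat matters, here is a short sketch using a made-up input that already contains a space:

```ruby
txt = "1 2aaa5"

# Replacing "a" with a space and splitting on whitespace also splits
# on the space that was already part of the data.
via_whitespace = txt.gsub("a", " ").split   # ["1", "2", "5"]

# Splitting on the run of "a" leaves the original space alone.
via_regexp = txt.split(/a+/)                # ["1 2", "5"]

p via_whitespace, via_regexp
```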

steenslag