I'm trying to read files and create a hashmap of the contents, but I'm having trouble at the parsing step. An example of the text file is:

put 3
returns 3
between
 3
paragraphs 1
4
 3
#foo 18
****** 2

The word becomes the key and the number becomes the value. Notice that the spacing is fairly erratic. The key isn't always a word (so /\w+/ doesn't pick it up), and the number associated with it isn't always on the same line. This is why I'm calling the file not well-formed. If each line held exactly one word and one number, I could just split it, but unfortunately that isn't the case. I'm trying to create a hashmap like this:

{"put"=>3, "#foo"=>18, "returns"=>3, "paragraphs"=>1, "******"=>2, "4"=>3, "between"=>3}

In Java, this would be fairly easy: using Scanner, I could call scanner.next() for the next key and scanner.nextInt() for the number associated with it. I'm not quite sure how to do this in Ruby, where it seems I have to use regular expressions for everything.

+2  A: 

I'd recommend just using split, as in:

h = Hash[*s.split]

where s is your text (e.g. s = File.read('filename')). Believe it or not, this will give you precisely what you're after, because split with no arguments splits on any run of whitespace, including newlines.

EDIT: I realized you wanted the values as integers. You can add that as follows:

h.each{|k,v| h[k] = v.to_i}
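
For reference, the two steps above can also be combined into one pass. This is only a minimal sketch, not a replacement for the answer: it assumes the file contains an even number of whitespace-separated tokens alternating key, value, and it inlines the sample text from the question (with "paragraphs" spelled as in the expected hash) instead of reading a file.

```ruby
# Sample text from the question, inlined so the sketch runs without a file.
s = <<~TEXT
  put 3
  returns 3
  between
   3
  paragraphs 1
  4
   3
  #foo 18
  ****** 2
TEXT

# String#split with no arguments splits on any run of whitespace,
# newlines included, so a value on the following line is still paired
# with the key that precedes it.
tokens = s.split

# Group consecutive tokens into [key, value] pairs and convert each
# value to an Integer while building the hash.
h = tokens.each_slice(2).map { |word, count| [word, count.to_i] }.to_h

p h
```

Note that to_i returns 0 for a token that isn't numeric, so a malformed file would not raise an error here; Integer(count) could be used instead if a failure is preferable.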
Peter
Brilliant! I didn't know about the splat operator (*) to perform something like that. Thanks!
Marcos