I have 5 files, file1.txt, file2.txt, ..., file5.txt, and a list of 3 words: red, white, blue.

I am trying to find out how many times, and in which files, red, white and blue occur.

The final output format should be:

red = file1.txt, file3.txt, 2
white = file2.txt, 1
blue = file1.txt, file2.txt, file3.txt, 3

This is what I have so far:

files.each do |i|
    curfile = File.new("#{i}","r")
    while (line = curfile.gets)
        mywords.each do |j|
           if (line =~ /\b#{j}\b/)
               ##what kind of data structure should I put the results in??
           end
        end
    end
end

What kind of data structure should I put the results in?

A: 
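require 'strscan' # StringScanner lives in the strscan standard library

results = {}
%w(red white blue).each do |word|
  # Per-word counter: filename => number of matches, defaulting to 0
  results[word] = Hash.new(0)
  %w(file1.txt file2.txt file3.txt file4.txt file5.txt).each do |file|
    scanner = StringScanner.new(File.read(file))
    # Each successful scan_until advances past one whole-word match
    while scanner.scan_until(/\b#{word}\b/)
      results[word][file] += 1
    end
  end
end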

This builds a hash where the keys are the words and the values are hashes mapping each filename to the number of matches in that file (files with no match do not appear):

{'red' => {'file1.txt' => 1, 'file2.txt' => 2}, 'blue' => {'file1.txt' => 1}} 
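
To get from a nested hash like this to the format asked for in the question, something along these lines might work. It is only a sketch: `format_results` is a made-up helper name, and it assumes the trailing number is the total number of occurrences across all files (the question could also be read as the number of files).

```ruby
# Hypothetical helper: turns {word => {file => count}} into report lines.
def format_results(results)
  results.map do |word, counts|
    # Keep only the files where the word actually occurred.
    files = counts.select { |_file, n| n > 0 }.keys
    # Assumption: the last field is the total number of occurrences.
    total = counts.values.sum
    "#{word} = #{(files + [total]).join(', ')}"
  end
end

sample = {
  'red'   => { 'file1.txt' => 1, 'file3.txt' => 1 },
  'white' => { 'file2.txt' => 1 }
}
puts format_results(sample)
```

For the sample hash this prints `red = file1.txt, file3.txt, 2` and `white = file2.txt, 1`.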
Rida Al Barazi
It may be possible to give `results` auto-vivification, such that you don't need `results[word] = Hash.new(0)`.
Andrew Grimm
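Yeah, I believe it's possible with the block form: `results = Hash.new { |hash, key| hash[key] = Hash.new(0) }`. A plain `results = Hash.new(Hash.new(0))` would not work here, because every missing key would share the same inner hash.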
Rida Al Barazi
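
A small sketch of the auto-vivification idea discussed in these comments, using only core `Hash.new`. The block form stores a fresh counter hash for each new word; passing a default object instead hands out one shared inner hash and never stores it under a key. The variable names are illustrative only.

```ruby
# Block form: each missing word gets its own counter hash, stored on first access.
results = Hash.new { |hash, word| hash[word] = Hash.new(0) }
results['red']['file1.txt'] += 1
results['red']['file1.txt'] += 1
results['blue']['file2.txt'] += 1

# Default-object form: a single inner hash is shared by every missing key
# and is never assigned into the outer hash.
shared = Hash.new(Hash.new(0))
shared['red']['file1.txt'] += 1

p results        # {"red"=>{"file1.txt"=>2}, "blue"=>{"file2.txt"=>1}}
p shared.empty?  # true: nothing was stored under 'red'
```

With the block form, the explicit `results[word] = Hash.new(0)` line in the answer is no longer needed.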
+1  A: 

I was able to do this with the following code:

mystring = ""
colors = %w{red white blue}
# One (initially empty) string of matching file names per color
final_list = Array.new(colors.size) { "" }
files.each do |i|
    File.open("#{i}","r") { |f|
       mystring = f.read
    }
    colors.each_with_index do |thing,index|
       pattern = /#{thing}/i
       if (mystring =~ pattern)
           final_list[index] = final_list[index] + i + " "
       end
    end
end

colors.each_with_index do |thing,index|
    list = final_list[index].split(" ")
    puts "#{thing} (#{list.size})= #{list.join(',')}"
end
learn_plsql