ansaurus

Question

Regex use in reading file .txt and returning information to screen

Answer 1

+1 A:

First of all, in that example data you listed, it appears that there aren't any lines which contain "CLOWN" and "112". I'm going to assume for the rest of this answer that the course number you are interested in is "110".

This line appears to be your problem:

line.scan~(/department/&&/classnumber/)

A useful debugging tool is to try and reduce your problem to a small test case. In Ruby and other scripting languages, it can be helpful to play with that test case in an interactive shell like irb. Let's try that in irb, with some mockup data so our variables are defined:

>> department = "CLOWN"
=> "CLOWN"
>> classnumber = "110"
=> "110"
>> line = "342 1936 CLOWN 110 ON HD CLOWN MAKE-CLASS 5.0 5.0 KRUSTY 798 MTWTh 7:30A 8:30A 24 13 11 4.3"
=> "342 1936 CLOWN 110 ON HD CLOWN MAKE-CLASS 5.0 5.0 KRUSTY 798 MTWTh 7:30A 8:30A 24 13 11 4.3"
>> line.scan~(/department/&&/classnumber/)
TypeError: wrong argument type nil (expected Regexp)
 from (irb):4:in `scan'
 from (irb):4
 from :0

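OK, so there are a few problems. The first is the stray ~ after scan. Ruby reads it as a unary operator applied to the argument, which evaluates to nil here, hence the TypeError. The method is just scan: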

>> line.scan(/department/&&/classnumber/)
=> []

Hmm. Not an error this time, but still no result. Let's see what the components of that are doing. What we're doing in this line is computing /department/&&/classnumber/, and then passing the result of that to the scan method on the line string.

>> /department/&&/classnumber/
=> /classnumber/

Interesting. That just gives us the second regular expression that we passed in. Why is that? The && operator takes two expressions and evaluates the first. If that value is false or nil, && returns it without evaluating the second expression; otherwise it evaluates the second expression and returns its value. Every value in Ruby except false and nil is treated as true. Since neither of these regular expressions is false or nil, both count as true, and the result of the whole expression is the second operand, /classnumber/.
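If you want to convince yourself of that rule, here is a small sketch you can paste into irb or run as a script. The variable names are only for illustration:

```ruby
# && returns its first operand when that operand is falsy (nil or false);
# otherwise it returns its second operand.
first  = /department/
second = /classnumber/

both       = first && second  # first is truthy, so the result is second
nil_first  = nil && second    # nil is falsy, so the result is nil
false_last = first && false   # first is truthy, so the result is false

p both
p nil_first
p false_last
```

So && never merges two regular expressions into one; it only picks one of its operands.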

But even given that the first regular expression is being ignored, and only the second is being used, why doesn't this work?

>> line.scan(/classnumber/)
=> []

When you write the regular expression /classnumber/, you are looking for the literal characters classnumber in your string. For instance:

>> "string containing classnumber".scan(/classnumber/)
=> ["classnumber"]

What you want to be looking for, however, is the value of the variable classnumber. There are a couple of ways to go about this. You could just pass that string in to scan:

>> line.scan(classnumber)
=> ["110"]

Or, you can build a regular expression by interpolating your classnumber variable into it:

>> line.scan(/#{classnumber}/)
=> ["110"]

Now, you have something working. But you still want to match against the department too. How can you combine the two? You could just interpolate them into the same regexp:

>> line.scan(/#{department} #{classnumber}/)
=> ["CLOWN 110"]

Note that I added a space in the middle to match the space between the department and the course number in the input. Depending on your data format, you may want /#{department} +#{classnumber}/ to allow one or more spaces, or /#{department}.*#{classnumber}/ to allow any run of characters in between; you'll have to make that call yourself.
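Here is a rough comparison of those separator choices as a script. The Regexp.escape calls are my own addition (they keep user input such as "." or "+" from being read as regexp syntax), and the spaced string is made up to show where the patterns differ:

```ruby
line = "342 1936 CLOWN 110 ON HD CLOWN MAKE-CLASS 5.0 5.0 KRUSTY 798 MTWTh 7:30A 8:30A 24 13 11 4.3"
department  = "CLOWN"
classnumber = "110"

# Escape the input so regexp metacharacters are matched literally.
dept = Regexp.escape(department)
num  = Regexp.escape(classnumber)

single_space = line.scan(/#{dept} #{num}/)   # exactly one space
many_spaces  = line.scan(/#{dept} +#{num}/)  # one or more spaces
anything     = line.scan(/#{dept}.*#{num}/)  # any characters in between

# Invented input with three spaces between the two fields.
spaced = "CLOWN   110"
strict_on_spaced = spaced.scan(/#{dept} #{num}/)   # no match
loose_on_spaced  = spaced.scan(/#{dept} +#{num}/)  # matches

p single_space, many_spaces, anything
p strict_on_spaced, loose_on_spaced
```

On the sample line all three patterns find the same text; they only diverge once the spacing in the data varies.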

Oh, and if you want to be getting the whole line, you're going to need to add something to match the text before and after the department and class number:

>> line.scan(/.*#{department} #{classnumber}.*/)
=> ["342 1936 CLOWN 110 ON HD CLOWN MAKE-CLASS 5.0 5.0 KRUSTY 798 MTWTh 7:30A 8:30A 24 13 11 4.3"]
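If all you need is the matching lines, you may not need scan at all. Below is a minimal sketch that filters an array of lines with Enumerable#grep instead. The helper name matching_lines, the trailing \b, and the second sample line are my own inventions, not something from your code or data:

```ruby
# Hypothetical helper: returns the lines in which the department is
# followed by the class number, separated by one or more spaces.
def matching_lines(lines, department, classnumber)
  # \b stops "11" from matching the start of "110".
  pattern = /#{Regexp.escape(department)} +#{Regexp.escape(classnumber)}\b/
  lines.grep(pattern)
end

# The first line is the mockup from above; the second is invented for contrast.
sample = [
  "342 1936 CLOWN 110 ON HD CLOWN MAKE-CLASS 5.0 5.0 KRUSTY 798 MTWTh 7:30A 8:30A 24 13 11 4.3",
  "343 1937 CLOWN 112 ON HD PIE THROWING 5.0 5.0 KRUSTY 798 MTWTh 9:30A 10:30A 24 20 4 4.3"
]

puts matching_lines(sample, "CLOWN", "110")
```

The same pattern should work on a real file through File.foreach(path).grep(pattern), though I have not tried that against your data.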

Anyhow, I think that's about it. You can now match against the department and class number that have been input; and if you followed the steps I used to deconstruct your problem, you might be able to use a similar technique to isolate and solve problems in the future.

Brian Campbell 2009-11-22 07:49:54

Thanks for the IRB explanation, good stuff. I tried IRB but went in the wrong direction.

Matt 2009-11-22 20:16:16

Answer 2

A:

I am not sure, but you probably want

File.foreach('enrollment.txt') { |line| puts line.scan(/.*#{department} #{classnumber}.*/) }

Edit: There are other problems in your code (class_number vs classnumber) and it is not very Ruby-ish. Try this:

#!/usr/bin/ruby -w
# vim: set fileencoding=utf-8 :

loop do
  puts "Type search if you want to look for class information, exit to exit"
  case gets.to_s.chomp
  when /search/i
    puts 'Enter the 3 letter department you are looking for!(Example: CLOWN)'
    department = gets.to_s.chomp
    puts "You have entered #{department}"

    puts "Enter the 3 digit class number you are looking for!(Example: 112)"
    class_number = gets.to_i
    puts "You have entered #{class_number}"

    # File.foreach reads one line at a time and closes the file when done
    File.foreach('enrollment.txt') do |l|
      puts l if l =~ /#{department}/ && l =~ /#{class_number}/
    end
  when /exit/i
    puts "Exiting"
    break
  else
    puts 'Command not understood'
  end
end
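One caveat about the two separate matches above: they accept a line where the department and the number appear anywhere, not necessarily next to each other. A rough sketch of the difference follows. Both method names and the sample line are invented, and the stricter version assumes the number directly follows the department in your file:

```ruby
# Loose check, as in the loop above: both values may appear anywhere.
def loose_match?(line, department, class_number)
  !!(line =~ /#{department}/ && line =~ /#{class_number}/)
end

# Stricter check (an assumption about the file layout): the class number
# must directly follow the department, separated by one or more spaces.
def strict_match?(line, department, class_number)
  !!(line =~ /\b#{Regexp.escape(department)} +#{class_number}\b/)
end

# Invented line: department CLOWN, class 112, with "110" in another column.
tricky = "110 1937 CLOWN 112 ON HD PIE THROWING 5.0 5.0 KRUSTY"

p loose_match?(tricky, "CLOWN", 110)
p strict_match?(tricky, "CLOWN", 110)
p strict_match?(tricky, "CLOWN", 112)
```

Whether the stricter form is right depends on how enrollment.txt is actually laid out.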

anshul 2009-11-22 07:51:41

Matt 2009-11-22 20:17:59
