It would be helpful to pick up Ruby for my new gig, so this morning I wrote the following. It takes a PGN file of chess games I've played and indexes them by first move. I'd appreciate any suggestions on how to make it more "idiomatic".

Since it doesn't take command-line arguments (such as for the filename) and isn't object-oriented, suggestions along those lines are certainly welcome.

Bear in mind, I'm creating an index of all of the moves (not just first moves) from all of the games, because I'd like to eventually index on more than just the first move.

Data follows code.

games = []
file = File.new("jemptymethod.pgn", "r")

is_header = false
is_score = false

Game = Struct::new(:header, :score)

while (line = file.gets)
  if !line.chomp.empty?
    if !is_score && !is_header
      game = Game::new('','')
    end
    if /^\[/.match(line)
      is_header = true
      game.header << line
    else
      is_score = true
      game.score << line
    end
  else
    if is_score
      is_score = false
      is_header = false
      games << game
    end
  end
end

file.close
puts "# Games: " + games.length.to_s
moves_index = {}
first_moves = {}

games.each { |gm|
  #the following output should essentially be lossless
  #with the possible exception of beginning or ending newlines
  #
  #puts gm.header + "\n"
  #puts gm.score + "\n"

  score_tokens = gm.score.split(/\s+/);
  game_moves = []

  score_tokens.each_index{|i|
    if i%3 != 0
      move_token = score_tokens[i]
      if !moves_index.has_key?(move_token)
        moves_index[move_token] = moves_index.keys.length
      end
      game_moves << moves_index[move_token]
    end
  }

  first_move = moves_index.index(game_moves[0])

  if !first_moves.has_key?(first_move)
    first_moves[first_move] = 1
  else
    first_moves[first_move] = 1 + first_moves[first_move]
  end
}

# sorting hashes by value: http://nhw.pl/wp/2007/06/11/sorting-hash-by-values
first_moves.sort{|a,b| -1*(a[1]<=>b[1])}.each{|k,v|
  puts "1. #{k} occurred #{v} times" 
}

Data (just 3 games, I've been working with 25):

[Event "Enough With the Draws Already ;)"]
[Site "http://www.queenalice.com/game.php?id=533406"]
[Date "2009.2.1"]
[Round "-"]
[White "Troy"]
[Black "jemptymethod"]
[Result "1/2-1/2"]
[WhiteElo "1300"]
[BlackElo "2076"]
[ECO "C36"]

1. e4 e5 2. f4 exf4 3. Nf3 Be7 4. Bc4 Nf6 5. Qe2 d5 6. exd5 Nxd5 7. O-O Be6 8.
d4 Nc6 9. Nc3 O-O 10. Nxd5 Bxd5 11. Bxd5 Qxd5 12. Bxf4 Bd6 13. Qd2 Rae8 14. Bxd6
Qxd6 15. Rae1 h6 16. c3 Qd5 17. b3 Qa5 18. h3 a6 19. Rf2 Re7 20. Rxe7 Nxe7 21.
Ne5 Nd5 22. c4 Qxd2 1/2-1/2

[Event "AUTO-MASTER-620"]
[Site "http://www.queenalice.com/game.php?id=545265"]
[Date "2009.2.23"]
[Round "2"]
[White "testouverture"]
[Black "jemptymethod"]
[Result "1/2-1/2"]
[WhiteElo "2240"]
[BlackElo "2179"]
[ECO "A52"]

1. d4 Nf6 2. c4 e5 3. dxe5 Ng4 4. Nf3 Bc5 5. e3 Nc6 6. Be2 O-O 7. O-O Re8 8. b3
Ngxe5 9. Bb2 Nxf3+ 10. Bxf3 Ne5 11. Nc3 a5 12. Ne4 Bf8 13. Bh5 Ra6 14. f4 Ng6
15. Ng5 d5 16. Nxf7 Kxf7 17. f5 Kg8 18. fxg6 hxg6 19. Qd4 Qe7 20. Bf3 dxc4 21.
Qxc4+ Be6 22. Qc3 c6 23. Be2 Raa8 24. Bd3 Bf5 25. Bxf5 gxf5 26. Rf3 Qc5 27. Re1
Qxc3 28. Bxc3 g6 29. g4 Bg7 30. Bxg7 fxg4 31. Rg3 Kxg7 32. Rxg4 Rad8 33. Kf2
1/2-1/2

[Event "AUTO-MASTER-620"]
[Site "http://www.queenalice.com/game.php?id=545266"]
[Date "2009.2.23"]
[Round "2"]
[White "jemptymethod"]
[Black "testouverture"]
[Result "0-1"]
[WhiteElo "2079"]
[BlackElo "2306"]
[ECO "B22"]

1. e4 c5 2. c3 d5 3. exd5 Qxd5 4. d4 Nc6 5. dxc5 Qxd1+ 6. Kxd1 e5 7. Be3 Nf6 8.
b4 a5 9. b5 Ne7 10. Nf3 Ng4 11. Bc4 Nf5 12. Ke2 Nfxe3 13. fxe3 Bxc5 14. h3 Nxe3
15. Nxe5 f6 0-1
+2  A: 

Here's a quick sketch of how I would do this. There may be a lot to digest in here, so feel free to ask questions, but reading the Ruby Array and Enumerable documentation should answer most questions about what I did, and there are plenty of good tutorials on Ruby classes. A good one will also explain the accessors I used in the class here instead of the Struct.

class Game
  attr_accessor :header, :score

  def initialize
    self.header = []
    self.score = ''
  end

  # each element holds both players' moves for one move number, e.g. "e4 e5"
  def moves
    # splitting on the move numbers so that we don't have to iterate through to remove them
    parts = score.split(/\d+\.\s*/).map { |part| part.strip }
    parts.shift # getting rid of first empty move since the split on '1. ' created an array element before the '1. '
    parts
  end
end

games = []
game = Game.new
File.open('jemptymethod.pgn') do |file| # the block form closes the file when it finishes
  file.each_line do |line|
    next if line.chomp.empty?
    if /^\[/.match(line)
      unless game.score.empty? # a header line after a score means a new game has started
        games << game
        game = Game.new
      end
      game.header << line
    else
      game.score << line # a score can span several lines, so collect them all before splitting
    end
  end
end
games << game # add last game since the first part of the file loop doesn't execute again to do it

puts "# Games: " + games.length.to_s

first_moves = games.map {|g| g.moves[0]} # Could easily iterate over the size of the longest game to get other moves (eg second move, etc)
first_moves_count = first_moves.inject(Hash.new(0)) {|h, move| h[move] += 1; h} # Read ruby documentation on inject to see how this works
first_moves_count.each do |move, count|
  puts "1. #{move} occurred #{count} times"
end
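
The comment about getting "other moves" could be sketched roughly as follows. This is only an illustration: tally_nth_move is a hypothetical helper that is not part of the code above, and the sample data is made up (loosely based on the openings in the question's games).

```ruby
# Hypothetical helper: tally whatever was played at position n across games.
# `games_moves` is an array of move lists, one list per game.
def tally_nth_move(games_moves, n)
  games_moves
    .map { |moves| moves[n] }   # nil when a game has fewer than n + 1 moves
    .compact                    # skip games that ended earlier
    .inject(Hash.new(0)) { |counts, move| counts[move] += 1; counts }
end

# Made-up sample data: each inner array is one (truncated) game
sample = [
  %w[e4 e5 f4],
  %w[d4 Nf6 c4],
  %w[e4 c5]
]

first = tally_nth_move(sample, 0)
third = tally_nth_move(sample, 2)
puts first.inspect
puts third.inspect
```

Passing 0 gives the same tally as the inject line above. Larger indexes simply skip any game that is too short to have a move at that position.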
mmrobins
+1  A: 

I haven't done a complete refactoring because I want to keep enough of your original code intact that it's not too confusing. The major change is introduction of a Game class that handles parsing. The implementation of this class can be much improved, but it works without changing your code too much. Also, some minor points:

  • Instead of File.new, read the file using File.open and give it a block that takes the file parameter. The file is automatically closed at the end of the block.

  • Use a += 1 instead of a = a + 1.
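
As a small, self-contained illustration of both points, here is a sketch that counts header and score lines in a PGN file. The helper name count_line_kinds and the throwaway sample file are assumptions made for this example, not part of your program.

```ruby
require 'tempfile'

# Hypothetical helper: counts header lines (starting with '[') and all
# other non-blank lines in a PGN file.
def count_line_kinds(path)
  counts = Hash.new(0)            # missing keys default to 0
  File.open(path) do |file|       # the file is closed when the block exits
    file.each_line do |line|
      next if line.strip.empty?
      kind = line.start_with?('[') ? :header : :score
      counts[kind] += 1           # shorthand for counts[kind] = counts[kind] + 1
    end
  end
  counts
end

# Throwaway file with made-up contents, just to exercise the helper
sample = Tempfile.new('sample_pgn')
sample.write("[White \"Troy\"]\n[Black \"jemptymethod\"]\n\n1. e4 e5 1/2-1/2\n")
sample.close
counts = count_line_kinds(sample.path)
puts counts.inspect
sample.unlink
```

To take the filename from the command line instead of hard-coding it, something like count_line_kinds(ARGV.fetch(0, 'jemptymethod.pgn')) should work. It falls back to the hard-coded name when no argument is given.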

I made up a simple notation and wrote a parser for handling the play-by-play details of a tennis match. You might want to look at that code for an example of parsing game moves. It's actually very similar to what you're doing. The bulk of the code is in the /lib directory. Parsing logic is in parser.rb and game components are in the other files. I'd encourage you to break your chess games up in a similar way, by adding a Move class.
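
To give a rough idea of what a Move class might look like, here is a minimal sketch. The field names and the parse_all method are illustrative assumptions, and the parsing is simplified: it assumes the score contains no comments, variations, or annotations.

```ruby
# Hypothetical Move class; names and fields are illustrative only.
class Move
  attr_reader :number, :white, :black

  def initialize(number, white, black = nil)
    @number = number
    @white  = white
    @black  = black # nil when the game ended after White's move
  end

  # Parses movetext such as "1. e4 e5 2. f4 exf4 1/2-1/2" into Move objects.
  def self.parse_all(score)
    # Drop the trailing result token (1-0, 0-1, 1/2-1/2 or *), if any
    movetext = score.sub(/\s*(?:1-0|0-1|1\/2-1\/2|\*)\s*\z/, '')
    movetext.scan(/(\d+)\.\s+(\S+)(?:\s+(\S+))?/).map do |number, white, black|
      new(number.to_i, white, black)
    end
  end

  def to_s
    ["#{number}.", white, black].compact.join(' ')
  end
end

moves = Move.parse_all("1. e4 e5 2. f4 exf4 3. Nf3 Be7 4. Bc4 Nf6 5. Qe2 d5 6. exd5 Nxd5 7. O-O Be6 8.\nd4 Nc6 1/2-1/2")
puts moves.first.white
puts moves.last
```

With moves as objects, indexing on something other than the first move becomes a matter of asking each game for moves[n].white or moves[n].black rather than juggling token positions.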

Anyway, here's my half-refactoring of your code:

class Game
  attr_accessor :header, :score, :moves

  def initialize
    @header = ""
    @score  = ""
    @moves  = []
  end

  def first_move
    moves_index.key(moves[0]) # Hash#key; Hash#index has been deprecated since Ruby 1.9
  end

  def moves_index
    moves_index = {}
    score.split(/\s+/).each_with_index do |move,i|
      if i%3 != 0
        unless moves_index.has_key?(move)
          moves_index[move] = moves_index.keys.length
        end
        moves << moves_index[move]
      end
    end
    moves_index
  end
end

games     = []
is_header = false
is_score  = false

File.open("jemptymethod.pgn") do |file|
  while (line = file.gets)
    if !line.chomp.empty?
      if !is_score && !is_header
        game = Game.new
      end
      if line[0,1] == '['
        is_header = true
        game.header << line
      else
        is_score = true
        game.score << line
      end
    elsif is_score
      is_score = false
      is_header = false
      games << game
    end
  end
end

puts "# Games: " + games.length.to_s
first_moves = {}

#the following output should essentially be lossless
#with the possible exception of beginning or ending newlines
#
#puts gm.header + "\n"
#puts gm.score + "\n"
games.each do |gm|
  if !first_moves.has_key?(gm.first_move)
    first_moves[gm.first_move] = 1
  else
    first_moves[gm.first_move] += 1
  end
end

# sorting hashes by value: http://nhw.pl/wp/2007/06/11/sorting-hash-by-values
first_moves.sort{|a,b| -1*(a[1]<=>b[1])}.each{|k,v|
  puts "1. #{k} occurred #{v} times" 
}
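
As a possible alternative to the comparison block above, Enumerable#sort_by can express the same descending sort a little more directly. This is a sketch with made-up counts standing in for the first_moves hash.

```ruby
# Made-up counts standing in for the first_moves hash built above
first_moves = { 'e4' => 2, 'd4' => 1, 'c4' => 5 }

# sort_by yields [key, value] pairs; negating the count sorts descending
sorted = first_moves.sort_by { |_move, count| -count }

sorted.each do |move, count|
  puts "1. #{move} occurred #{count} times"
end
```

Like sort, sort_by returns an array of [move, count] pairs rather than a hash, so the each loop is unchanged.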
Alex Reisner
I accept this one because it produces the same output as my original program.
George Jempty