I'm building a specialized pipeline in which every step takes one file as input and creates a different file as output. The files are not all in the same directory, every output file is in a different format, and because I'm using several different programs, each step needs its own handling to appease the program involved. This has led to some complicated file management in my code, and the more I try to organize the file directories, the uglier it gets. Just about every class contains code like the following:

@fileName = File.basename(file)
@dataPath = "#{$path}/../data/"

MzmlToOther.new("mgf", "#{@dataPath}/spectra/#{@fileName}.mzML", 1, false).convert

system("wine readw.exe --mzXML #{@file}.raw #{$path}../data/spectra/#{File.basename(@file + ".raw", ".raw")}.mzXML 2>/dev/null")

fileName = "#{$path}../data/" + parts[0] + parts[1][6..parts[1].length-1].chomp(".pep.xml")

Is there some sort of design pattern, Ruby gem, or something else to clean this up? I like writing clean code, so this is really starting to bother me.

+1  A: 

You could use a Makefile.

Make is essentially a DSL for converting one type of file to another by running an external program. As an added bonus, it performs only the steps necessary to incrementally update your output when some of the source files change.

If you really want to use Ruby, try a Rakefile. Rake will do this, and it's still Ruby.
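A minimal sketch of what a Rake rule for one conversion step could look like. The `data/raw` and `data/spectra` layout and the `spectra` task name are assumptions for illustration; the `readw.exe` invocation is copied from the question. The `require "rake"` line is only needed when the file is loaded outside the `rake` command.

```ruby
# Rakefile (a sketch; the directory layout below is assumed, not from the question)
require "rake"

# Every .raw input, and the .mzXML output that should exist for each one.
RAW_FILES   = FileList["data/raw/*.raw"]
MZXML_FILES = RAW_FILES.pathmap("data/spectra/%n.mzXML")

# Any *.mzXML target is built from the .raw file with the same base name.
# "%n" is Rake's pathmap placeholder for the file name without its extension.
rule ".mzXML" => "data/raw/%n.raw" do |t|
  mkdir_p File.dirname(t.name)
  sh "wine", "readw.exe", "--mzXML", t.source, t.name
end

desc "Convert every .raw file whose .mzXML output is missing or stale"
task spectra: MZXML_FILES

task default: :spectra
```

With a layout like this, `rake spectra` would rebuild only the outputs that are missing or older than their source file, and the path arithmetic lives in one rule instead of in every class.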

Borealid
As I mentioned, this is a *specialized* pipeline. There's a lot more going on than what I think a Makefile alone could do.
Jesse J
@Jesse J Since a Makefile can run arbitrary external scripts, I really really doubt you have something it can't do. Since, you know, it's Turing-complete. But my answer also mentions Rake - which lets you have arbitrary Ruby code interspersed into your targets.
Borealid
A: 

You can make this as sophisticated as you want, but this basic script matches a file suffix to a method, which you can then call with the file path.

# a conversion method can be used for each file type if you want to
# make the code more readable or if you need to rearrange filenames.
def htm_convert file
  "HTML #{file}"
end

# file suffix as key, lambda as value, the last uses an external method
routines = {
  :log => lambda {|file| puts "LOG #{file}"},
  :rb => lambda {|file| puts "RUBY #{file}"},
  :haml => lambda {|file| puts "HAML #{file}"},
  :htm => lambda {|file| puts htm_convert(file) }
}

# this loops recursively through the directory and subfolders
Dir['**/*.*'].each do |f|
  next unless File.file?(f) # skip directories whose names contain a dot
  suffix = File.extname(f).delete(".")
  if (routine = routines[suffix.to_sym])
    routine.call(f)
  else
    puts "UNPROCESSED -- #{f}"
  end
end
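
The handlers above only receive the input path; the output path still has to be built somewhere. One way to keep that in a single place is a small path helper, sketched below. The class name `PipelinePaths`, the `LAYOUT` table, and the directory names are all hypothetical and would need to match the real pipeline.

```ruby
require "pathname"

# Hypothetical helper: the one place that knows where each kind of file lives.
class PipelinePaths
  # kind => [subdirectory under the data root, extension]; an assumed layout
  LAYOUT = {
    raw:   ["raw", ".raw"],
    mzxml: ["spectra", ".mzXML"],
    mzml:  ["spectra", ".mzML"],
    mgf:   ["spectra", ".mgf"]
  }.freeze

  def initialize(data_root)
    @data_root = Pathname.new(data_root)
  end

  # Path for a sample name and file kind, returned as a Pathname.
  def path_for(sample, kind)
    dir, ext = LAYOUT.fetch(kind)
    @data_root.join(dir, "#{sample}#{ext}")
  end

  # Sample name for any path: the base name without its last extension.
  def sample_name(path)
    File.basename(path.to_s, ".*")
  end
end
```

A handler could then call something like `paths.path_for(paths.sample_name(f), :mgf)` instead of concatenating strings, so a change to the directory layout touches only `LAYOUT`.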
Joc