views:

279

answers:

4

Hello!

I am quite new to ruby but enjoying it so far quite immensely. There are some things that have given me some trouble and the following is no exception.

What I am trying to do here is create a sort of 'super-directory' by sub-classing 'Dir'. I've added a method called 'subdirs' that is designed to list the directory object's files and push them into an array if the file is a directory itself. The issue is, the results from my test (File.directory?) is strange - here is my method code:

  def subdirs
    subdirs = Array.new
    self.each do |x|
      puts "Evaluating file: #{x}"
      if File.directory?(x)
        puts "This file (#{x}) was considered a directory by File.directory?"
        subdirs.push(x)
        #yield(x) if block_given?
      end
    end
    return subdirs
  end

And strangely, even though there are plenty of directories in the directory I've chosen ("/tmp") - the result of this call only lists "." and ".."

puts "Testing new Directory custom class: FileOps/DirClass"

nd   = Directory.new("/tmp")
subs = nd.subdirs

And the results:

Evaluating file: mapping-root
Evaluating file: orbit-jvxml
Evaluating file: custom-directory
Evaluating file: keyring-9x4JhZ
Evaluating file: orbit-root
Evaluating file: .
This file (.) was considered a directory by File.directory?
Evaluating file: .gdmFDB11U
Evaluating file: .X0-lock
Evaluating file: hsperfdata_mishii
Evaluating file: .X11-unix
Evaluating file: .gdm_socket
Evaluating file: ..
This file (..) was considered a directory by File.directory?
Evaluating file: .font-unix
Evaluating file: .ICE-unix
Evaluating file: ssh-eqOnXK2441
Evaluating file: vesystems-package
Evaluating file: mapping-jvxml
Evaluating file: hsperfdata_tomcat
A: 

Are you executing the script from within /tmp? My guess (I haven't tried this) is that File.directory?(x) is testing to see if there's a directory named x in the current directory -- so, if you're running this from another directory, you'll always find . and .. but not the other directories.

Try changing File.directory?(x) to File.directory?("#{path}/#{x}")

Mark Westling
Mark, thank you - this is absolutely correct. I had thought that the directory object once instantiated as a particular directory would assume that location .. but nothing I know of works this way, I'm not sure why I assumed this would that way either. Oh well :0) Thanks!
Matt1776
A: 

I've made a few minor changes...

class Directory < Dir
  def subdirs
    subdirs = []
    each do |x|
      puts "Evaluating file: #{x}"
      if File.directory? File.join path, x
        puts "This file (#{x}) was considered a directory by File.directory?"
        subdirs << x
        #yield(x) if block_given?
      end
    end
    subdirs
  end
end
puts "Testing new Directory custom class: FileOps/DirClass"

nd   = Directory.new "/tmp"
puts subs = nd.subdirs
DigitalRoss
As I mentioned above, I think readability is important. Removing the explicit ( ) functional references is a bad idea, it introduces ambiguity at least to the casual observer. Many times people need to read code quickly to get the gist of it, and this is hard to do without clear delimiters ... File.join is definitely the way to go though ;0)
Matt1776
A: 

Replace * with whatever path you want and you're good to go. Glob gets you all the files in some path using bash globbing so you can use * and ** as well as ranges, etc. Pretty sweet.

The select works like the opposite of reject, cherry-picking only the values that are true within the select block.

Dir.glob("*").select {|f| File.directory?(f) }

Chuck Vose
Nice, I like this. Thank you.
Matt1776
+2  A: 

Mark Westling has already answered your immediate question, but since you mention you are new to Ruby, here are a couple of other style suggestions:

def subdirs
  subdirs = Array.new

It is generally preferred to use literal syntax where possible, so the common way to express this would be subdirs = [].

  self.each do |x|

self is the implicit receiver in Ruby, so you can just leave it off. (This is not Python, after all.) However, the primary purpose of code is communication, so if you believe, this communicates your intent better, leave it in.

Speaking of communication: x is not a terribly communicative name, unless you are talking about points in a cartesian coordinate system. How about file, or if you are uncomfortable with the Unix notion that directories are also files, the more neutral entry? (Actually, the best one would probably be path, but we will see soon how that might become confusing.)

My third suggestion is actually personal preference that is contrary to common Ruby style: common Ruby style dictates that one-line blocks are delimited with {/} and multiline blocks are delimited with do/end, just like you did. I don't like that, because it is an arbitrary distinction that doesn't convey any meaning. (Remember that "communication" thing?) So, I actually do things differently: if the block is imperative in nature and, for example, changes some global state, I use the keywords (because the block is actually doing some work) and if the block is functional in nature and just returns a new object, I prefer to use braces (because they look nicely mathematical). So, in this case, I would use braces.

    if File.directory?(x)

As Mark already explained, you need to do something like File.directory?(File.join(path, entry)) here, where Dir#path is a public attribute of the Dir class returning the path that Dir.new was initialized with.

Here you also see, why we didn't use path as the name of the block parameter. Otherwise we would need to write File.directory?(File.join(self.path, path)).

      subdirs.push(x)

The canonical way to append an element to an array, or indeed append pretty much anything to anything in Ruby, is the << operator. So, this should read subdirs << entry.

Array#push is an alias of Array#<<, which is mainly intended to let you use an Array as a stack (in conjunction with Array#pop).

    end
  end
  return subdirs

Here is another disconnect between my personal style and common Ruby style: in Ruby, if there is no explicit return, the return value is simply the value of the last expression. This means, you can leave off the return keyword and common Ruby style says you should. I, however, prefer to use this similar to the block delimiters: use return for methods that are functional in nature (because they actually "return" a value) and no return for imperative methods (because their real return value is not what comes after the return keyword, but what side-effects they have on the environment). So, like you, I would use a return keyword here, despite common style.

It is also customary, to seperate the return value from the rest of the method body by a blank line. (The same goes for setup code, by the way.)

end

So, this is where we stand right now:

def subdirs
  subdirs = []

  each { |entry|
    if File.directory?(File.join(path, entry))
      subdirs << entry
    end
  }

  return subdirs
end

As you can see, the if expression really only serves to skip one iteration of the loop. This is much better communicated by using the next keyword:

def subdirs
  subdirs = []

  each { |entry|
    next  unless File.directory?(File.join(path, entry))
    subdirs << entry
  }

  return subdirs
end

As you can see, we managed to remove one level of nesting from the block structure, which is always a good sign.

This idiom is called a "guard clause", and is quite popular in functional programming, where a lot of languages even have guard constructs built in, but it is also used heavily in pretty much any other language on the planet, because it greatly simplifies control flow: the idea is analogous to a guard posted outside a castle: instead of letting the bad guys into the castle (method, procedure, function, block, ...) and then painfully trying to track their every move and constantly being afraid to miss something or lose them, you simply post a guard at the entrance of your castle (the beginning of your method, block, ...) who doesn't let them in to begin with (which jumps to the end of the procedure, returns early from the method, skips one iteration of the loop, ...). In Ruby, you can use raise, return, next and break for this. In other languages, even GOTO is fine (this is one of those rare cases, where a GOTO can actually clear up control flow).

However, we can simplify this even further, by recognizing the iterator pattern: you take a list (the directory entries) and then you "squash" that list down to a single object (the subdirs array). To a category theorist, this screams "catamorphism", to a hardcore functional programmer "fold", to a softcore functional programmer "reduce", to a Smalltalk programmer "inject:into:" and to a Rubyist "Enumerable#inject":

def subdirs
  return inject([]) { |subdirs, entry|
    next subdirs  unless File.directory?(File.join(path, entry))
    subdirs << entry
  }
end

inject uses the return value of the previous iteration to seed the next one, that's why we have to return subdirs, even if we are skipping an iteration, by using next subdirs instead of plain next (which would return nil, so that on the next iteration subdirs would be nil and subdirs << entry would raise a NoMethodError.)

(Notice that I used curly braces for the block, even though the block actually modifies its argument. I feel this is still a "functional" block, though. YMMV.)

The last thing we can do is to recognize that what we are doing is just filtering (or in other words "selecting") elements of an array. And Ruby already has a method built in, which does exactly that: Enumerable#select. Witness the single-line awesomeness that using the full power of Ruby generates:

def subdirs
  return select { |entry| File.directory?(File.join(path, entry)) }
end

As a general tip: learn the wonders of Enumerable. It is the workhorse of Ruby programming, similar to IEnumerable<T> in .NET, dicts in Python, lists in functional languages or associative arrays in PHP and Perl.

Jörg W Mittag
Interesting. I left the self reference and the return reference for readability as you suggested, everything else seems like a good suggestion, though I think the traditional ruby block delimiters improve readability. Also the one liner is a "neat trick" but again I think I might use this enumerator over a few lines. I am partial to making things compact, yet experience has shown me how ill-advised this really is. I dont like when people come to me with questions about my code that I'd thought was 'obvious' ;0)
Matt1776
I suppose this isn't that unreadable, now looking at the last example .. I think I'm still acclimating to ruby.
Matt1776
Ok I'm not sure I understand enumerable, the each method of Dir is a method of enumerable and we are calling select on it? That doesn't sound right .. this is a little ambiguous don't you think? Explicit is always better than implicit - I don't work by myself, do you? ;0)
Matt1776