How does one reliably determine a file's type? File extension analysis is not acceptable. Is there a Ruby-esque tool similar to the UNIX file(1) command?

This is regarding MIME or content type, not file system classifications, such as directory, file, or socket.

+3  A: 

You could give this a go.

Bobby Jack
From Readme.txt: "The identification of MIME content type is based on a file's filename extensions". OP explicitly requested a method based on content analysis, not filename extension.
Martin Carpenter
+8  A: 

If you're on a Unix machine try this:

mimetype = `file -Ib #{path}`.gsub(/\n/,"")

I'm not aware of any pure Ruby solutions that work as reliably as 'file'.

Edited to add: depending on which OS you are running, you may need to use 'i' instead of 'I' to get file to return a MIME type.
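Interpolating the path into backticks passes it through a shell, so spaces or quotes in the filename can break the command. A minimal sketch of the same idea without a shell follows. It assumes a file(1) that accepts the long option --mime-type (swap in -i or -I via the flag argument if yours does not); the names parse_file_output and mime_type_for are made up for illustration.

```ruby
require 'open3'

# Hypothetical helper: keeps only the type, dropping any "; charset=..." suffix
# and the trailing newline that file(1) prints.
def parse_file_output(output)
  output.to_s.split(';').first.to_s.strip
end

# Hypothetical helper: runs file(1) with an argument list instead of a shell
# string, so the path is never interpreted by a shell.
# Assumption: `file` is on the PATH and understands the given flag.
def mime_type_for(path, flag: '--mime-type')
  out, status = Open3.capture2('file', flag, '-b', '--', path)
  status.success? ? parse_file_output(out) : nil
rescue Errno::ENOENT
  nil # file(1) is not installed
end
```

Calling mime_type_for("some file.pdf") then returns a string such as the type reported by file, or nil if the command is missing or fails.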

Patrick Ritchie
+2  A: 

You could give shared-mime a try (gem install shared-mime-info). It requires the Freedesktop shared-mime-info library, but does both filename/extension checks and "magic" checks. I tried giving it a whirl just now, but I don't have the Freedesktop shared-mime-info database installed and have to do "real work," unfortunately. It might be what you're looking for.

Chris Ingrassia
+14  A: 

There is a ruby binding to libmagic that does what you need. This is available as a gem:

gem install ruby-filemagic

The documentation seems a little thin, but this should get you started:

$ irb 
irb(main):001:0> require 'filemagic' 
=> true
irb(main):002:0> fm = FileMagic.new
=> #<FileMagic:0x7fd4afb0>
irb(main):003:0> fm.file('foo.zip') 
=> "Zip archive data, at least v2.0 to extract"
irb(main):004:0>
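The session above returns a human-readable description. If a MIME type is wanted instead, the gem can be asked for one with the MAGIC_MIME flag. This is only a sketch, assuming ruby-filemagic and libmagic are installed; libmagic_mime is a made-up helper name.

```ruby
# Hypothetical helper: returns libmagic's MIME string for a path.
# Assumption: the ruby-filemagic gem is installed; it is required lazily
# so that merely loading this code does not fail without the gem.
def libmagic_mime(path)
  require 'filemagic'
  fm = FileMagic.new(FileMagic::MAGIC_MIME)
  fm.file(path)
ensure
  fm&.close # release the libmagic handle if it was opened
end
```

The result typically includes a charset part after the type, which can be split off at the semicolon if only the type is needed.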
Martin Carpenter
According to http://grub.ath.cx/filemagic/CHANGELOG this gem doesn't seem to be actively maintained.
Lars Haugseth
A: 

The mime-types gem for Ruby works well.

Qianjigui
This gem uses the file extension to determine the type, not the content.
Lars Haugseth
Thanks for your response. This method is not a good idea.
Qianjigui
A: 

I couldn't get ruby-filemagic to install,

and mimetype = `file -Ib #{path}`.gsub(/\n/,"") gives the same result, "application/x-zip", for both .odt and .docx files, at least on Ubuntu.

I also tried the mimetype_fu gem which basically does a

file --mime -br "filename"

equivalent to the file -Ib from above

Alexis Perrier
This is because odt and docx files are actually zip archives. You'll have to work around this by looking further at the contents of the zip or at the file extension.
hurikhan77
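The "look inside the zip" workaround from the comment above can be sketched in plain Ruby. This is a rough heuristic, not a full zip parser: it assumes an OpenDocument file stores an uncompressed "mimetype" entry first in the archive, and that a docx contains an entry named word/document.xml. The name refine_zip_type is made up for illustration.

```ruby
ZIP_MAGIC = "PK\x03\x04".b

# Hypothetical helper: given the raw bytes of a file, returns a more specific
# type for zip-based documents, "application/zip" for other zips, nil otherwise.
def refine_zip_type(data)
  data = data.b
  return nil unless data.start_with?(ZIP_MAGIC)
  return 'application/zip' if data.bytesize < 30 # too short for a full header

  # Local file header fields (little-endian): compression method at offset 8,
  # compressed size at 18, name length at 26, extra-field length at 28.
  compression, = data[8, 2].unpack('v')
  size, = data[18, 4].unpack('V')
  name_len, extra_len = data[26, 4].unpack('vv')
  name = data[30, name_len]

  # OpenDocument: first entry is "mimetype", stored (method 0), holding the type.
  if name == 'mimetype' && compression.zero?
    return data[30 + name_len + extra_len, size]
  end

  # Heuristic for docx: entry names appear uncompressed in the archive bytes.
  if data.include?('word/document.xml'.b)
    return 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
  end

  'application/zip'
end
```

A call such as refine_zip_type(File.binread(path)) reads the whole file into memory, which is fine for a sketch but may be worth limiting for large archives.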
A: 

I recently found mimetype-fu. It seems to be the easiest reliable solution to get a file's mime type. The only caveat is that on a Windows machine it only uses the file extension; on Linux/OS X/Other Unix-y systems it works great.

heathanderson