I'm making a search engine for our gigantic PHP codebase.

Given a file path, how can I determine with some degree of certainty whether a file is a text file or some other type? I'd prefer not to resort to file extensions (like substr($filename, -3) or something silly): this is a Linux-based filesystem, so anything goes as far as file extensions are concerned.

I'm using RecursiveDirectoryIterator, so I have those methods available too.

A: 

You can invoke the file utility:

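// escapeshellarg() quotes the path so spaces or quotes in a filename cannot break the command
echo shell_exec('file ' . escapeshellarg($file));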

Returns things like:

$ file test.out
test.out: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped

$ file test.cpp
test.cpp: ASCII C program text

$ file test.txt
test.txt: ASCII text

$ file test.php
test.php: PHP script text
meagar
Bill Karwin's suggestion (finfo) uses the same libraries as the file utility. It will be much faster though, because it's a php extension
Evert
@Evert True enough, and definitely the way to go if you have the extension along with PHP >= 5.3
meagar
+1  A: 
if (mime_content_type($path) == "text/plain") {
 echo "I'm a text file";
}
softcr
Yes, but note that it's deprecated as of PHP5.3 in favor of `finfo_file`
Gordon
+7  A: 

Try the finfo_file() function.

Here's a blog describing its usage: Smart File Type Detection Using PHP
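
As a rough illustration of how this could fit the RecursiveDirectoryIterator setup from the question, here is a minimal sketch. It assumes PHP 7 or later with the fileinfo extension enabled. The helper names isTextFile() and findTextFiles() are hypothetical, and treating every MIME type that starts with "text/" as a text file is an assumption you may want to loosen or tighten (PHP scripts, for example, are usually reported as text/x-php rather than text/plain).

```php
<?php
// Hypothetical helper: treat any MIME type beginning with "text/" as text.
function isTextFile(finfo $finfo, string $path): bool
{
    // With FILEINFO_MIME_TYPE this returns e.g. "text/plain", or false on failure.
    $mime = $finfo->file($path);
    return is_string($mime) && strpos($mime, 'text/') === 0;
}

// Hypothetical helper: walk $root recursively and collect the text files found.
function findTextFiles(string $root): array
{
    // Create one finfo instance and reuse it for every file.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    $result = [];
    foreach ($files as $file) {
        if ($file->isFile() && isTextFile($finfo, $file->getPathname())) {
            $result[] = $file->getPathname();
        }
    }
    return $result;
}
```

Calling findTextFiles('/path/to/codebase') would return an array of paths. Reusing a single finfo object avoids reloading the magic database for each file, which may help with speed on a large tree.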

Bill Karwin
Sweet.. a lot slower than I was hoping it'd be, but I guess it'll have to do. Thanks!
Stephen J. Fuhry
A: 

Try mime_content_type(), whose signature is: string mime_content_type ( string $filename )

Hope this is helpful.

William Choi
