I'm looking for a surefire way of determining the codec used in an audio or video file. I currently rely on two things: the file extension (the obvious one) and the MIME type reported by running `file -ib` on the file.

This doesn't get me all the way there: loads of formats are 'wrapper' (container) formats that hide the exact codec used within -- for example, '.ogg' files can internally use the Vorbis, Speex, or FLAC codecs. Their MIME type is usually just the generic 'application/ogg' or similar, which doesn't reveal the codec either.
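To make the 'wrapper' problem concrete, here is a minimal Python sketch for the Ogg case only. It assumes the standard Ogg page layout (a 27-byte page header starting with "OggS", followed by a segment table) and checks the identification bytes at the start of the first packet. The function name `sniff_ogg_codec` and the signature table are made up for this illustration, and the table is not exhaustive.

```python
from typing import Optional

# Identification signatures found at the start of the first packet of an
# Ogg logical stream, mapped to a codec name (illustrative, not exhaustive).
OGG_CODEC_SIGNATURES = {
    b"\x01vorbis": "vorbis",
    b"Speex   ": "speex",
    b"\x7fFLAC": "flac",
    b"OpusHead": "opus",
    b"\x80theora": "theora",
}


def sniff_ogg_codec(data: bytes) -> Optional[str]:
    """Return the codec of the first Ogg stream in `data`, or None."""
    # Every Ogg page begins with the capture pattern "OggS".
    if len(data) < 27 or data[:4] != b"OggS":
        return None
    # Byte 26 holds the number of segments; the segment table follows it,
    # and the packet data starts right after the table.
    segment_count = data[26]
    payload = data[27 + segment_count:]
    for signature, codec in OGG_CODEC_SIGNATURES.items():
        if payload.startswith(signature):
            return codec
    return None
```

Reading the first few hundred bytes of a file and passing them in should be enough for this check. Note that it only looks at the first stream: a multiplexed file (say, Theora video plus Vorbis audio) has one such page per stream, so this is not a general solution.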

The `file` program is apparently able to tell me which codec is used, but it returns this as human-readable prose:

kb.ogg: Ogg data, Vorbis audio, stereo, 44100 Hz, ~0 bps

and as such it is dodgy to use programmatically.

What I'm essentially asking is: is there a script out there (any language) that can wade through these wrapper formats and tell me what the meat of the file is made of?

+1  A: 

ffmpeg includes a library called libavformat that can open and demux pretty much any media format. Obviously that's more than you actually need, but I don't think you can find anything else that's quite as complete. I've used it myself with great success. Take a look at this article for an introduction. There are also bindings for these libraries for some common scripting languages, such as Python.

(If you don't want to build something using the library, you can probably use the regular ffmpeg binary.)
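As a rough sketch of the binary route, the following Python snippet shells out to `ffprobe`, the probing tool that ships alongside the ffmpeg binary, and asks for JSON output so nothing has to be scraped from prose. It assumes `ffprobe` is installed and on the PATH; the helper names `parse_codecs` and `probe_codecs` are invented for this example.

```python
import json
import subprocess


def parse_codecs(ffprobe_json: str) -> list:
    """Extract (codec_type, codec_name) pairs from ffprobe's JSON output."""
    streams = json.loads(ffprobe_json).get("streams", [])
    return [(s.get("codec_type"), s.get("codec_name")) for s in streams]


def probe_codecs(path: str) -> list:
    """Run ffprobe on `path` and return the codec of each stream."""
    result = subprocess.run(
        [
            "ffprobe",
            "-v", "error",  # suppress the banner and info messages
            "-show_entries", "stream=codec_type,codec_name",
            "-of", "json",  # machine-readable output
            path,
        ],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if ffprobe fails
    )
    return parse_codecs(result.stdout)
```

Calling `probe_codecs("kb.ogg")` would then return one tuple per stream in the container, which is easier to act on than the one-line description from `file`.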

Emil H
A: 

You can always use your own magic file: copy the pre-installed one, change the description strings it returns so that they are easy for your program to parse, and point `file` at your copy with the `-m` option.

See:

slebetman