GOAL My goal is to find a text file or library that maps a given MIME type to a nice, human-friendly description.

For example given the mime type for Word (as shown below) I would like a result that is something like "Microsoft Office Word Document".

application/vnd.openxmlformats-officedocument.wordprocessingml.document

I realise I could compile my own list and use something like a Map (Java), but it would not be comprehensive.

SIMPLISTIC OPTION I know I can take the subtype and keep only its last component, but that is not very sophisticated: for the Word MIME type above, the result would be a very generic "document". I could keep more components, but the result would still be quite ugly.
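A minimal sketch of this simplistic approach, to show why the result is so generic. The class and method names (SubtypeLabel, lastComponent) are hypothetical and only for illustration:

```java
class SubtypeLabel {
    // Hypothetical helper: drops the primary type, then keeps only the
    // last dot-separated component of the subtype.
    static String lastComponent(String mimeType) {
        String subtype = mimeType.substring(mimeType.indexOf('/') + 1);
        return subtype.substring(subtype.lastIndexOf('.') + 1);
    }

    public static void main(String[] args) {
        // The Word MIME type collapses to the generic "document".
        System.out.println(lastComponent(
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document"));
    }
}
```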

KEY/VALUE FILE Another option I have looked for is a text file of key/value pairs, where the key is the full MIME type and the value is the nice human-friendly text.

text/plain=Plain Text File
application/octet-stream=Unknown binary file

This seems like a nice option, but I have not been able to find a definitive text file with lots of entries. It would also be nice if the file had entries for just the media type (I prefer to call it the primary MIME type), i.e. the "text" in "text/plain", so that an unknown text MIME type such as "text/unknown a.b.c" would return "Unknown text file/format".
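A rough sketch of how such a key/value file could be read with java.util.Properties, including the fallback to the primary type. The class name MimeLabels, the method describe, the bare "text" key, and the final "Unknown file format" default are all assumptions made for illustration; the inline string stands in for a real file, and the lookup is case-sensitive as written:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class MimeLabels {
    private final Properties labels = new Properties();

    // Loads "mime/type=Description" lines in java.util.Properties format.
    MimeLabels(String propertiesText) {
        try {
            labels.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Tries the full MIME type first, then the primary type alone
    // (the "text" in "text/plain"), then a generic default.
    String describe(String mimeType) {
        String exact = labels.getProperty(mimeType);
        if (exact != null) {
            return exact;
        }
        int slash = mimeType.indexOf('/');
        String primary = slash < 0 ? mimeType : mimeType.substring(0, slash);
        return labels.getProperty(primary, "Unknown file format");
    }

    public static void main(String[] args) {
        MimeLabels labels = new MimeLabels(
            "text/plain=Plain Text File\n"
            + "application/octet-stream=Unknown binary file\n"
            + "text=Unknown text file/format\n");
        System.out.println(labels.describe("text/plain"));
        System.out.println(labels.describe("text/unknown"));
    }
}
```

The fallback entries live in the same file as the full entries, so adding a new primary type is a one-line change.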

A: 

Apache Tika supports MimeTypes. It also supports content detection, by the way, if you don't know the MIME type. Anyway, it looks like you need to do:

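String t = "text/plain";
// Look the type up in Tika's default registry; forName throws MimeTypeException for an invalid name
String description = org.apache.tika.mime.MimeTypes.getDefaultMimeTypes().forName(t).getDescription();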

Disclaimer: I didn't actually try it. Also, I don't know if it supports all mime types you need.

Thomas Mueller
Thanks for spotting that. Inside tika-core.jar there's an XML file, tika-mimetypes.xml, which has a lot of MIME types and descriptions defined within it. It looks like it should work... thanks again!
mP
Most of the entries in the XML are ignored because, for some strange reason, Tika sets descriptions from tags called "_comment" but not "description" etc. Going to file an issue/patch.
mP