Hi all,

I have a web page that can be used to upload files.
Now I need to check that the file type is correct (zip, jpg, pdf, ...).

I can use the MIME type that comes with the request, but I don't trust the user. Say I want to be sure that nobody can upload a .gif file that was renamed to .jpg.
I think that in this case I should inspect the magic number.
I've found a Java library that seems to do what I need: extract the MIME type from the magic number.
Is this a correct solution, or what would you suggest?

UPDATE: I've found the mime-util project and it seems very good and up to date! (Maybe better than Java Mime Magic Library?)
Here is a list of utility projects that can help you extract MIME types.

+2  A: 

Try Java Mime Magic Library

byte[] data = ...
MagicMatch match = Magic.getMagicMatch(data);
String mimeType = match.getMimeType();
sfussenegger
A: 

The activation framework is Sun's answer to this, and you may well already have it on the classpath of your app server.

James B
I tried the activation framework's getContentType() on some .pdf and .xls files, but unfortunately the method always returns 'application/octet-stream'. Only for .txt does it return something like 'text/plain'.
al nik
Actually, getContentType() only maps the file based on its extension and a map of MIME types that you provide. This is not what I'm looking for.
al nik
I agree, that's not what you're looking for!
James B
I'll get me coat...
James B
A: 

Hi,

I'm sure the library posted by @sfussenegger is the best solution, but I do it by hand with the following snippet, which I hope can help you.

DESCONOCIDO("desconocido", new byte[][] {}),
PDF("PDF", new byte[][] { { 0x25, 0x50, 0x44, 0x46 } }),
JPG("JPG", new byte[][] { { (byte) 0xff, (byte) 0xd8, (byte) 0xff, (byte) 0xe0 } }),
RAR("RAR", new byte[][] { { 0x52, 0x61, 0x72, 0x21 } }),
GIF("GIF", new byte[][] { { 0x47, 0x49, 0x46, 0x38 } }),
PNG("PNG", new byte[][] { { (byte) 0x89, 0x50, 0x4e, 0x47 } }),
ZIP("ZIP", new byte[][] { { 0x50, 0x4b } }),
TIFF("TIFF", new byte[][] { { 0x49, 0x49 }, { 0x4D, 0x4D } }),
BMP("BMP", new byte[][] { { 0x42, 0x4d } });

Regards.

PS: The best part is that it doesn't have any dependencies.
PS2: No warranty about its correctness!
PS3: "desconocido" means "unknown" in Spanish.
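
The snippet above only lists the signatures, so here is a minimal sketch of how they might be wrapped in a complete enum and matched against the first bytes of an upload. The enum name `FileSignature` and the `detect` and `startsWith` methods are hypothetical names chosen for illustration, not part of the original snippet. One deliberate deviation: the JPG signature is shortened to `FF D8 FF`, because the four-byte form ending in `0xE0` would miss JPEGs whose fourth byte differs (Exif files use `0xE1`, for example).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical enum name; constants and signatures follow the snippet above.
enum FileSignature {
    PDF("PDF", new byte[][] { { 0x25, 0x50, 0x44, 0x46 } }),
    // Assumption: only FF D8 FF is checked so that non-JFIF JPEGs also match.
    JPG("JPG", new byte[][] { { (byte) 0xff, (byte) 0xd8, (byte) 0xff } }),
    RAR("RAR", new byte[][] { { 0x52, 0x61, 0x72, 0x21 } }),
    GIF("GIF", new byte[][] { { 0x47, 0x49, 0x46, 0x38 } }),
    PNG("PNG", new byte[][] { { (byte) 0x89, 0x50, 0x4e, 0x47 } }),
    ZIP("ZIP", new byte[][] { { 0x50, 0x4b } }),
    TIFF("TIFF", new byte[][] { { 0x49, 0x49 }, { 0x4D, 0x4D } }),
    BMP("BMP", new byte[][] { { 0x42, 0x4d } }),
    DESCONOCIDO("desconocido", new byte[][] {});

    private final String label;
    private final byte[][] signatures;

    FileSignature(String label, byte[][] signatures) {
        this.label = label;
        this.signatures = signatures;
    }

    String label() {
        return label;
    }

    // Returns the first type with a signature that is a prefix of header,
    // or DESCONOCIDO when nothing matches.
    static FileSignature detect(byte[] header) {
        for (FileSignature type : values()) {
            for (byte[] signature : type.signatures) {
                if (startsWith(header, signature)) {
                    return type;
                }
            }
        }
        return DESCONOCIDO;
    }

    // Reads at most 8 bytes from the stream and classifies them.
    static FileSignature detect(InputStream in) {
        byte[] buffer = new byte[8];
        int filled = 0;
        try {
            int n;
            while (filled < buffer.length
                    && (n = in.read(buffer, filled, buffer.length - filled)) != -1) {
                filled += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return detect(Arrays.copyOf(buffer, filled));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, a GIF renamed to .jpg is still reported as GIF, since only the content is inspected. Note that short signatures are coarse: "PK" also matches other ZIP-based containers such as .jar or .docx, so an extra check may be needed if those must be rejected.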

ATorras
A: 

A standard Java SE solution is URLConnection#guessContentTypeFromStream(). It is, however, somewhat poorly located in the API (I would have placed it somewhere in java.io rather than in java.net). It also doesn't support many content types (only the Java-defined formats, some text formats, a bunch of image formats and a few audio/video formats, but certainly no zip or pdf). So I would say the aforementioned Java Mime Magic Library is certainly the way to go.
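
For reference, a minimal sketch of calling it. The class name `StreamTypeGuess` and the helper `guess` are hypothetical; the only JDK call involved is the static `URLConnection.guessContentTypeFromStream(InputStream)`. The method needs a stream that supports mark/reset and returns null otherwise, so an upload stream would typically be wrapped in a `BufferedInputStream` first.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
class StreamTypeGuess {

    // Returns the guessed content type, or null when the JDK cannot tell.
    static String guess(byte[] content) {
        // ByteArrayInputStream supports mark/reset, which the JDK method requires.
        try (InputStream in = new ByteArrayInputStream(content)) {
            return URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] gifHeader = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        byte[] pdfHeader = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);

        // The GIF header is recognised from its content alone.
        System.out.println(guess(gifHeader));
        // PDF is not among the supported formats, so no type is expected here.
        System.out.println(guess(pdfHeader));
    }
}
```

Because the guess is based on the leading bytes rather than the file name, a GIF renamed to .jpg is still reported as image/gif, but formats outside the built-in list need one of the other approaches in this thread.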

BalusC