Hi!

I'm developing a file upload with JSF. The application saves three pieces of data about the file:

  • Filename
  • Bytes
  • Content-Type as submitted by the browser.

My problem is that some files are saved with content type = application/octet-stream even if they are *.doc or *.pdf files.

When does the browser submit such a content type?
I would like to clean up the database, so I need to know when the browser's information is incorrect.

+1  A: 

It depends on the OS, the browser, and how the user has configured them. It's based on the way the browser determines the file type of local files (to display them). On most OS/browser combinations this is based on the file's extension, but on some it may be determined by other means (e.g. on Mac OS).

In any case, you shouldn't really rely on the Content-type sent by the browser. The best approach would be to actually look at the contents of the file. You could probably also use the filename, but keep in mind that browsers aren't necessarily going to be good about telling you that either (though it's probably still a lot more reliable than the Content-type they send).
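As a rough illustration of "looking at the contents", here is a minimal sketch that checks the first bytes of an upload against a few well-known file signatures. The class name ContentSniffer, the Kind enum and the sniff method are made up for this example; only the signatures themselves are standard (PDF files normally begin with "%PDF-", legacy Office files such as .doc use the OLE2 compound file header, and .docx/.xlsx are ZIP archives). It only narrows things down to a container format, so it is a starting point rather than a full detector.

```java
// Hypothetical helper: classifies uploaded bytes by their leading signature.
final class ContentSniffer {

    // Coarse result; OLE2 and ZIP are containers, not exact document types.
    enum Kind { PDF, OLE2, ZIP, UNKNOWN }

    // "%PDF-" in ASCII.
    private static final byte[] PDF_MAGIC = {0x25, 0x50, 0x44, 0x46, 0x2D};

    // OLE2 compound file header (legacy .doc, .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // "PK" 0x03 0x04: ZIP local file header (.docx, .xlsx, plain .zip).
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    private ContentSniffer() {}

    // Returns the first matching kind, or UNKNOWN if nothing matches.
    static Kind sniff(byte[] content) {
        if (startsWith(content, PDF_MAGIC)) return Kind.PDF;
        if (startsWith(content, OLE2_MAGIC)) return Kind.OLE2;
        if (startsWith(content, ZIP_MAGIC)) return Kind.ZIP;
        return Kind.UNKNOWN;
    }

    // True if data is non-null and begins with all bytes of prefix.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

For the database cleanup, something like this could be run over the stored bytes of every row whose content type is application/octet-stream, with the filename extension used to tell .doc from .xls inside the OLE2 group.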

Laurence Gonsalves
+1  A: 

Ignore the value sent by the browser. This is indeed dependent on the client platform, browser and configuration used.

If you want full control over content types based on the file extension, it is better to determine the type yourself using ServletContext#getMimeType().

String mimeType = servletContext.getMimeType(filename);

The default mime types are defined in the web.xml of the servletcontainer in question. In Tomcat, for example, it's located in /conf/web.xml. You can extend/override it in the webapp's /WEB-INF/web.xml as follows:

<mime-mapping>
    <extension>xlsx</extension>
    <mime-type>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</mime-type>
</mime-mapping>

You can also determine the mime type based on the actual file content (because the file extension is not necessarily accurate; the client can fake it), but this is a lot of work. Consider using a 3rd party library to do all the work. I've found JMimeMagic useful for this. You can use it as follows:

String mimeType = Magic.getMagicMatch(file, false).getMimeType();

Note that it doesn't detect all mimetypes equally reliably. You can also consider a combination of both approaches. E.g. if one returns null or application/octet-stream, use the other. Or if both return a different but "valid" mimetype, prefer the one returned by JMimeMagic.
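The combination rule could be sketched roughly like this. MimeTypeResolver and resolve are hypothetical names for this example; the two arguments are assumed to be whatever the extension-based lookup and the content-based lookup returned (either may be null). The helper has no dependency on the servlet API or JMimeMagic, so you can feed it results from any detector.

```java
// Hypothetical helper: picks between an extension-based and a content-based guess.
final class MimeTypeResolver {

    // Generic type that carries no real information about the file.
    private static final String GENERIC = "application/octet-stream";

    private MimeTypeResolver() {}

    // byExtension: e.g. result of the extension lookup, may be null.
    // byContent:   e.g. result of the content-based detector, may be null.
    static String resolve(String byExtension, String byContent) {
        if (isSpecific(byContent)) {
            // Content-based result wins, also when both are specific but differ.
            return byContent.trim();
        }
        if (isSpecific(byExtension)) {
            return byExtension.trim();
        }
        // Neither source knows better than the generic type.
        return GENERIC;
    }

    // A type is "specific" if it is non-blank and not the generic fallback.
    private static boolean isSpecific(String mimeType) {
        if (mimeType == null) return false;
        String trimmed = mimeType.trim();
        return !trimmed.isEmpty() && !GENERIC.equalsIgnoreCase(trimmed);
    }
}
```

A call would then look like MimeTypeResolver.resolve(servletContext.getMimeType(filename), detectedType), where detectedType is the content-based result with any detection failure mapped to null.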

Oh, I almost forgot to add, in JSF you can obtain the ServletContext as follows:

ServletContext servletContext = (ServletContext) FacesContext.getCurrentInstance().getExternalContext().getContext();
BalusC