I am looking for a simple way to get a mime type where the file extension is incorrect or not given, something similar to this question only in .Net.
views:
11339answers:
8Using .NET, how can you find the mime type of a file based on the file signature not the extension.
In Urlmon.dll, there's a function called FindMimeFromData.
From the documentation
MIME type detection, or "data sniffing," refers to the process of determining an appropriate MIME type from binary data. The final result depends on a combination of server-supplied MIME type headers, file extension, and/or the data itself. Usually, only the first 256 bytes of data are significant.
So, read the first (up to) 256 bytes from the file and pass it to FindMimeFromData.
I did use urlmon.dll in the end. I thought there would be an easier way but this works. I include the code to help anyone else and allow me to find it again if I need it.
using System.Runtime.InteropServices;
...
[DllImport(@"urlmon.dll", CharSet = CharSet.Auto)]
private extern static System.UInt32 FindMimeFromData(
System.UInt32 pBC,
[MarshalAs(UnmanagedType.LPStr)] System.String pwzUrl,
[MarshalAs(UnmanagedType.LPArray)] byte[] pBuffer,
System.UInt32 cbSize,
[MarshalAs(UnmanagedType.LPStr)] System.String pwzMimeProposed,
System.UInt32 dwMimeFlags,
out System.UInt32 ppwzMimeOut,
System.UInt32 dwReserverd
);
public string getMimeFromFile(string filename)
{
if (!File.Exists(filename))
throw new FileNotFoundException(filename + " not found");
byte[] buffer = new byte[256];
using (FileStream fs = new FileStream(filename, FileMode.Open))
{
if (fs.Length >= 256)
fs.Read(buffer, 0, 256);
else
fs.Read(buffer, 0, (int)fs.Length);
}
try
{
System.UInt32 mimetype;
FindMimeFromData(0, null, buffer, 256, null, 0, out mimetype, 0);
System.IntPtr mimeTypePtr = new IntPtr(mimetype);
string mime = Marshal.PtrToStringUni(mimeTypePtr);
Marshal.FreeCoTaskMem(mimeTypePtr);
return mime;
}
catch (Exception e)
{
return "unknown/unknown";
}
}
You can also look in the registry.
using System.IO;
using Microsoft.Win32;
string GetMimeType(FileInfo fileInfo)
{
string mimeType = "application/unknown";
RegistryKey regKey = Registry.ClassesRoot.OpenSubKey(
fileInfo.Extension.ToLower()
);
if(regKey != null)
{
object contentType = regKey.GetValue("Content Type");
if(contentType != null)
mimeType = contentType.ToString();
}
return mimeType;
}
One way or another you're going to have to tap into a database of MIMEs - whether they're mapped from extensions or magic numbers is somewhat trivial - windows registry is one such place. For a platform independent solution though one would have to ship this DB with the code (or as a standalone library).
I think the right answer is a combination of Steve Morgan's and Serguei's answers. That's how Internet Explorer does it. The pinvoke call to FindMimeFromData works for only 26 hard-coded mime types. Also, it will give ambigous mime types (such as text/plain or application/octet-stream) even though there may exist a more specific, more appropriate mime type. If it fails to give a good mime type, you can go to the registry for a more specific mime type. The server registry could have more up-to-date mime types.
Refer to: http://msdn.microsoft.com/en-us/library/ms775147%28VS.85%29.aspx