I am looking for a simple way to get a MIME type when the file extension is incorrect or missing, similar to this question but in .NET.

A: 

This sounds similar to this question.

Greg Hewgill
+14  A: 

In Urlmon.dll, there's a function called FindMimeFromData.

From the documentation

MIME type detection, or "data sniffing," refers to the process of determining an appropriate MIME type from binary data. The final result depends on a combination of server-supplied MIME type headers, file extension, and/or the data itself. Usually, only the first 256 bytes of data are significant.

So, read the first (up to) 256 bytes from the file and pass them to FindMimeFromData.

Steve Morgan
How reliable is this method?
SkippyFire
+32  A: 

I did use urlmon.dll in the end. I thought there would be an easier way but this works. I include the code to help anyone else and allow me to find it again if I need it.

using System;
using System.IO;
using System.Runtime.InteropServices;

...

    // FindMimeFromData takes wide (UTF-16) strings and returns an HRESULT.
    // Pointer-sized arguments are declared as IntPtr so the call also works in 64-bit processes.
    [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
    private static extern int FindMimeFromData(
        IntPtr pBC,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
        [MarshalAs(UnmanagedType.LPArray)] byte[] pBuffer,
        uint cbSize,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
        uint dwMimeFlags,
        out IntPtr ppwzMimeOut,
        uint dwReserved
    );

    public string getMimeFromFile(string filename)
    {
        if (!File.Exists(filename))
            throw new FileNotFoundException(filename + " not found");

        // Only the first 256 bytes are needed for sniffing.
        byte[] buffer = new byte[256];
        int bytesRead;
        // Open read-only so that files without write permission can still be inspected.
        using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
        {
            bytesRead = fs.Read(buffer, 0, buffer.Length);
        }

        IntPtr mimeTypePtr = IntPtr.Zero;
        try
        {
            // Pass the number of bytes actually read, not the buffer capacity.
            int hr = FindMimeFromData(IntPtr.Zero, null, buffer, (uint)bytesRead, null, 0, out mimeTypePtr, 0);
            if (hr != 0 || mimeTypePtr == IntPtr.Zero)
                return "unknown/unknown";
            return Marshal.PtrToStringUni(mimeTypePtr);
        }
        catch (Exception)
        {
            return "unknown/unknown";
        }
        finally
        {
            // Free the returned string even if the conversion above throws.
            if (mimeTypePtr != IntPtr.Zero)
                Marshal.FreeCoTaskMem(mimeTypePtr);
        }
    }
rsg
Thanks for the code!
SkippyFire
How reliable is this?
JL
or, more precisely, which MIME types are supported?
flq
Probably whatever is mapped in the registry.
mkmurray
@flq, @mkmurray http://msdn.microsoft.com/en-us/library/ms775147(VS.85).aspx#Known_MimeTypes
Ahmad
A: 

Use forensictools.net for exact solution

A: 

This returns application/octet-stream for any file (.txt, .mp3, .pdf).

When you say "this returns...", what exactly is "this"?
JL
+3  A: 

You can also look in the registry.

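    using System.IO;
    using Microsoft.Win32;

    string GetMimeType(FileInfo fileInfo)
    {
        string mimeType = "application/unknown";

        // Without an extension there is nothing to look up.
        if (string.IsNullOrEmpty(fileInfo.Extension))
            return mimeType;

        // HKEY_CLASSES_ROOT\.ext may carry a "Content Type" value with the MIME type.
        using (RegistryKey regKey = Registry.ClassesRoot.OpenSubKey(
            fileInfo.Extension.ToLowerInvariant()))
        {
            if (regKey != null)
            {
                object contentType = regKey.GetValue("Content Type");

                if (contentType != null)
                    mimeType = contentType.ToString();
            }
        }

        return mimeType;
    }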

One way or another, you are going to have to tap into a database of MIME types. Whether they are mapped from extensions or from magic numbers is a secondary detail; the Windows registry is one such database. For a platform-independent solution, though, you would have to ship this database with the code (or as a standalone library).
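
To illustrate what shipping such a database with the code could look like, here is a minimal sketch of a magic-number lookup. The class name `MagicNumberSniffer` and its table are hypothetical and not taken from any library; the table lists only a few well-known signatures, and a real one would need to be far larger.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a tiny built-in "database" of magic numbers that
// ships with the code instead of relying on the Windows registry.
public static class MagicNumberSniffer
{
    // Each entry pairs a leading byte signature with a MIME type.
    // Only a handful of well-known signatures are listed here.
    private static readonly List<KeyValuePair<byte[], string>> Signatures =
        new List<KeyValuePair<byte[], string>>
        {
            // "%PDF-"
            new KeyValuePair<byte[], string>(new byte[] { 0x25, 0x50, 0x44, 0x46, 0x2D }, "application/pdf"),
            // PNG file signature
            new KeyValuePair<byte[], string>(new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }, "image/png"),
            // JPEG start-of-image marker
            new KeyValuePair<byte[], string>(new byte[] { 0xFF, 0xD8, 0xFF }, "image/jpeg"),
            // "GIF8"
            new KeyValuePair<byte[], string>(new byte[] { 0x47, 0x49, 0x46, 0x38 }, "image/gif"),
            // "PK" followed by 0x03 0x04 (ZIP local file header)
            new KeyValuePair<byte[], string>(new byte[] { 0x50, 0x4B, 0x03, 0x04 }, "application/zip"),
        };

    // Returns the MIME type of the first matching signature,
    // or application/octet-stream when nothing matches.
    public static string Sniff(byte[] header)
    {
        if (header == null) throw new ArgumentNullException("header");

        foreach (var entry in Signatures)
        {
            if (StartsWith(header, entry.Key))
                return entry.Value;
        }
        return "application/octet-stream";
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

The `header` argument would be the first few bytes read from the file, for example the same 256-byte buffer used for FindMimeFromData.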

Serguei
+1  A: 

I think the right answer is a combination of Steve Morgan's and Serguei's answers. That's how Internet Explorer does it. The P/Invoke call to FindMimeFromData works for only 26 hard-coded MIME types. Also, it can return ambiguous MIME types (such as text/plain or application/octet-stream) even when a more specific, more appropriate MIME type exists. If it fails to give a good MIME type, you can fall back to the registry for a more specific one. The server registry could have more up-to-date MIME types.

Refer to: http://msdn.microsoft.com/en-us/library/ms775147%28VS.85%29.aspx
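
A rough sketch of that combination, assuming the two lookups are passed in as delegates so the fallback logic stays independent of urlmon.dll and the registry. The class `MimeResolver` and its notion of "ambiguous" are my own assumptions, not part of any API; the placeholder values it checks for are the ones returned by the two code samples in this thread.

```csharp
using System;

// Hypothetical helper: try content sniffing first, then fall back to an
// extension-based lookup when the sniffed type is missing or too generic.
public static class MimeResolver
{
    // Placeholder values used by the code samples in this thread.
    private static bool IsUnknown(string mime)
    {
        return string.IsNullOrEmpty(mime)
            || string.Equals(mime, "unknown/unknown", StringComparison.OrdinalIgnoreCase)
            || string.Equals(mime, "application/unknown", StringComparison.OrdinalIgnoreCase);
    }

    // Generic types that may hide a more specific one (an assumption of this sketch).
    private static bool IsAmbiguous(string mime)
    {
        return IsUnknown(mime)
            || string.Equals(mime, "text/plain", StringComparison.OrdinalIgnoreCase)
            || string.Equals(mime, "application/octet-stream", StringComparison.OrdinalIgnoreCase);
    }

    public static string Resolve(Func<string> sniff, Func<string> lookupByExtension)
    {
        string sniffed = sniff();
        if (!IsAmbiguous(sniffed))
            return sniffed;                 // sniffing gave a specific type

        string mapped = lookupByExtension();
        if (!IsUnknown(mapped))
            return mapped;                  // the extension mapping knows better

        // Neither source helped: keep the generic sniffed type if there is one.
        return IsUnknown(sniffed) ? "application/octet-stream" : sniffed;
    }
}
```

With the two methods posted above, a call might look like `MimeResolver.Resolve(() => getMimeFromFile(path), () => GetMimeType(new FileInfo(path)))`, where `path` is the file to inspect.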

Jamey
A: 

What if we need to find the file extension from the MIME type?

Prakash