I want to get a MIME Content-Type from a given extension (preferably without accessing the physical file). I have seen some questions about this, and the methods described for doing it can be summarized as:

  1. Use registry information.
  2. Use urlmon.dll's FindMimeFromData.
  3. Use IIS information.
  4. Roll your own MIME mapping function. Based on this table, for example.

I've been using no. 1 for some time, but I realized that the information provided by the registry is not consistent and depends on the software installed on the machine. Some extensions, such as .zip, often have no Content-Type specified.
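For reference, the registry approach (no. 1) can be sketched roughly as below. This is only an illustration: the class name RegistryMime and the method Lookup are made up for this example, and it assumes the value lives under HKEY_CLASSES_ROOT\<.ext> in a "Content Type" entry, which is exactly the per-machine data that may be missing.

```csharp
using System;
using Microsoft.Win32;

// Hypothetical helper, not part of any library.
public static class RegistryMime
{
    // Returns the "Content Type" value stored under HKEY_CLASSES_ROOT\<.ext>,
    // or null when the extension is empty, the key or value is missing,
    // or the platform is not Windows.
    public static string Lookup(string extension)
    {
        if (string.IsNullOrEmpty(extension))
            return null;
        if (Environment.OSVersion.Platform != PlatformID.Win32NT)
            return null;

        string keyName = extension.StartsWith(".", StringComparison.Ordinal)
            ? extension
            : "." + extension;

        using (RegistryKey key = Registry.ClassesRoot.OpenSubKey(keyName))
        {
            // key is null when the extension is not registered at all.
            return key == null ? null : key.GetValue("Content Type") as string;
        }
    }
}
```

Whatever this returns depends on the installed software, so a null result has to be expected and handled by the caller.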

Solution no. 2 forces me to have the file on disk so that its first bytes can be read, which is slow but may give good results.

The third method is based on Directory Services, which I don't like much because I have to add COM references, and I'm not sure it's consistent between IIS6 and IIS7. I also don't know how this method performs.

Finally, I didn't want to use my own table, but in the end it seems to be the best option if I want decent performance and consistent results across platforms (even Mono).

Do you think there's a better option than using my own table, or is one of the other methods described better? What's your experience?
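To make option no. 4 concrete, here is a minimal sketch of a hand-rolled table. The names MimeTable, FromExtension and DefaultType are invented for this example, the handful of entries is illustrative only, and falling back to "application/octet-stream" for unknown extensions is an assumption rather than a requirement.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lookup table; a real one would carry many more entries.
public static class MimeTable
{
    // Returned when the extension is unknown or empty (an assumed default).
    public const string DefaultType = "application/octet-stream";

    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".txt", "text/plain" },
            { ".html", "text/html" },
            { ".png", "image/png" },
            { ".jpg", "image/jpeg" },
            { ".pdf", "application/pdf" },
            { ".zip", "application/zip" }
        };

    // Accepts "zip", ".zip" or ".ZIP"; never touches the file system.
    public static string FromExtension(string extension)
    {
        if (string.IsNullOrEmpty(extension))
            return DefaultType;

        string key = extension.StartsWith(".", StringComparison.Ordinal)
            ? extension
            : "." + extension;

        string mime;
        return Map.TryGetValue(key, out mime) ? mime : DefaultType;
    }
}
```

Because it uses only a dictionary, the same code should behave identically on any machine and any runtime, which is the consistency argument made above.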

+1  A: 

It depends what you need the MIME type for. In general, for services (web apps, web service, etc.), it's advisable not to use a MIME list which is dependent on the OS settings, or only as fallback if you cannot find MIME information otherwise.

I think that this is also the reason why MS chose to put constant MIME types in their System.Web.MimeMapping class (unfortunately it's internal, for whatever reason).
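The "fixed list first, OS settings only as a fallback" idea could look something like the sketch below. MimeResolver and its members are hypothetical names for this example; the OS-dependent lookup is passed in as a delegate so that nothing here assumes a particular platform, and the final "application/octet-stream" default is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical resolver: fixed table first, OS-dependent source second.
public sealed class MimeResolver
{
    private readonly IDictionary<string, string> _fixedTable;
    private readonly Func<string, string> _osLookup;

    // osLookup may be null when no OS-dependent source is available.
    public MimeResolver(IDictionary<string, string> fixedTable, Func<string, string> osLookup)
    {
        if (fixedTable == null)
            throw new ArgumentNullException("fixedTable");
        _fixedTable = fixedTable;
        _osLookup = osLookup;
    }

    public string Resolve(string extension)
    {
        string mime;
        if (!string.IsNullOrEmpty(extension) && _fixedTable.TryGetValue(extension, out mime))
            return mime;

        string fromOs = null;
        if (_osLookup != null)
        {
            // The OS source is best-effort; a failure there must not break the caller.
            try { fromOs = _osLookup(extension); }
            catch (Exception) { fromOs = null; }
        }
        return string.IsNullOrEmpty(fromOs) ? "application/octet-stream" : fromOs;
    }
}
```

The table passed in decides case sensitivity, so it is worth constructing it with StringComparer.OrdinalIgnoreCase.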

Lucero
Interesting... didn't know that about the internal type, but seems revealing.
Marc Climent
A: 

I do not have a wide range of MIME types to handle in my apps, so I'm using a lookup table like you.

Your idea isn't bad at all. You just need an easy way to maintain and update the lookup.

o.k.w
+5  A: 

I have combined all these approaches in my utility lib, except maybe no. 3. By the way, no. 2 (urlmon.dll) doesn't require a file on disk; it simply takes some bytes, no matter where they came from. Here's my current class:

namespace Components
{
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Runtime.InteropServices;
    using System.Text;
    using System.Xml.Serialization;
    using Microsoft.Win32;

    public sealed class MimeExtensionHelper
    {
     private MimeExtensionHelper() { }

     /// <summary>Finds extension associated with specified mime type</summary>
     /// <param name="mimeType">mime type you search extension for, e.g.: "application/octet-stream"</param>
     /// <returns>most used extension, associated with provided type, e.g.: ".bin"</returns>
     public static string FindExtension(string mimeType)
     {
      return ExtensionTypes.GetExtension(mimeType);
     }

     /// <summary>Finds mime type using provided extension and/or file's binary content.</summary>
     /// <param name="file">Full file path</param>
     /// <param name="verifyFromContent">Whether the file's content should be examined to verify the value found.</param>
     /// <returns>mime type of file, e.g.: "application/octet-stream"</returns>
     public static string FindMime(string file,bool verifyFromContent)
     {
      string extension = Path.GetExtension(file);
      string mimeType = string.Empty;
      try
      {
       if (!String.IsNullOrEmpty(extension))
        mimeType = ExtensionTypes.GetMimeType(extension);
       if (verifyFromContent
        || (String.IsNullOrEmpty(mimeType) && File.Exists(file)))
        mimeType = FindMimeByContent(file,mimeType);
      }
      catch { }
      return (mimeType ?? string.Empty).Trim();//"application/octet-stream"
     }

     /// <summary>Finds mime type for file using its binary data.</summary>
     /// <param name="file">Full path to file.</param>
     /// <param name="proposedType">Optional. Expected file's type.</param>
     /// <returns>mime type, e.g.: "application/octet-stream"</returns>
     public static string FindMimeByContent(string file
      ,string proposedType)
     {
      FileInfo fi = new FileInfo(file);
      if (!fi.Exists)
       throw new FileNotFoundException(file);
      byte[] buf = new byte[Math.Min(4096L,fi.Length)];
      using (FileStream fs = File.OpenRead(file))
       fs.Read(buf,0,buf.Length);
      return FindMimeByData(buf,proposedType);
     }

     /// <summary>Finds mime type for binary data.</summary>
     /// <param name="dataBytes">Binary data to examine.</param>
     /// <param name="mimeProposed">Optional. Expected mime type.</param>
     /// <returns>mime type, e.g.: "application/octet-stream"</returns>
     public static string FindMimeByData(byte[] dataBytes,string mimeProposed)
     {
      if (dataBytes == null || dataBytes.Length == 0)
       throw new ArgumentNullException("dataBytes");
      string mimeRet = String.Empty;
      IntPtr outPtr = IntPtr.Zero;
      if (!String.IsNullOrEmpty(mimeProposed))
       mimeRet = mimeProposed;
      int result = FindMimeFromData(IntPtr.Zero
       ,null
       ,dataBytes
       ,dataBytes.Length
       ,String.IsNullOrEmpty(mimeProposed) ? null : mimeProposed
       ,0
       ,out outPtr
       ,0);
      if (result != 0)
       throw Marshal.GetExceptionForHR(result);
      if (outPtr != null && outPtr != IntPtr.Zero)
      {
       mimeRet = Marshal.PtrToStringUni(outPtr);
       Marshal.FreeCoTaskMem(outPtr);
      }
      return mimeRet;
     }

     [DllImport("urlmon.dll"
      ,CharSet = CharSet.Unicode
      ,ExactSpelling = true
      ,SetLastError = true)]
     static extern Int32 FindMimeFromData(IntPtr pBC
      ,[MarshalAs(UnmanagedType.LPWStr)] String pwzUrl
      ,[MarshalAs(UnmanagedType.LPArray,ArraySubType = UnmanagedType.I1,SizeParamIndex = 3)] Byte[] pBuffer
      ,Int32 cbSize
      ,[MarshalAs(UnmanagedType.LPWStr)] String pwzMimeProposed
      ,Int32 dwMimeFlags
      ,out IntPtr ppwzMimeOut
      ,Int32 dwReserved);

     private static MimeTypeCollection _extensionTypes = null;
     private static MimeTypeCollection ExtensionTypes
     {
      get
      {
       if (_extensionTypes == null)
        _extensionTypes = new MimeTypeCollection();
       return _extensionTypes;
      }
     }

     [Serializable]
     [XmlRoot(ElementName = "mimeTypes")]
     private class MimeTypeCollection : List<MimeTypeCollection.mimeTypeInfo>
     {
      private SortedList<string,string> _extensions;
      private SortedList<string,List<string>> _mimes;

      private void Init()
      {
       if (_extensions == null || _mimes == null
        || _extensions.Count == 0 || _mimes.Count == 0)
       {
        _extensions = new SortedList<string,string>(StringComparer.OrdinalIgnoreCase);
        _mimes = new SortedList<string,List<string>>(StringComparer.OrdinalIgnoreCase);
        foreach (var mime in this)
        {
         _mimes.Add(mime.MimeType,new List<string>(mime.Extensions));
         foreach (string ext in mime.Extensions)
          if (!_extensions.ContainsKey(ext))
           _extensions.Add(ext,mime.MimeType);
        }
       }
      }

      public String GetExtension(string type)
      {
       Init();
       return _mimes.ContainsKey(type) ? _mimes[type][0] : string.Empty;
      }

      public String GetMimeType(string extension)
      {
       Init();
       return _extensions.ContainsKey(extension) ? _extensions[extension] : string.Empty;
      }

      public MimeTypeCollection()
      {
       this.Add(new mimeTypeInfo("application/applixware",new List<string>(new[] { ".aw" })));
       this.Add(new mimeTypeInfo("application/atom+xml",new List<string>(new[] { ".atom" })));
       // ... Whole list from apache's site
       this.Add(new mimeTypeInfo("application/x-x509-ca-cert",new List<string>(new[] { ".cer" })));
       try
       {
        // add types known to the local registry that the built-in list lacks
        using (RegistryKey classesRoot = Registry.ClassesRoot)
        using (RegistryKey typeKey = classesRoot.OpenSubKey(@"MIME\Database\Content Type"))
        {
         string[] subKeyNames = typeKey.GetSubKeyNames();
         string extension = string.Empty;
         foreach (string keyname in subKeyNames)
         {
          string trimmed = (keyname ?? string.Empty).Trim();
          if (string.IsNullOrEmpty(trimmed))
           continue;
          if (!String.IsNullOrEmpty(GetExtension(trimmed)))
           continue;
          string subKey = "MIME\\Database\\Content Type\\" + trimmed;
          using (RegistryKey curKey = classesRoot.OpenSubKey(subKey))
          {
           extension = (curKey.GetValue("Extension") as string ?? string.Empty).Trim();
           if (extension.Length > 0)
            this.Add(new mimeTypeInfo(trimmed
             ,new List<string>(new[] { extension })));
          }
         }
        }
       }
       catch (Exception)
       {
        // registry data is optional; ignore failures (missing key, non-Windows platform)
       }
      }

      [Serializable]
      public class mimeTypeInfo
      {
       [XmlAttribute(AttributeName = "mimeType")]
       public String MimeType { get; set; }

       [XmlElement("extension")]
       public List<String> Extensions { get; set; }

       public mimeTypeInfo(string mimeType,List<string> extensions)
       {
        MimeType = mimeType;
        Extensions = extensions;
       }

       public mimeTypeInfo() { }
      }
     }
    }

}
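Since FindMimeByData only needs a byte array, the bytes can come from any stream (an upload, a database blob, a network response). A small helper along these lines could gather them; StreamPrefix and its Read method are hypothetical names, not part of the class above. It loops because Stream.Read is allowed to return fewer bytes than requested.

```csharp
using System;
using System.IO;

// Hypothetical helper for collecting the leading bytes of any stream.
public static class StreamPrefix
{
    // Reads up to maxBytes from the current position of a readable stream.
    // Returns a shorter array when the stream ends early.
    public static byte[] Read(Stream source, int maxBytes)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (maxBytes < 0)
            throw new ArgumentOutOfRangeException("maxBytes");

        byte[] buffer = new byte[maxBytes];
        int total = 0;
        while (total < maxBytes)
        {
            int read = source.Read(buffer, total, maxBytes - total);
            if (read == 0)
                break; // end of stream
            total += read;
        }

        if (total == maxBytes)
            return buffer;

        // Trim the buffer so callers see only the bytes actually read.
        byte[] trimmed = new byte[total];
        Array.Copy(buffer, trimmed, total);
        return trimmed;
    }
}
```

The result could then be handed to MimeExtensionHelper.FindMimeByData(bytes, null). Note that the method above throws for an empty array, so an empty stream needs its own handling.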

Nisus
Thanks for the helpful reference impl + 1
DanP
A: 

Nisus, would you be willing to post the entire source code for your utility somewhere? That would be really helpful. Thanks!

Never mind....

I edited the apache definition file to only contain entries with defined extensions, then extended the code to load in the types/extensions from the text file at run time. Not elegant perhaps but sure beats creating/maintaining 630 lines of source code for the mime types.

[in the constructor for MimeTypeCollection, instead of lines like this: this.Add(new mimeTypeInfo("application/applixware",new List<string>(new[] { ".aw" })));]

    // import mime/extension definition list to facilitate maintenance
    string dir = AppDomain.CurrentDomain.BaseDirectory;
    using (TextReader streamReader = new StreamReader(Path.Combine(dir, "MimeDefinitions.txt")))
    {
      string input;
      while ((input = streamReader.ReadLine()) != null)
      {
        // skip blank lines and comment lines
        if (input.Trim().Length == 0 || input.StartsWith("#", StringComparison.Ordinal))
          continue;
        // text line format ::= [contenttype]<tab>[space delimited list of extensions, without dot]
        string[] fields = input.Split('\t');
        if (fields.Length < 2)
          continue; // no extension list on this line
        string contentType = fields[0].Trim();
        string[] extensions = fields[1].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        List<string> extensionSet = new List<string>();
        foreach (string term in extensions)
        {
          extensionSet.Add("." + term);
        }
        this.Add(new mimeTypeInfo(contentType, extensionSet));
      }
    }

I also found that Init() could be called while the collection was still being filled, so the _extensions and _mimes members would not be completely initialized. I changed the check to read:

if (_extensions == null || _mimes == null || _mimes.Count != this.Count)
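The stale-index problem is easier to see in a stripped-down form. The class below is a hypothetical stand-in for MimeTypeCollection (IndexedList and Find are invented names). Unlike the condition above, which compares against _mimes.Count and so assumes every MIME type appears only once, this sketch keeps a separate counter of how many items were indexed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, trimmed-down stand-in for MimeTypeCollection.
public class IndexedList : List<KeyValuePair<string, string>>
{
    private Dictionary<string, string> _index;
    private int _indexedCount = -1;

    public string Find(string key)
    {
        // Rebuild whenever the item count differs from the count at the last build.
        // A "build once" guard would miss items added after the first lookup.
        if (_index == null || _indexedCount != Count)
        {
            _index = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            foreach (KeyValuePair<string, string> item in this)
                _index[item.Key] = item.Value; // last entry wins on duplicate keys
            _indexedCount = Count;
        }

        string value;
        return _index.TryGetValue(key, out value) ? value : string.Empty;
    }
}
```

A count comparison will not notice an item being replaced in place, so this is only adequate for collections that grow by Add, as the one in this thread does.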

Anyway, I now have a class that can handle the external definitions and the local registry, as I needed.

Thanks!

Toby
A: 

Toby, I'm working on a similar issue at work. Could you post the complete source (above) with your changes?

Thanks

itguy71