I'm loading an image from a file, and I want to know how to validate the image before it is fully read from the file.

string filePath = "image.jpg";
Image newImage = Image.FromFile(filePath);

The problem occurs when image.jpg isn't really a JPEG. For example, if I create an empty text file and rename it to image.jpg, an OutOfMemoryException is thrown when image.jpg is loaded.

I'm looking for a function that will validate an image given a stream or a file path of the image.

Example function prototype

bool IsValidImage(string fileName);
bool IsValidImage(Stream imageStream);
+1  A: 

I would create a method like:

Image openImage(string filename);

in which I handle the exception. If the returned value is null, the file name or file type is invalid.

Enreeco
LOL, I must've been writing that as a comment when you posted this. I agree with this answer, it's simple enough to get the job done.
Jason Bunting
This method is just kind of wrong. You should not control program flow using exceptions. Also.. The exceptions returned from that particular call can be *very* misleading and ambiguous.
Troy Howard
+9  A: 

JPEGs don't have a formal header definition, but they do have a small amount of metadata you can use.

  • Offset 0 (Two Bytes): JPEG SOI marker (FFD8 hex)
  • Offset 2 (Two Bytes): Image width in pixels
  • Offset 4 (Two Bytes): Image height in pixels
  • Offset 6 (Byte): Number of components (1 = grayscale, 3 = RGB)

There are a couple other things after that, but those aren't important.

You can open the file using a binary stream, read this initial data, and make sure that offset 0 is FFD8 and offset 6 is 1, 2, or 3.

That would at least give you slightly more precision.

Or you can just trap the exception and move on, but I thought you wanted a challenge :)

FlySwat
I would have gone ahead and read the header for the file and compared it to an array of .NET supported images' file headers. Eventually, I'll code that up and post it as a solution for anyone that would need it in the future.
SemiColon
Just reading the headers will not guarantee that the file is valid and won't throw an exception when opened in Image.FromFile().
MusiGenesis
No, but I didn't claim it would.
FlySwat
Any sample code, please?
alhambraeidos
A: 

You could read the first few bytes of the Stream and compare them to the magic header bytes for JPEG.
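
A minimal sketch of that check, assuming a seekable stream. The name LooksLikeJpeg is hypothetical and not part of the .NET framework. It only confirms the two-byte SOI marker (FF D8), so a corrupt or truncated file can still fail later in Image.FromStream.

```csharp
using System;
using System.IO;

static class JpegSniffer
{
    // Hypothetical helper: returns true if the stream, read from its current
    // position, starts with the JPEG SOI marker (FF D8).
    // Assumes a seekable stream; the original position is restored on exit.
    public static bool LooksLikeJpeg(Stream stream)
    {
        if (stream == null) throw new ArgumentNullException(nameof(stream));
        if (!stream.CanSeek) throw new NotSupportedException("A seekable stream is required.");

        long start = stream.Position;
        try
        {
            int first = stream.ReadByte();  // ReadByte returns -1 at end of stream
            int second = stream.ReadByte();
            return first == 0xFF && second == 0xD8;
        }
        finally
        {
            stream.Position = start;        // leave the stream as the caller passed it
        }
    }
}
```

Because the position is restored, the caller can pass the same stream to Image.FromStream afterwards.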

Quantenmechaniker
+4  A: 

Using Windows Forms:

bool IsValidImage(string filename)
{
    try
    {
        Image newImage = Image.FromFile(filename);
    }
    catch (OutOfMemoryException ex)
    {
        // Image.FromFile will throw this if file is invalid.
        // Don't ask me why.
        return false;
    }
    return true;
}

Otherwise if you're using WPF you can do the following:

bool IsValidImage(string filename)
{
    try
    {
        // BitmapImage takes a Uri, not a string path.
        BitmapImage newImage = new BitmapImage(new Uri(Path.GetFullPath(filename)));
    }
    catch(NotSupportedException)
    {
        // System.NotSupportedException:
        // No imaging component suitable to complete this operation was found.
        return false;
    }
    return true;
}
MusiGenesis
Thanks :) . I was thinking about doing that, but I was wondering if there was a way to do this that is already built into the .NET framework. Since no one else mentioned any built-in functions in the .NET framework to do this, I believe that this would be a good solution.
SemiColon
You should probably catch OutOfMemoryException, which is the documented exception thrown if the file format is invalid. This means you would let FileNotFoundException propagate to the caller.
Joe
I didn't realize that was the documented exception for an invalid image file. I just assumed there could be different exceptions thrown based on what exactly was wrong with the file. Thanks.
MusiGenesis
It's a bizarre choice of exception for an invalid image file. My guess is that the designers didn't want to create a new exception type, and System.IO.InvalidDataException did not exist in .NET 1.x. Still seems wrong to choose something so non-intuitive.
Joe
I agree. When I see an OutOfMemoryException, I think "holy crap, I'm doing something that's using up too much memory", not "I'll bet my file isn't formatted correctly".
MusiGenesis
Do you realize that you're leaking a (potentially huge) image file? This answer can be summarized as "catch the exception", it's pointless to provide example code... when you're at it, you may well add a VB sample.
dbkk
@dbkk: the VB reference really hurt. :)
MusiGenesis
I am wondering: is this the best way to detect whether the file is really a JPEG?
Ervin Ter
@Ervin: the question asker didn't think so, but *I* do, obviously. In the context of programming, you're not trying to determine if a file is some sort of Platonic ideal of a JPEG; you're trying to determine whether your program can open it and display it. I think the best way is to let .NET try to open it and tell you if it can or can't do that.
MusiGenesis
This method is just kind of wrong. You should not control program flow using exceptions. Also.. The exceptions returned from that particular call can be *very* misleading and ambiguous.
Troy Howard
@Troy: go ahead and post a better alternative. Your answer is just a vague description of all the work you would have to do yourself to sort of achieve the same thing, followed by an admission that it would have to be surrounded by a try/catch block *anyway*. If the Image class had a built-in `TryFromFile()` or `FileIsValid()`, it would make sense to use that instead, but the above approach accomplishes the same thing with a minimum of fuss.
MusiGenesis
@MusiGenesis: I've got this in a library at work. I'll dig it up and post it soon. Mainly, the thing I object to is *detecting* via the try...catch. Especially given the fact that the image will be loaded by that call if successful. That's a huge amount of overhead for a "detection" especially when you consider that the most likely next line of code after a positive detection is to open the image file... Thus incurring double cost for success! Not to mention stomping on the runtime with gobbled exception handling. This is just not a great solution (though it would be functional).
Troy Howard
@MusiGenesis: (cont)... One way to resolve this is to place an out parameter in your check. Like a TryParse method. Change the signature to: bool IsValidImage(string filename, out Image image) and return the successfully loaded Image object to the caller. That also deals with the fact that it's not being properly disposed of in your example code. It then leaves it up to the caller to dispose of it when they are ready.
Troy Howard
@Troy: your point about *detecting* with a try/catch around the FromFile attempt is totally valid - it *does* carry the cost of actually opening a valid file (returning the Image in an `out` parameter is an excellent idea, in case the method isn't being used *just* for file validation). However, I would argue that being able to be opened as an `Image` is actually the *only* thing that makes a file "valid" in this case, and thus actually trying to open it as an `Image` is the only way to be 100% certain that the file is valid.
MusiGenesis
OutOfMemoryException is indeed the correct exception to trap according to MSDN!!! http://msdn.microsoft.com/en-us/library/stf701f5.aspx Microsoft, you never cease to amaze and baffle.
James
+2  A: 

You can do a rough typing by sniffing the header.

This means that each file format you implement will need to have an identifiable header...

JPEG: First 4 bytes are FF D8 FF E0 (actually just the first two bytes would do it for non jfif jpeg, more info here).

GIF: First 6 bytes are either "GIF87a" or "GIF89a" (more info here)

PNG: First 8 bytes are: 89 50 4E 47 0D 0A 1A 0A (more info here)

TIFF: First 4 bytes are either "II" followed by 2A 00, or "MM" followed by 00 2A, i.e. the number 42 in the file's byte order (more info here)

etc... you can find header/format information for just about any graphics format you care about and add to the things it handles as needed. What this won't do, is tell you if the file is a valid version of that type, but it will give you a hint about "image not image?". It could still be a corrupt or incomplete image, and thus crash when opening, so a try catch around the .FromFile call is still needed.
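
A rough sketch of that sniffing, using the signatures listed above. ImageSniffer and GuessFormat are hypothetical names, and the signature table is only an illustration. Only the first two bytes are checked for JPEG, so non-JFIF files match too. A match only means the file starts like that format; it says nothing about whether the rest of the file is intact.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

static class ImageSniffer
{
    // Signature table built from the list above (illustrative, not exhaustive).
    private static readonly (string Name, byte[] Magic)[] Signatures =
    {
        ("JPEG", new byte[] { 0xFF, 0xD8 }),
        ("GIF",  Encoding.ASCII.GetBytes("GIF87a")),
        ("GIF",  Encoding.ASCII.GetBytes("GIF89a")),
        ("PNG",  new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }),
        ("TIFF", new byte[] { 0x49, 0x49, 0x2A, 0x00 }), // "II", little-endian 42
        ("TIFF", new byte[] { 0x4D, 0x4D, 0x00, 0x2A }), // "MM", big-endian 42
    };

    // Hypothetical helper: returns the guessed format name, or null if nothing matches.
    public static string GuessFormat(byte[] header)
    {
        foreach (var sig in Signatures)
        {
            if (header.Length >= sig.Magic.Length &&
                header.Take(sig.Magic.Length).SequenceEqual(sig.Magic))
            {
                return sig.Name;
            }
        }
        return null;
    }

    // Reads up to the first 8 bytes of the file and sniffs them.
    public static string GuessFormat(string path)
    {
        byte[] header = new byte[8];
        using (FileStream fs = File.OpenRead(path))
        {
            int read = fs.Read(header, 0, header.Length);
            Array.Resize(ref header, read); // keep only the bytes actually read
        }
        return GuessFormat(header);
    }
}
```

The comparison is done on raw bytes rather than decoded strings, because signatures such as FF D8 and 89 50 4E 47 contain bytes outside the ASCII range.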

Troy Howard
hmm.. four people answered while I was typing that and collecting links. Busy place.
Troy Howard
+3  A: 

Well, I went ahead and coded a set of functions to solve the problem. It checks the header first, then attempts to load the image in a try/catch block. It only checks for GIF, BMP, JPG, and PNG files. You can easily add more types by adding a header to imageHeaders.

static bool IsValidImage(string filePath)
{
    return File.Exists(filePath) && IsValidImage(new FileStream(filePath, FileMode.Open, FileAccess.Read));
}

static bool IsValidImage(Stream imageStream)
{
    if(imageStream.Length > 0)
    {
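        byte[] header = new byte[4]; // Change size if needed.
        byte[][] imageHeaders = new[]{
                new byte[]{0xFF, 0xD8},              // JPEG
                new byte[]{0x42, 0x4D},              // BMP ("BM")
                new byte[]{0x47, 0x49, 0x46},        // GIF ("GIF")
                new byte[]{0x89, 0x50, 0x4E, 0x47}}; // PNG

        imageStream.Read(header, 0, header.Length);
        imageStream.Position = 0; // Rewind so Image.FromStream sees the whole file.

        // Compare raw bytes: ASCII decoding turns bytes above 0x7F into '?',
        // so a string comparison can never match the JPEG marker.
        bool isImageHeader = imageHeaders.Any(sig => header.Take(sig.Length).SequenceEqual(sig));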
        if (isImageHeader == true)
        {
            try
            {
                Image.FromStream(imageStream).Dispose();
                imageStream.Close();
                return true;
            }

            catch
            {

            }
        }
    }

    imageStream.Close();
    return false;
}
SemiColon
This code doesn't dispose ImageStream if IsValidImage returns false.
Joe
Thank you very much. I fixed the bug.
SemiColon
Not quite. If imageStream.Read throws an exception, you still don't close it. Best to put a using statement around the stream instantiation.
Joe
@Joe I must disagree. He should not be closing or disposing of the stream in this function. This function didn't create the stream, and so should not perform unexpected behaviours. Also.. In case of success, Image.FromStream will consume the stream (which might be readonly, and can't be reset) meaning that a subsequent read of the stream later would fail since the stream had already been consumed. Also, upon success the image is loaded (very costly) and then disposed of immediately. If this method returns true, it's likely the caller will load the image on the next line. So that's double work.
Troy Howard
A: 

Does anyone know this information for TIFF files (single and multipage)?

I didn't find the Image class, I don't think it is System.Net.Mime.MediaTypeNames.Image, is it?

Victor Rodrigues
A: 

In case you need to read that data later on for other operations and/or for other file types (PSD, for example), using the Image.FromStream function is not necessarily a good idea.

lorddarq
+1  A: 

A method that also supports TIFF and JPEG:

private bool IsValidImage(string filename)
{
    Stream imageStream = null;
    try
    {
        imageStream = new FileStream(filename, FileMode.Open);

        if (imageStream.Length > 0)
        {
            byte[] header = new byte[30]; // Change size if needed.
            string[] imageHeaders = new[]
            {
                "BM",       // BMP
                "GIF",      // GIF
                Encoding.ASCII.GetString(new byte[]{137, 80, 78, 71}),// PNG
                "MM\x00\x2a", // TIFF
                "II\x2a\x00" // TIFF
            };

            imageStream.Read(header, 0, header.Length);

            bool isImageHeader = imageHeaders.Count(str => Encoding.ASCII.GetString(header).StartsWith(str)) > 0;
            if (imageStream != null)
            {
                imageStream.Close();
                imageStream.Dispose();
                imageStream = null;
            }

            if (isImageHeader == false)
            {
                //Verify if is jpeg
                using (BinaryReader br = new BinaryReader(File.Open(filename, FileMode.Open)))
                {
                    UInt16 soi = br.ReadUInt16();  // Start of Image (SOI) marker (FFD8)
                    UInt16 jfif = br.ReadUInt16(); // JFIF marker

                    // 0xe0ff is the APP0 (JFIF) marker, 0xe1ff the APP1 (Exif) marker, both read little-endian.
                    return soi == 0xd8ff && (jfif == 0xe0ff || jfif == 0xe1ff);
                }
            }

            return isImageHeader;
        }

        return false;
    }
    catch { return false; }
    finally
    {
        if (imageStream != null)
        {
            imageStream.Close();
            imageStream.Dispose();
        }
    }
}
Paulo
A: 

It worked. The code is a little crude, but it does work.

Alok Kumar
A: 

This should do the trick - you don't have to read raw bytes out of the header:

using(Image test = Image.FromFile(filePath))
{
    bool isJpeg = (test.RawFormat.Equals(ImageFormat.Jpeg));
}

Of course, you should trap the OutOfMemoryException too, which will save you if the file isn't an image at all.

And, ImageFormat has pre-set items for all the other major image types that GDI+ supports.

Note, you must use .Equals() and not == on ImageFormat objects (it is not an enumeration) because the operator == isn't overloaded to call the Equals method.
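
Putting the two pieces together might look like the sketch below. IsJpegFile is a hypothetical name, and the code assumes System.Drawing (GDI+) is available on the platform. Only OutOfMemoryException is caught, so a missing file still surfaces as a FileNotFoundException.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;

static class ImageFormatCheck
{
    // Hypothetical helper: true only if GDI+ can open the file and reports it as JPEG.
    public static bool IsJpegFile(string filePath)
    {
        try
        {
            using (Image test = Image.FromFile(filePath))
            {
                // ImageFormat is a class, so compare with Equals rather than ==.
                return test.RawFormat.Equals(ImageFormat.Jpeg);
            }
        }
        catch (OutOfMemoryException)
        {
            // Image.FromFile reports an invalid or unsupported image format this way.
            return false;
        }
    }
}
```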

David