I hear this term sometimes and am wondering what it is used for?

+2  A: 

Hashing in general is a useful way to reduce a huge amount of data to a short(ish) number that can be used to identify that data, in this case an image.

Hashes are sometimes intended just to provide a handy way to identify a file without the intervention of a human, especially in the presence of several parallel authors who can't be relied upon to increment some master counter (JPG001, JPG002) without overlapping.

Sometimes hashes are intended to be unforgeable, so that I can say: if the image hash YOU generate is the same as the one I made when I sent you the image, then you can be sure it's from me (and not adjusted by an evildoer). However, not all hashes can make this guarantee, and every few years a popular 'cryptographic' hash of this kind is shown to have fatal flaws.
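
As a rough illustration of that check, here is a minimal sketch using SHA-256 from Python's standard hashlib. The byte strings are made-up stand-ins for real image files, and the sketch assumes the digest reaches you over a channel the evildoer cannot alter; a bare hash only detects changes, so proving who sent the image would need something like an HMAC or a signature on top.

```python
import hashlib


def image_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the raw image bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-ins for the bytes of an image file.
original = b"\x89PNG pretend these are image bytes"
tampered = original[:-1] + b"!"  # same image with one byte changed

# The sender computes this and passes it along with the image.
sent_digest = image_digest(original)

# The receiver recomputes the digest over whatever actually arrived.
print(image_digest(original) == sent_digest)  # True: image unchanged
print(image_digest(tampered) == sent_digest)  # False: image was altered
```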

Alex Brown
So would you use it if you want to assign a unique hard to guess name to the image?
seanbrant
No, it's not hard to guess, it's directly derivable from the image itself (which is presumably accessible by the people you don't want to guess right). You may be able to transform or watermark the image first, which will change the image so the hash no longer works, but I'm not sure if you haven't just destroyed the value of the hash.
Alex Brown
A: 

Umm.... To compare images (in the broad sense, pictures, or any other binaries) fast without comparing entire file?

zvolkov
Well, as long as you are clear that two images that might appear basically the same, or exactly the same, or even only differ in metadata, then yes.
Alex Brown
...differ in metadata, but still not match according to this 'comparison', then yes.
Alex Brown
A: 

In practice, image hashing is popular for finding similar images in a sequence of frames or video, or for embedding a watermark in various images, as many of the movie studios now do (it almost harkens back to Fight Club in a creepy sense!).

mduvall
Not similar, but exactly the same...
Arjan Einbu
+1  A: 

It is often used for image caching. For example, an instant messenger client can cache user avatar images by first requesting only the hash of the image from the server and if it doesn't find it in the cache, it will download the image.
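
A minimal sketch of that flow, assuming a server that can return an avatar's hash separately from the avatar itself. The names get_avatar, fake_fetch_hash and fake_fetch_image are hypothetical, and the "server" is just a dict so the example runs on its own.

```python
import hashlib
from typing import Callable, Dict, List


def get_avatar(user: str,
               fetch_hash: Callable[[str], str],
               fetch_image: Callable[[str], bytes],
               cache: Dict[str, bytes]) -> bytes:
    """Return the user's avatar, downloading it only on a cache miss."""
    digest = fetch_hash(user)              # small request: just the hash
    if digest not in cache:
        cache[digest] = fetch_image(user)  # large request: the image itself
    return cache[digest]


# Hypothetical in-memory "server" used only for this demonstration.
SERVER: Dict[str, bytes] = {"alice": b"avatar-bytes-v1"}
downloads: List[str] = []


def fake_fetch_hash(user: str) -> str:
    return hashlib.sha256(SERVER[user]).hexdigest()


def fake_fetch_image(user: str) -> bytes:
    downloads.append(user)  # record every full download
    return SERVER[user]


cache: Dict[str, bytes] = {}
get_avatar("alice", fake_fetch_hash, fake_fetch_image, cache)
get_avatar("alice", fake_fetch_hash, fake_fetch_image, cache)
print(len(downloads))  # 1: the second call was served from the cache
```

Keying the cache by hash rather than by user name means a changed avatar is picked up automatically, because its hash no longer matches anything stored.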

DrJokepu
+1  A: 

While normally hashing a file hashes the individual bits of data of the file, image hashing works on a slightly higher level. The difference is that with image hashing, if two pictures look practically identical but are in a different format or resolution (or there is minor corruption, perhaps due to compression), they should hash to the same number. Even though the actual bits of their data are totally different, if they look practically identical to a human, they hash to the same thing.
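
To make the idea concrete, here is a sketch of one simple technique of this kind, usually called an average hash: shrink the picture to 8x8, then record for each cell whether it is brighter than the mean. This is only an illustration and not necessarily what any particular service uses. Images are plain lists of grayscale values so no imaging library is needed, and the shrink helper assumes the dimensions are multiples of 8. Real systems typically compare hashes by Hamming distance against a threshold rather than demanding exact equality.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 brightness values


def shrink(img: Gray, size: int = 8) -> List[List[float]]:
    """Downscale to size x size by averaging equal blocks.

    Assumes the image height and width are multiples of `size`.
    """
    bh, bw = len(img) // size, len(img[0]) // size
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [img[y][x]
                     for y in range(r * bh, (r + 1) * bh)
                     for x in range(c * bw, (c + 1) * bw)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(img: Gray, size: int = 8) -> int:
    """Pack one bit per cell: 1 if the cell is brighter than the mean."""
    pixels = [p for row in shrink(img, size) for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 picture: left half dark, right half bright.
base = [[40 if x < 8 else 200 for x in range(16)] for y in range(16)]
# The same picture at double resolution (32x32).
upscaled = [[base[y // 2][x // 2] for x in range(32)] for y in range(32)]
# A different picture: top half dark, bottom half bright.
other = [[40 if y < 8 else 200 for x in range(16)] for y in range(16)]

print(hamming(average_hash(base), average_hash(upscaled)))  # 0
print(hamming(average_hash(base), average_hash(other)))     # 32
```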

One application of this is search. TinEye.com allows you to upload an image and find many of its occurrences on the internet. Like Google, it has a web crawler that crawls through web pages and looks for images. It then hashes these images and stores the hash and URL in a database. When you upload an image, it simply calculates the hash and retrieves all the URLs linked to that hash in the database. Sample uses of TinEye include finding higher resolution versions of pictures, or finding someone's public Facebook/MySpace/etc. profile from their picture (assuming these profiles use the same photo).
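
The crawl-then-lookup scheme described above could be sketched roughly like this. ImageIndex is a hypothetical class, the example.com URLs are invented, and SHA-256 is used purely as a stand-in hash function: unlike a real image hash it only matches byte-identical files, so a real system would plug a perceptual hash in as hash_fn.

```python
import hashlib
from collections import defaultdict
from typing import Callable, DefaultDict, List


class ImageIndex:
    """Maps an image hash to every URL where that image was seen."""

    def __init__(self, hash_fn: Callable[[bytes], str]) -> None:
        self.hash_fn = hash_fn
        self.urls_by_hash: DefaultDict[str, List[str]] = defaultdict(list)

    def add(self, image: bytes, url: str) -> None:
        """Called by the crawler for each image it finds."""
        self.urls_by_hash[self.hash_fn(image)].append(url)

    def lookup(self, image: bytes) -> List[str]:
        """Called when a user uploads an image to search for."""
        return list(self.urls_by_hash.get(self.hash_fn(image), []))


def stand_in_hash(image: bytes) -> str:
    # Stand-in only: exact-bytes hash, NOT a perceptual image hash.
    return hashlib.sha256(image).hexdigest()


index = ImageIndex(stand_in_hash)
cat, dog = b"cat-image-bytes", b"dog-image-bytes"  # made-up image data
index.add(cat, "http://example.com/a/cat.jpg")
index.add(cat, "http://example.com/b/kitty.jpg")
index.add(dog, "http://example.com/c/dog.jpg")

print(index.lookup(cat))
# ['http://example.com/a/cat.jpg', 'http://example.com/b/kitty.jpg']
print(index.lookup(b"never-crawled"))  # []
```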

Image hashing can also be used with caching or local storage to prevent retransmission of a photo or storage of duplicates, respectively.

There are plenty of other possibilities including image authentication and finding similar frames in a video (as was mentioned by someone else).

Matt Boehm