I actually wrote an application that does this very thing.
I started with a previous application that used a basic Levenshtein distance algorithm to compute image similarity, but that approach is a poor fit: Levenshtein distance is designed for comparing sequences of characters, and it's far more expensive than a direct per-pixel error measure. Without a doubt, the fastest algorithms you're going to find for determining image similarity are mean squared error (MSE) and mean absolute error (MAE). Both run in O(n), where n is the number of pixels in the image, and it'd be trivial to thread an implementation of either one in a number of different ways. Mecki's post is essentially a mean absolute error implementation, which my application can also perform (code is available for your browsing pleasure, should you so desire).
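To make the distinction concrete, here's a minimal sketch of both measures on two equal-sized grayscale images held in NumPy arrays. It's not the code from my application, just the idea; the function names and the NumPy dependency are my own choices for the example.

```python
# Minimal MAE/MSE sketch (not my app's actual code): both run in O(n),
# where n is the number of pixels, assuming the images are the same size.
import numpy as np

def mean_absolute_error(img_a, img_b):
    """Average of |a - b| over all pixels; 0.0 means the images are identical."""
    diff = img_a.astype(np.float64) - img_b.astype(np.float64)
    return float(np.mean(np.abs(diff)))

def mean_squared_error(img_a, img_b):
    """Average of (a - b)^2 over all pixels; penalizes large differences more."""
    diff = img_a.astype(np.float64) - img_b.astype(np.float64)
    return float(np.mean(diff * diff))
```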
In any event, in our application, we first down-sample images (e.g., everything is scaled to, say, 32x32 pixels), then convert them to grayscale, and then run the resulting images through our comparison algorithms. We're also working on some more advanced pre-processing algorithms to further normalize images, but we're not quite there yet.
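For what it's worth, that pre-processing step can be sketched in a few lines with Pillow. The 32x32 target comes from the description above; the library choice and the LANCZOS resampling filter are just assumptions for the example.

```python
# Sketch of the pre-processing described above: down-sample, convert to
# grayscale, and hand back an array the error measures can consume.
import numpy as np
from PIL import Image

def preprocess(path, size=(32, 32)):
    img = Image.open(path)
    img = img.resize(size, Image.LANCZOS)  # down-sample to a small, fixed size
    img = img.convert("L")                 # "L" = 8-bit grayscale
    return np.asarray(img, dtype=np.float64)
```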
There are definitely better algorithms than MSE/MAE (in fact, the problems with these two metrics as applied to visual information have been well documented), like SSIM, but the improvement comes at a computational cost. Other approaches compare different visual qualities of the image, such as luminance, contrast, and color histograms, but they're all pricey compared to simply measuring the error signal.
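If you want to try SSIM without writing it yourself, scikit-image ships an implementation. This is just a quick sketch reusing the hypothetical `preprocess` helper above, not something my application does:

```python
# SSIM via scikit-image (an easy way to experiment; not part of my app).
from skimage.metrics import structural_similarity as ssim

score = ssim(preprocess("a.png"), preprocess("b.png"), data_range=255.0)
print(f"SSIM: {score:.3f}")  # 1.0 = identical, lower = less similar
```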
My application might work, depending on how many images are in those folders. It's multi-threaded (I've seen it fully load eight processor cores performing comparisons), but I've never tested it against an image database larger than a few hundred images. A few hundred gigs of images sounds prohibitively large: simply reading them from disk, downsampling, converting to grayscale, and holding everything in memory (assuming you even have enough memory for all of it, which you probably don't) could take a couple of hours.
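For scale, here's roughly how one might spread the pairwise comparisons across cores with a process pool; it reuses the hypothetical helpers sketched above and is not my application's actual threading code. In practice you'd also want to pre-process each image once and cache the result (memory permitting) rather than re-reading files for every pair.

```python
# Rough sketch of parallel pairwise comparison; 'preprocess' and
# 'mean_absolute_error' are the hypothetical helpers sketched above.
import itertools
from concurrent.futures import ProcessPoolExecutor

def compare_pair(pair):
    path_a, path_b = pair
    return path_a, path_b, mean_absolute_error(preprocess(path_a), preprocess(path_b))

def compare_all(paths, workers=8):
    pairs = itertools.combinations(paths, 2)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Most similar pairs (smallest error) first.
        return sorted(pool.map(compare_pair, pairs), key=lambda r: r[2])
```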