I'd have thought your approach seems reasonable, but reducing an entire image to 1x1 pixel in size is probably a step too far.
However, if you converted each image to the same size and then computed the average colour in each 16x16 (or 32x32, 64x64, etc. depending on how much processing time/power you wish to use) cell you should be able to form some kind of sensible(-ish) comparison.