I am working on a real estate website and I would like to write a program that can classify an image as either a floor plan or a company logo.

Since I am writing in PHP I would prefer a PHP solution, but a C++ or OpenCV solution would be fine as well.

Floor Plan Sample:

alt text

alt text

Logo Sample:

alt text

+1  A: 

I highly doubt any such tool already exists, and creating anything accurate would be non-trivial. If your need is to sort out a set of existing images (e.g., you have an unsorted directory), then you might be able to write a "good enough" tool and manually handle the failures. If you need to do this dynamically with new imagery, it's probably the wrong approach.

Were I to attempt this for the former case, I would probably look for something trivially different I can use as a proxy. Are floor plans typically a lot larger than logos (in either file size or image dimensions)? Do floor plans have fewer colors than a logo? If I can get 75% accuracy using something trivial, it's probably the way to go.
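To make the "fewer colors" proxy concrete, here is a minimal sketch of one way to count distinct colors. It is only an illustration: the function name `countDistinctColors`, the `$pixels` input format (a list of red/green/blue triples, which you could fill from `imagecolorat()`), and the 4-bits-per-channel quantization are all assumptions, not something from the question.

```php
<?php
// Sketch only: counts distinct colors after coarse quantization.
// $pixels is a hypothetical list of [red, green, blue] triples (0-255 each).
function countDistinctColors(array $pixels, int $bitsPerChannel = 4): int
{
    // Drop the low bits of each channel so near-identical shades
    // (e.g. JPEG noise around pure black) count as one color.
    $shift = 8 - $bitsPerChannel;
    $seen = [];

    foreach ($pixels as [$r, $g, $b]) {
        $key = (($r >> $shift) << 16) | (($g >> $shift) << 8) | ($b >> $shift);
        $seen[$key] = true;
    }

    return count($seen);
}
```

A low count would hint at a monochrome floor plan, but the cutoff has to come from your own labeled images.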

Chris Arguin
+1 - another simple indicator would be keywords in the filename like "logo" or "floor" :)
Paul Dixon
Thanks for the answer, but I tried this approach and a lot of logos are the same size as floor plans. Is there any other indicator that can be used? Also, please see my comment above about the way I was thinking of classifying an image as a floor plan (using the room corners).
Logos tend to be wider than they are tall?
Alix Axel
+4  A: 

As always, there is a built-in PHP function for this. Just joking. =)

All the floor plans I've seen are pretty monochromatic, so I think you can play with the number of colors and the color saturation to make a pretty good guess as to whether the image is a logo or a floor plan.

E.g.: if the image has no more than 2 or 3 colors, it's a floor plan.

E.g.: if the sum / average of the saturation is less than X it's a floor plan.

Black and white (and other similar colors that are used in floor plans) have a saturation that is zero, or very close to zero, while logos tend to be more visually attractive, hence use more saturated colors.

Here is a simple function to compute the saturation of a Hex RGB color:

function Saturation($color)
{
    $color = array_map('hexdec', str_split($color, 2));

    if (max($color) > 0)
    {
        return (max($color) - min($color)) / max($color);
    }

    return 0;
}

var_dump(Saturation('000000')); // black    0.0000000000000000
var_dump(Saturation('FFFFFF')); // white    0.0000000000000000
var_dump(Saturation('818185')); // grey     0.0300751879699249
var_dump(Saturation('5B9058')); // green    0.3888888888888889
var_dump(Saturation('DE1C5F')); // pink     0.8738738738738738
var_dump(Saturation('FE7A15')); // orange   0.9173228346456692
var_dump(Saturation('FF0000')); // red      1.0000000000000000
var_dump(Saturation('80FF80')); // ---      0.4980392156862745
var_dump(Saturation('000080')); // ---      1.0000000000000000

Using imagecolorat() and imagecolorsforindex() you can implement a simple function that loops through all the pixels of the image and computes the sum or average of the saturation. If the image's saturation is above a threshold you define, you can assume that the image is a logo.

One thing you shouldn't forget is that, if you use the sum, images with a higher resolution will normally have a higher total saturation (more pixels to sum). So for the sake of this algorithm, and also for the sake of your server's performance, it would be wise to resize all the images to a common resolution (say 100x100 or 50x50) before classifying them; once they are classified you can go back to the original (non-resized) images.

I made a simple test with the images you provided, here is the code I used:

$images = array('./44199.jpg', './68614.jpg', './95205.jpg', './logo.png', './logo.gif');

foreach ($images as $path)
{
    $sat = 0;
    $image = ImageCreateFromString(file_get_contents($path));

    for ($x = 0; $x < ImageSX($image); $x++)
    {
        for ($y = 0; $y < ImageSY($image); $y++)
        {
            $color = ImageColorsForIndex($image, ImageColorAt($image, $x, $y));

            if (is_array($color) === true)
            {
                // Zero-pad each channel to two hex digits so Saturation() splits the string correctly.
                $sat += Saturation(sprintf('%02x%02x%02x', $color['red'], $color['green'], $color['blue']));
            }
        }
    }

    // Average saturation over all the pixels of the image.
    echo ($sat / (ImageSX($image) * ImageSY($image)));
    echo '<hr />';
}

And here are the results:

green floor plan:       0.0151028053
black floor plan:       0.0000278867
black and white logo:   0.1245559912
stackoverflow logo:     0.0399864136
google logo:            0.1259357324

Using only these examples, I would say the image is a floor plan if the average saturation is less than 0.03 or 0.035; you can tweak the threshold further by adding extra examples.
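Putting the resize step and the threshold together, a rough sketch could look like the one below. It is a variant of the loop above rather than the exact code that produced those numbers: the function names `averageSaturation` and `looksLikeFloorPlan` are made up for illustration, the 50x50 size and the 0.03 cutoff are just the example values from this answer, and it reads the channels directly from the truecolor pixel instead of going through a hex string. It needs the GD extension.

```php
<?php
// Sketch only: average saturation of an image after shrinking it to a fixed size.
// Requires the GD extension. The 50x50 size and 0.03 cutoff are tunable assumptions.
function averageSaturation($image, int $size = 50): float
{
    // Resample to a common resolution so every image contributes the same pixel count.
    $small = imagecreatetruecolor($size, $size);
    imagecopyresampled($small, $image, 0, 0, 0, 0, $size, $size, imagesx($image), imagesy($image));

    $total = 0.0;
    for ($x = 0; $x < $size; $x++) {
        for ($y = 0; $y < $size; $y++) {
            // For truecolor images imagecolorat() returns a packed RGB integer.
            $rgb = imagecolorat($small, $x, $y);
            $channels = [($rgb >> 16) & 0xFF, ($rgb >> 8) & 0xFF, $rgb & 0xFF];
            $max = max($channels);
            if ($max > 0) {
                $total += ($max - min($channels)) / $max;
            }
        }
    }

    return $total / ($size * $size);
}

// Hypothetical helper: true when the average saturation falls below the cutoff.
function looksLikeFloorPlan($image, float $threshold = 0.03): bool
{
    return averageSaturation($image) < $threshold;
}
```

You would call it as `looksLikeFloorPlan(imagecreatefromstring(file_get_contents($path)))` and adjust the threshold against your own test set.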

Alix Axel
Except if it's a floor plan with a company logo :) But this would be my approach, too. +1
Pekka
@Pekka: Still, by resizing, the logo would be pretty much ignored and would contribute just some insignificant points to the saturation. And if it isn't it means it's a logo with a floor plan and not a floor plan with a logo. =)
Alix Axel
@Pekka: Also, instead of using the sum he could use the average to account for these little artifacts.
Alix Axel
It's a good idea, but it has to be combined with something else.
I ran it against my test data and the accuracy is very high; if there is no other solution I will choose this. Still, a lot of logos are almost pure black-and-white text, so their saturation was low.
+1  A: 

Stuff like this - recognition of patterns in images - tends to be horribly expensive in terms of time, horribly unreliable and in constant need of updating and patching to match new cases.

May I ask why you need to do this? Is there not a point in your website's workflow where it could be determined manually whether an image is a logo or a floor plan? Wouldn't it be easier to write an application that lets users determine which is which at the time of upload? Why is there a mixed set of data in the first place?

Pekka
I am getting the data from the clients as a batch of unsorted images. Since it contains thousands of images (photos of the properties for sale, the floor plans and the company logos), each one needs to be classified before I display it, automatically if possible (so I can reuse it in the future). I have already written the part that classifies the photos of the properties with 95% accuracy, so out of the remaining images I am now left with logos and floor plans.
I see. That's pretty impressive already. Still, I think the task at hand is really more prone to a high rate of errors. I personally would go for a fully manual procedure, creating an interface that makes it easy to point and click what is what. But if you go automatic - maybe using one of the very interesting suggestions posted here - I'm sure many people (including me) would be interested to learn how it worked out.
Pekka
A: 

Despite thinking this is something that requires manual intervention, one thing you could do is check the size of the image.

A small (both in terms of MB and dimensions) image is likely to be a logo.

A large (both in terms of MB and dimensions) image is likely to be a floorplan.

However, this would only be a probability measurement and by no means foolproof.

The type of image is also an indicator, but less of one. Logos are more likely to be JPG, PNG or GIF, floorplans are possibly going to be TIFF or other lossless format - but that's no guarantee.

ChrisF
A: 

As others have said, such image recognition is usually horribly complex. Forget PHP.

However, looking over your samples I see a criterion that MIGHT work pretty well and would be pretty easy to implement if it did:

Run the image through good OCR, see what strings pop out. If you find a bunch of words that describe rooms or such features...

I'd rotate the image 90 degrees and try again to catch vertical labels.

Edit: Since you say you tried it and it doesn't work maybe you need to clean out the clutter first. Slice the image up based on whitespace. Run the OCR against each sub-image in case it's getting messed up trying to parse the lines. You could test this manually using an image editor to slice it up.

Loren Pechtel
I tried that. Which OCR tools would you recommend? I tried Tesseract and it wasn't able to figure out the text.
Sorry, but I can't help with tools. I haven't dealt with OCR enough to know what might do it.
Loren Pechtel
I think recognizing the characters in a company logo is by itself a complicated endeavor.
Hao Wooi Lim
Who cares if you can recognize characters in the logo? My approach is based on identifying labels on the floorplan--if you don't find anything you figure it's a logo.
Loren Pechtel
+1  A: 

One of the first things that comes to mind is the fact that floor plans tend to have considerably more lines oriented at 90 degrees than any normal logo would.

A fast first-pass would be to run Canny edge detection on the image and vote on angles using a Hough transform and the rho, Theta definition of a line. If you see a very strong correspondence for Theta=(0, 90, 180, 270) summed over rho, you can classify the image as a floor plan.

Another option would be to walk the edge image after the Canny step to only count votes from long, continuous line segments, removing noise.
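To give a feel for the angle-voting idea without a full Canny + Hough pipeline, here is a much simpler stand-in: it takes Sobel gradients and measures how much of the edge strength points along the horizontal or vertical axis. This is not a Hough transform and it does not follow line segments; the function name `axisAlignedEdgeRatio`, the `$gray[$y][$x]` input layout, the 10 degree tolerance and the magnitude cutoff are all assumptions made for the sketch. For the real thing, OpenCV's Canny and Hough functions are the place to look.

```php
<?php
// Sketch only: a simplified stand-in for Canny + Hough, not a full implementation.
// $gray is a hypothetical 2-D array of grayscale values (0-255), indexed as $gray[$y][$x].
function axisAlignedEdgeRatio(array $gray, float $toleranceDeg = 10.0, float $minMagnitude = 50.0): float
{
    $height = count($gray);
    $width = count($gray[0]);
    $aligned = 0.0;
    $total = 0.0;

    for ($y = 1; $y < $height - 1; $y++) {
        for ($x = 1; $x < $width - 1; $x++) {
            // Sobel gradients in x and y.
            $gx = ($gray[$y - 1][$x + 1] + 2 * $gray[$y][$x + 1] + $gray[$y + 1][$x + 1])
                - ($gray[$y - 1][$x - 1] + 2 * $gray[$y][$x - 1] + $gray[$y + 1][$x - 1]);
            $gy = ($gray[$y + 1][$x - 1] + 2 * $gray[$y + 1][$x] + $gray[$y + 1][$x + 1])
                - ($gray[$y - 1][$x - 1] + 2 * $gray[$y - 1][$x] + $gray[$y - 1][$x + 1]);

            $magnitude = sqrt($gx * $gx + $gy * $gy);
            if ($magnitude < $minMagnitude) {
                continue; // flat area or weak noise, no vote
            }

            // The gradient is perpendicular to the edge, so a horizontal or vertical
            // edge gives a gradient angle that is a multiple of 90 degrees.
            $angle = fmod(abs(rad2deg(atan2($gy, $gx))), 90.0);
            $offAxis = min($angle, 90.0 - $angle);

            $total += $magnitude;
            if ($offAxis <= $toleranceDeg) {
                $aligned += $magnitude;
            }
        }
    }

    // Fraction of edge strength lying on the 0/90/180/270 degree axes.
    return $total > 0 ? $aligned / $total : 0.0;
}
```

A ratio close to 1 means almost all strong edges are horizontal or vertical, which is what you would expect from a floor plan; where to put the cutoff is something to tune on labeled samples.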

Michael Roberts
Any idea how to write a program that does that? Or can you point me to a place that explains this stuff so I can write it myself?
A: 

Use both color saturation and image size (both suggested separately in previous answers). Use a large sample of human-classified figures and see how they plot in the 2-D space (size x saturation), then decide where to put the boundary. The boundary need not be a straight line, but don't make too many twists trying to make all the dots fit, or you'll be "memorizing" the sample at the expense of new data. Better to find a relatively simple boundary that fits most of the samples, and it should fit most of the data.
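One simple way to get such a boundary, sketched here only as an illustration, is a nearest-centroid rule: average the labeled samples of each class and assign a new image to the class whose average is closer, which amounts to a straight-line boundary. The function names `trainCentroids` and `classify` and the `[pixelCount, averageSaturation]` sample format are assumptions of this sketch, and any numbers you feed it must come from your own hand-labeled images.

```php
<?php
// Sketch only: nearest-centroid classifier over two features.
// Each sample is a hypothetical [pixelCount, averageSaturation] pair.
function trainCentroids(array $samplesByLabel): array
{
    // Scale each feature by its largest value so pixel counts don't swamp saturation.
    $scale = [1e-12, 1e-12];
    foreach ($samplesByLabel as $samples) {
        foreach ($samples as $s) {
            $scale[0] = max($scale[0], abs($s[0]));
            $scale[1] = max($scale[1], abs($s[1]));
        }
    }

    // One centroid (mean of the scaled features) per label.
    $centroids = [];
    foreach ($samplesByLabel as $label => $samples) {
        $n = count($samples);
        $centroids[$label] = [
            array_sum(array_column($samples, 0)) / $n / $scale[0],
            array_sum(array_column($samples, 1)) / $n / $scale[1],
        ];
    }

    return ['scale' => $scale, 'centroids' => $centroids];
}

function classify(array $model, array $sample): string
{
    $best = '';
    $bestDist = INF;
    foreach ($model['centroids'] as $label => $c) {
        $dx = $sample[0] / $model['scale'][0] - $c[0];
        $dy = $sample[1] / $model['scale'][1] - $c[1];
        $dist = $dx * $dx + $dy * $dy; // squared distance is enough for comparison
        if ($dist < $bestDist) {
            $bestDist = $dist;
            $best = (string) $label;
        }
    }

    return $best;
}
```

Keeping the rule this simple is deliberate: it cannot bend around individual dots, so it is less likely to memorize the sample.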

You have to tolerate a certain error. A foolproof solution to this is impossible. What if I choose a floorplan as my company's logo? (this is not a joke, it just happens to be funny)

Emilio M Bumachar
+1  A: 

A simple no-brainer attempt I would first try would be to use an SVM to learn the SIFT keypoints obtained from the samples. But before you can do that, you need to label a small subset of the images, giving each either -1 (a floor plan) or 1 (a logo). If an image has more keypoints classified as floor plan then it must be a floor plan; if it has more keypoints classified as logo then it must be a logo. In computer vision this is known as the bag-of-features approach, and it is one of the simplest methods around. More complicated methods will likely yield better results, but this is a good start.
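The final voting step is the easy part and can be sketched in a few lines. The sketch below assumes the per-keypoint labels already exist; the SIFT extraction and the SVM that would produce them are not shown, and the function name `voteOnKeypoints` is made up for illustration.

```php
<?php
// Sketch only: the per-image voting step. The labels are assumed to come from a
// keypoint classifier (e.g. an SVM over SIFT descriptors) that is not shown here.
function voteOnKeypoints(array $keypointLabels): ?string
{
    // Labels follow the convention above: -1 = floor plan, 1 = logo.
    $sum = array_sum($keypointLabels);

    if ($sum == 0) {
        return null; // tie or no keypoints: leave it for manual review
    }

    return $sum < 0 ? 'floor plan' : 'logo';
}
```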

Hao Wooi Lim
Any idea how to write a program that does that? Or can you point me to a place that explains this stuff so I can write it myself?
@tomlei: Perhaps you could check out a paper on this entitled "Visual Categorization with Bags of Keypoints" by Gabriella Csurka et al.
Hao Wooi Lim
+2  A: 

It may be easiest to outsource this to humans.

If you have a budget, consider Amazon's Mechanical Turk. See Wikipedia for a general description.

Alternatively, you could do the outsourcing yourself. Write a PHP script to display one of your images and prompt the user to sort it as either a "logo" or a "floorplan." Once you have this running on a webserver, email your entire office and ask everyone to sort 20 images as a personal favor.

Better yet, make it a contest: the person who sorts the most images wins an iPod!

Perhaps most simply, invite everyone you know over for pizza and beers, set up a bunch of laptops and get everyone to spend a few minutes sorting.

There are software ways to accomplish your task, but if it is a one-off event with fewer than a few thousand images and a budget of at least a few hundred dollars, then I think your life may be easier using humans.

AndyL