Does anyone know of recent academic work which has been done on logo recognition in images? Please answer only if you are familiar with this specific subject (I can search Google for "logo recognition" myself, thank you very much). Anyone who is knowledgeable in computer vision and has done work on object recognition is welcome to comment as well.

Update: Please focus on the algorithmic aspects (which approach you think is appropriate, papers in the field, whether it should work (and has been tested) on real-world data, efficiency considerations) rather than the technical side (the programming language used, or whether it was done with OpenCV...). Work on image indexing and content-based image retrieval can also help.

+2  A: 

I worked on a project where we had to do something very similar. At first I tried Haar training techniques using OpenCV.

It worked, but was not an optimal solution for our needs. Our source images (where we were looking for the logo) were a fixed size and contained only the logo. Because of this, we were able to use cvMatchShapes against a known-good reference and threshold the returned value to decide whether it was a good match.

sberry2A
Please see my update
liza
+8  A: 

You could try local features such as SIFT: http://en.wikipedia.org/wiki/Scale-invariant_feature_transform

It should work because a logo's shape is usually constant, so the extracted features should match well.

The workflow will be like this:

  1. Detect corners (e.g. with the Harris corner detector). For the Nike logo these are the two sharp ends.

  2. Compute descriptors (e.g. SIFT, a 128-D integer vector).

  3. In the training stage, store the descriptors; in the matching stage, find the nearest neighbours in the training database for every feature. You end up with a set of matches (some of them probably wrong).

  4. Weed out the wrong matches using RANSAC. This gives you the matrix that describes the transform from the ideal logo image to the image in which you found the logo. Depending on the settings, you can allow different kinds of transforms (just translation; translation and rotation; affine transform).
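The RANSAC step in item 4 can be sketched as follows. This is only a minimal illustration, not the answerer's code: it assumes a similarity transform (translation, rotation and uniform scale), represents 2-D points as complex numbers so that the transform is `z' = a*z + b`, and uses made-up match data. The function names, the tolerance and the iteration count are all hypothetical choices.

```python
import cmath
import random

def fit_similarity(p0, p1, q0, q1):
    """Fit z' = a*z + b from two point pairs (points are complex numbers)."""
    if p0 == p1:
        return None              # degenerate sample, cannot fit
    a = (q1 - q0) / (p1 - p0)    # rotation and uniform scale
    b = q0 - a * p0              # translation
    return a, b

def ransac_similarity(matches, iters=200, tol=3.0, seed=0):
    """matches: list of (logo_point, image_point) pairs.
    Returns (a, b, inlier_indices) for the model with the most inliers."""
    rng = random.Random(seed)
    best = (None, None, [])
    for _ in range(iters):
        i, j = rng.sample(range(len(matches)), 2)
        model = fit_similarity(matches[i][0], matches[j][0],
                               matches[i][1], matches[j][1])
        if model is None:
            continue
        a, b = model
        # A match is an inlier if the model maps its logo point
        # to within `tol` pixels of its image point.
        inliers = [k for k, (p, q) in enumerate(matches)
                   if abs(a * p + b - q) <= tol]
        if len(inliers) > len(best[2]):
            best = (a, b, inliers)
    return best

# Hypothetical data: 8 correct matches under a known transform, plus 3 wrong ones.
a_true, b_true = 2 * cmath.exp(1j * cmath.pi / 6), 10 + 5j
logo = [complex(10 * k, (7 * k * k) % 50) for k in range(8)]
matches = [(p, a_true * p + b_true) for p in logo]
matches += [(5 + 5j, 500 + 500j), (15 + 40j, -300 + 200j), (45 + 30j, 100 - 400j)]
a, b, inliers = ransac_similarity(matches)
```

For an affine transform the minimal sample would be three matches instead of two, and the fit would be a small linear solve; the sample-fit-count loop stays the same.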

Szeliski's book has a chapter (4.1) on local features. http://research.microsoft.com/en-us/um/people/szeliski/Book/

P.S.

  1. I assumed you want to find logos in photos, for example to find all Pepsi billboards, so they could be distorted. If you need to find a TV channel logo on the screen (so that it is neither rotated nor scaled), you could do it more easily (pattern matching or something similar).

  2. Conventional SIFT does not consider color information. Since logos usually have constant colors (though the exact color depends on lighting and camera), you might want to take color information into account somehow.

overrider
Thanks. This approach sounds reasonable. Regarding the nearest neighbor for every feature - that sounds pretty intensive (I'm planning on having thousands of logos to be recognized), what would you think is a good way of optimizing? I thought of vector quantization or approximate nearest neighbors...
liza
liza, you are right, it is hard to find the NN in 128D. The current state-of-the-art is approximate NN search via kd-tree or k-means tree forest. It is implemented in Muja-Lowe FLANN: http://people.cs.ubc.ca/~mariusm/index.php/FLANN/FLANN
overrider
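To make the idea of approximate nearest-neighbour search concrete, here is a rough sketch of a single kd-tree with a best-bin-first search that gives up after a fixed number of leaves. It is not how FLANN is implemented; it is a simplified, pure-Python illustration. The names `build_kdtree`, `approx_nearest`, `leaf_size` and `max_leaves` are made up for this example, and the random 8-D points merely stand in for real descriptors.

```python
import heapq
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_kdtree(points, indices=None, depth=0, leaf_size=4):
    """Recursively split on the median of one coordinate per level."""
    if indices is None:
        indices = list(range(len(points)))
    if len(indices) <= leaf_size:
        return ("leaf", indices)
    axis = depth % len(points[0])
    indices.sort(key=lambda i: points[i][axis])
    mid = len(indices) // 2
    split = points[indices[mid]][axis]
    return ("node", axis, split,
            build_kdtree(points, indices[:mid], depth + 1, leaf_size),
            build_kdtree(points, indices[mid:], depth + 1, leaf_size))

def approx_nearest(tree, points, query, max_leaves=8):
    """Best-bin-first search that stops after visiting max_leaves leaves."""
    best_i, best_d = None, float("inf")
    # Heap entries: (lower bound on squared distance, tie-breaker, subtree).
    heap = [(0.0, 0, tree)]
    counter = 1
    visited = 0
    while heap and visited < max_leaves:
        bound, _, node = heapq.heappop(heap)
        if bound >= best_d:
            break  # no remaining bin can contain a closer point
        while node[0] == "node":
            _, axis, split, left, right = node
            diff = query[axis] - split
            near, far = (left, right) if diff < 0 else (right, left)
            # Any point on the far side is at least |diff| away along this axis.
            heapq.heappush(heap, (max(bound, diff * diff), counter, far))
            counter += 1
            node = near
        for i in node[1]:
            d = sq_dist(points[i], query)
            if d < best_d:
                best_i, best_d = i, d
        visited += 1
    return best_i, best_d

# Hypothetical stand-in data: 200 random 8-D "descriptors".
rng = random.Random(1)
points = [[rng.random() for _ in range(8)] for _ in range(200)]
query = [rng.random() for _ in range(8)]
tree = build_kdtree(points)
fast_i, fast_d = approx_nearest(tree, points, query, max_leaves=4)
exact_i, exact_d = approx_nearest(tree, points, query, max_leaves=len(points))
```

A small `max_leaves` trades accuracy for speed; letting the search run until the bound check stops it gives the exact nearest neighbour.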
Thanks again. I also found these papers dealing with scalable and efficient image recognition: "Small Codes and Large Image Databases for Recognition" by Torralba, Fergus and Weiss; "Scalable Recognition with a Vocabulary Tree" by Nister and Stewenius.
liza
http://www.vlfeat.org/ has an implementation of SIFT for MATLAB and C (along with some other computer vision algorithms)
srand
+2  A: 

I worked on that: "Trademark matching and retrieval in sports video databases". You can get a PDF of the paper here: http://scholar.google.it/scholar?cluster=9926471658203167449&hl=en&as_sdt=2000

We used SIFT as trademark and image descriptors, and a normalized threshold matching to compute the distance between models and images. In our latest work we have been able to greatly reduce computation using meta-models, created by evaluating the relevance of the SIFT points that are present in different versions of the same trademark.

I'd say that, in general, working with videos is harder than working with photos, due to the very poor visual quality of the TV standards currently in use.

Marco
+1  A: 

I have done work in the area of object recognition. Go to http://www.generalpicturerecognition.com, where you can download the software; however, version 2 will not work for logo recognition. Version 3 has a function called small mode for finding collections of SINGLE objects on a white background, but the large object and the very small object have to be the same. They can be at any location, but must have the same rotation.

The general picture recognition software works well when trying to find the MAIN object in a picture, because it attempts to eliminate the background as much as possible. This in itself is very difficult to achieve. It uses many methods to identify the object and other patterns and objects in the picture. Its main purpose is to find other pictures with the same object, or pictures with a similar scene. That sounds easy but is very complex to achieve. Finding the exact same picture is very easy; it is when the object is, say, a castle that differs from the sample castle in shape, size and colour that it becomes complex.

If you wish to write your own software, it will take years because of the complexity outside of just the picture recognition side. Not that I wish to put you off, but I would suggest finding some picture recognition software that is dedicated to the task.

Anyway, have a look at the general picture recognition software and you will see what I mean by complexity.

Paul
A: 

Hi there,

Has anyone heard of a VQ implementation in C or C++? I'd like to vector quantize SURF descriptors but don't know how to do it in practice.

Thx

Jayka
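For what vector quantization of descriptors amounts to in practice, here is a minimal sketch using plain k-means. It is written in Python rather than the C or C++ asked for, purely to show the steps: learn a codebook of k centres, then replace each descriptor by the index of its nearest centre. The function names and the toy 4-D data are hypothetical; real SURF descriptors would be 64-D or 128-D and the codebook far larger.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_code(codebook, v):
    """Index of the codeword closest to descriptor v."""
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], v))

def train_codebook(vectors, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm) over the training descriptors."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, k)]
    dim = len(vectors[0])
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for v in vectors:
            c = nearest_code(codebook, v)
            counts[c] += 1
            for d, x in enumerate(v):
                sums[c][d] += x
        for c in range(k):
            if counts[c]:  # keep the old centre if a cluster went empty
                codebook[c] = [s / counts[c] for s in sums[c]]
    return codebook

def quantize(codebook, vectors):
    """Histogram of how many descriptors fall on each codeword."""
    hist = [0] * len(codebook)
    for v in vectors:
        hist[nearest_code(codebook, v)] += 1
    return hist

# Hypothetical toy data: two well-separated groups of 4-D "descriptors".
rng = random.Random(2)
group_a = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(30)]
group_b = [[rng.gauss(100, 1) for _ in range(4)] for _ in range(20)]
codebook = train_codebook(group_a + group_b, k=2)
hist_all = quantize(codebook, group_a + group_b)
hist_a = quantize(codebook, group_a)
```

The per-image histogram is what gets compared or indexed afterwards. The brute-force `nearest_code` is the part that becomes expensive with large codebooks, which is where tree-based or approximate search comes in.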