+4  A: 

I doubt that this can be done with a simple local operation. Look at the rectangle you want to keep: it has several gaps, so a local operation that removes short line segments would probably also degrade the desired output considerably.

Consequently, I would first try to detect the rectangle as the important content by closing the gaps, fitting a polygon, or something like that, and then discard the remaining unimportant content in a second step. Maybe the Hough transform could help.

UPDATE

I just ran this sample application, which uses a Kernel Hough Transform, on your sample image and got four nice lines fitting your rectangle.
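
To illustrate the idea, here is a minimal sketch of a plain (rho, theta) Hough transform in pure Python. It is not the Kernel Hough Transform from the sample application, just the textbook voting scheme. The function names, the tolerances and the synthetic point set are my own assumptions for illustration.

```python
import math

def _similar(a, b, n_theta, rho_tol, theta_tol):
    """True if two (rho, theta_bin) peaks describe nearly the same line."""
    (r1, t1), (r2, t2) = a, b
    dt = abs(t1 - t2)
    if dt <= theta_tol:
        return abs(r1 - r2) <= rho_tol
    if n_theta - dt <= theta_tol:
        # theta wraps at pi: (rho, theta) and (-rho, theta + pi) are the same line
        return abs(r1 + r2) <= rho_tol
    return False

def hough_peaks(points, n_theta=180, top_k=4, rho_tol=5, theta_tol=10):
    """Return up to top_k (rho, theta_degrees) lines with the most votes."""
    votes = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            votes[(rho, t)] = votes.get((rho, t), 0) + 1
    peaks = []
    # Strongest bins first; skip bins that duplicate an already accepted line.
    for key, _ in sorted(votes.items(), key=lambda kv: -kv[1]):
        if not any(_similar(key, p, n_theta, rho_tol, theta_tol) for p in peaks):
            peaks.append(key)
            if len(peaks) == top_k:
                break
    return [(rho, t * 180.0 / n_theta) for rho, t in peaks]

# Synthetic edge pixels (x = column, y = row): a rectangle outline with gaps
# plus two short noise segments. This data is made up for the example.
points = [(x, y) for y in (5, 20) for x in range(5, 31) if x not in (12, 13, 20)]
points += [(x, y) for x in (5, 30) for y in range(6, 20) if y not in (10, 11)]
points += [(15, 10), (16, 10), (17, 10), (40, 30), (40, 31)]
peaks = hough_peaks(points)
```

On this toy input the four strongest peaks are the sides x = 5, x = 30, y = 5 and y = 20; the gaps and the short segments do not matter because they contribute few votes. On a real image the tolerances would need tuning.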

Daniel Brückner
+1 for suggesting the Hough transform. Just find the four strongest peaks in the transform space, and that's your quadrilateral.
erickson
+3  A: 

Before finding the edges, pre-process the image with an open or close operation (or both), that is, erode followed by dilate, or dilate followed by erode. This should remove the smaller objects but leave the larger ones roughly the same.
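
A minimal sketch of what open and close do, assuming a binary image stored as a list of lists and a 3x3 square structuring element (both are my choices for illustration; a real pipeline would use an image library and possibly a grayscale image):

```python
def erode(img, k=1):
    """Binary erosion with a (2k+1)x(2k+1) square; outside the image counts as 0."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in range(-k, k + 1) for dx in range(-k, k + 1)))
             for x in range(w)] for y in range(h)]

def dilate(img, k=1):
    """Binary dilation with a (2k+1)x(2k+1) square."""
    h, w = len(img), len(img[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in range(-k, k + 1) for dx in range(-k, k + 1)))
             for x in range(w)] for y in range(h)]

def opening(img, k=1):
    """Erode, then dilate: removes objects smaller than the structuring element."""
    return dilate(erode(img, k), k)

def closing(img, k=1):
    """Dilate, then erode: fills holes and gaps smaller than the structuring element."""
    return erode(dilate(img, k), k)

# A 6x6 filled object plus a one-pixel speck (made-up test data).
img = [[0] * 12 for _ in range(12)]
for y in range(2, 8):
    for x in range(2, 8):
        img[y][x] = 1
img[10][10] = 1
opened = opening(img)  # the speck disappears, the 6x6 object is unchanged
```

The edge detector would then run on the opened (or closed) image rather than on the original.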

I've looked for online examples, and the best I could find was on page 41 of this PDF.

tom10
Look at the example picture. The edge outline of the rectangle is only 1 pixel thin! If you erode first, you will completely lose the rectangle as well as the little edges. If you dilate first, you may close up some gaps in your large rectangle, but that is a different problem, and doesn't really help you to get rid of the small edges.
A. Levy
@Levy - No, as I clearly stated in my answer, the image should be closed BEFORE FINDING THE EDGES. Of course this shouldn't be applied to the edges (but to the objects from which the edges are calculated).
tom10
@tom10 - thanks for the advice, I switched to an open operation (replacing a Gaussian filter) and I get much better Canny edge detection output (the close operation results in better performance as well). I had thought of using erosion/dilation, but I was thinking of using it after the edge detection, which doesn't work with thin lines.
@emi1faber - great... I'm glad to hear it's working.
tom10
+2  A: 

The Hough Transform can be a very expensive operation.

An alternative that may work well in your case is the following:

  1. Run two morphological closing operations (http://homepages.inf.ed.ac.uk/rbf/HIPR2/close.htm), one with a horizontal line structuring element and one with a vertical line structuring element (the length of the lines to be determined by testing). The point of this is to close all gaps in the large rectangle.

  2. Run connected component analysis. If the morphology step worked, the large rectangle will come out as one connected component. It then only remains to iterate through all the connected components and pick out the most likely candidate for the large rectangle.
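
The two steps above can be sketched in pure Python as follows. This is only an illustration under my own assumptions: the image is a binary list of lists, the closing with a line element is implemented as "fill interior gaps shorter than the line" (which is what a 1-D closing amounts to when everything outside the image is background), and "most likely candidate" is simply taken to be the largest component.

```python
from collections import deque

def close_rows(img, length):
    """Closing with a horizontal line of `length` pixels: fills gaps shorter than it."""
    out = [row[:] for row in img]
    for row in out:
        ones = [x for x, v in enumerate(row) if v]
        for a, b in zip(ones, ones[1:]):
            if 1 < b - a <= length:  # b - a - 1 zeros between two set pixels
                for x in range(a + 1, b):
                    row[x] = 1
    return out

def close_cols(img, length):
    """Same closing with a vertical line, done on the transposed image."""
    transposed = [list(c) for c in zip(*img)]
    return [list(c) for c in zip(*close_rows(transposed, length))]

def components(img):
    """8-connected components of a binary image, each a list of (row, col)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y0 in range(h):
        for x0 in range(w):
            if not img[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, comp = deque([(y0, x0)]), []
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for ny in range(max(y - 1, 0), min(y + 2, h)):
                    for nx in range(max(x - 1, 0), min(x + 2, w)):
                        if img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(comp)
    return comps

# Made-up test image: a rectangle outline with two gaps plus three noise pixels.
img = [[0] * 20 for _ in range(12)]
for x in range(2, 14):
    img[2][x] = img[9][x] = 1
for y in range(2, 10):
    img[y][2] = img[y][13] = 1
for y, x in [(2, 6), (2, 7), (5, 13), (6, 13)]:
    img[y][x] = 0  # gaps in the outline
for y, x in [(11, 16), (11, 17), (0, 17)]:
    img[y][x] = 1  # noise

closed = close_cols(close_rows(img, 4), 4)
rectangle = max(components(closed), key=len)
```

Note that a line length that is too large will also bridge noise to the rectangle, which is why the length has to be found by testing.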

ldog
This is the way I would do it. If the goal is to find the location of the large rectangle, select the largest component. If the goal is just to remove the short edges (the noise) then remove all sufficiently small components, or all but the largest.
A. Levy
+2  A: 

Perhaps find the connected components, then remove components with fewer than X pixels (X determined empirically), followed by dilation along horizontal/vertical lines to reconnect the gaps within the rectangle.
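
A rough sketch of this order of operations, assuming a binary image as a list of lists, 8-connectivity, and a plus-shaped element for the directional dilation. The threshold `min_pixels` and the radius `r` are placeholders that would have to be tuned on real data.

```python
def components(img):
    """8-connected components of a binary image, each a list of (row, col)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y0 in range(h):
        for x0 in range(w):
            if not img[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            stack, comp = [(y0, x0)], []
            while stack:
                y, x = stack.pop()
                comp.append((y, x))
                for ny in range(max(y - 1, 0), min(y + 2, h)):
                    for nx in range(max(x - 1, 0), min(x + 2, w)):
                        if img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            comps.append(comp)
    return comps

def remove_small(img, min_pixels):
    """Keep only components with at least min_pixels pixels."""
    out = [[0] * len(img[0]) for _ in img]
    for comp in components(img):
        if len(comp) >= min_pixels:
            for y, x in comp:
                out[y][x] = 1
    return out

def dilate_cross(img, r):
    """Dilate with a horizontal and a vertical line of length 2r+1 (plus shape)."""
    h, w = len(img), len(img[0])
    return [[int(any(img[y][nx] for nx in range(max(x - r, 0), min(x + r + 1, w)))
                 or any(img[ny][x] for ny in range(max(y - r, 0), min(y + r + 1, h))))
             for x in range(w)] for y in range(h)]

# Made-up test image: a rectangle outline with two gaps plus three noise pixels.
img = [[0] * 20 for _ in range(12)]
for x in range(2, 14):
    img[2][x] = img[9][x] = 1
for y in range(2, 10):
    img[y][2] = img[y][13] = 1
for y, x in [(2, 6), (2, 7), (5, 13), (6, 13)]:
    img[y][x] = 0  # gaps in the outline
for y, x in [(11, 16), (11, 17), (0, 17)]:
    img[y][x] = 1  # noise

cleaned = remove_small(img, min_pixels=5)
reconnected = dilate_cross(cleaned, r=1)
```

This only works if every fragment of the broken rectangle is larger than the threshold. The dilation also thickens the outline, so a thinning step may be needed afterwards.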

Amro
+1  A: 

It's possible to follow two main techniques:

  1. Vector-based operation: map your pixel islands into clusters (blobs, Voronoi zones, whatever). Then apply some heuristic to rectify the segments, such as the Teh-Chin chain approximation algorithm, and prune based on the vector attributes (start point, end point, length, orientation and so on).

  2. Set-based operation: cluster your data (as above). For every cluster, compute the principal components and distinguish lines from circles or any other shape by looking for clusters with only 1 significant eigenvalue (or 2 if you are looking for "fat" segments that could resemble ellipses). Check the eigenvectors associated with the eigenvalues to get the orientation of the blobs, and make your choice.

Both ways could be easily explored with OpenCV (the former, indeed, falls under "Contour analysis" category of algos).
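
For the set-based variant, the eigenvalue test can be sketched without any library, since the covariance matrix of 2-D pixel coordinates is only 2x2. The function names and the ratio threshold below are assumptions of mine, not part of OpenCV.

```python
import math

def principal_axes(points):
    """Return (lam1, lam2, angle): eigenvalues of the 2x2 covariance matrix
    (lam1 >= lam2) and the orientation of the major axis in radians."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    half_tr = (sxx + syy) / 2
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    disc = math.sqrt(max(half_tr ** 2 - (sxx * syy - sxy ** 2), 0.0))
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return half_tr + disc, half_tr - disc, angle

def is_line_like(points, max_ratio=0.05):
    """A cluster looks like a line if the second eigenvalue is negligible."""
    lam1, lam2, _ = principal_axes(points)
    return lam1 > 0 and lam2 <= max_ratio * lam1

segment = [(i, i) for i in range(10)]                # thin diagonal segment
blob = [(x, y) for x in range(3) for y in range(3)]  # compact 3x3 blob
segment_angle = principal_axes(segment)[2]           # about pi/4 for the diagonal
```

Here `is_line_like(segment)` holds and `is_line_like(blob)` does not; pruning by length or orientation can then use `lam1` and the angle.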

ZZambia