I have a large code base with lots of repeated, or nearly repeated, code all over the place. It's about as un-DRY as code can get, but tracking down the "duplicates" is hard. Are there any tools for finding potentially DRY-able code, something like a diff tool or a Hamming distance analyzer? It doesn't need language-specific knowledge or anything like that.

So, any clues as to a tool like this?

+2  A: 

If you're working in Ruby, then you can try this.

Brian
Those are some nifty tools. I'd long considered something like them but never got around to it.
Robert Gould
+1  A: 

Duplo (open source) works in C, C++, Java, C# and VB.Net. I tried it once, and it found enough duplicated code to keep me employed for a long time.

I've heard of Simian (commercial) but have not tried it.
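
For a feel of what a purely lexical, language-agnostic check (as asked for in the question) might look like, here is a minimal Python sketch. It is not how Duplo or Simian is implemented; the function names `normalize` and `find_duplicate_blocks` and the `window` parameter are made up for illustration. It hashes every run of `window` consecutive non-blank lines after collapsing whitespace and reports runs that occur more than once.

```python
import hashlib
from collections import defaultdict


def normalize(line):
    # Collapse all whitespace so indentation and spacing changes do not matter.
    return " ".join(line.split())


def find_duplicate_blocks(files, window=3):
    """Return groups of locations whose `window` normalized lines are identical.

    `files` is a dict of {name: source text} (a hypothetical input format).
    Each location is a (name, 1-based starting line number) tuple.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        # Pair each line with its number, then drop blank lines.
        lines = [(i, normalize(l)) for i, l in enumerate(text.splitlines(), 1)]
        lines = [(i, l) for i, l in lines if l]
        for k in range(len(lines) - window + 1):
            chunk = "\n".join(l for _, l in lines[k:k + window])
            digest = hashlib.sha1(chunk.encode("utf-8")).hexdigest()
            seen[digest].append((name, lines[k][0]))
    # Keep only windows that were seen in more than one place.
    return [locs for locs in seen.values() if len(locs) > 1]
```

Because it only compares normalized text, a sketch like this catches copy-and-paste with changed indentation but misses renamed variables or edited comments.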

Josh Kelley
Duplo looks great; that's more or less what I was looking for.
Robert Gould
+3  A: 

Clone Detective for Visual Studio

mcintyre321
Nice tool for C#. It actually seems rather intelligent. I didn't have high expectations (purely lexical would have been OK), so that's nice.
Robert Gould
+1  A: 

I use Simian in VS. It's pretty good, not great.

Matt Grande
+1  A: 

CloneDR from Semantic Designs is a commercial product that finds duplicate code in a large number of different programming languages. http://www.semdesigns.com/Products/Clone/index.html

Large companies can afford this product. Individuals ... not so much. I wish there were some open source projects out there like this. Might be a fun project to work on. If we only knew of a community of programmers with some time on their hands ...

Kurt W. Leucht
Some time? I'm the author, and I've been working on CloneDR on and off for 10 years.
Ira Baxter
+1  A: 

Semantic Designs' CloneDR finds exact and near-miss clones based on the language structure, so it isn't fooled by whitespace changes or line breaks, inserted/changed comments, or even modified variable names.

It leverages production parser front ends to work with C, C++, C#, Java, COBOL, PHP, Python, Fortran, Ada, ...

There are a number of example Clone analysis reports at the web site for various languages.
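
As a toy illustration of why structure-based comparison is robust to those changes (this is not CloneDR's implementation, just a sketch using Python's standard `ast` module, with a hypothetical `fingerprint` helper), one can parse the code, rename every variable to a positional placeholder, and compare the resulting trees. Parsing already discards whitespace and comments.

```python
import ast


class _Canonicalize(ast.NodeTransformer):
    """Rename every variable and argument to a placeholder (v0, v1, ...)."""

    def __init__(self):
        self.names = {}

    def _placeholder(self, name):
        # The first distinct name seen becomes v0, the second v1, and so on.
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_Name(self, node):
        node.id = self._placeholder(node.id)
        return node

    def visit_arg(self, node):
        node.arg = self._placeholder(node.arg)
        self.generic_visit(node)  # also canonicalize any annotation
        return node


def fingerprint(source):
    """Return a string that is equal for structurally identical snippets."""
    tree = _Canonicalize().visit(ast.parse(source))
    # ast.dump omits line and column information by default.
    return ast.dump(tree)


a = "total = 0\nfor item in items:\n    total += item  # sum\n"
b = "acc=0\nfor x in xs:\n\n    acc += x\n"
print(fingerprint(a) == fingerprint(b))
```

The two snippets differ in names, spacing, blank lines, and comments, yet produce the same fingerprint. A real tool also has to handle near-miss clones and work across many languages, which this sketch does not attempt.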

Ira Baxter