After finding out that there is no Simian plugin for Maven 2, we turned to CPD, but it doesn't perform as well as Simian (as observed in our Ant projects, which use both). I know that developers on my team have been checking in redundant code, but CPD isn't flagging anything, so I want to try another code duplication analysis tool.

I considered setting up Simian to be run via the antrun plugin, but due to the nature of Maven projects, setting up the project source directories seems to be tricky.

I was hoping to get some suggestions for such tools. I found other questions dealing with this, but I'm hoping for tools that can actually be smoothly integrated into the Maven build process.

Thanks all.

+1  A: 

I've not found any issues with the PMD CPD plugin. Are you sure it is running on your code? Can you post some snippets that aren't being detected, along with your CPD configuration?
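For reference, here is a minimal sketch of a CPD setup with the maven-pmd-plugin that is tuned to be more sensitive. The values are assumptions for illustration, not a recommended configuration: minimumTokens is lowered from its default of 100 to an arbitrary 50, and ignoreIdentifiers is assumed to be available in the plugin version you use (it is not present in older releases).

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-pmd-plugin</artifactId>
  <!-- pin the plugin version used by your build here -->
  <executions>
    <execution>
      <!-- run CPD as part of the normal build -->
      <phase>verify</phase>
      <goals>
        <goal>cpd-check</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <!-- assumption: 50 is an illustrative threshold; the default is 100 tokens -->
    <minimumTokens>50</minimumTokens>
    <!-- assumption: requires a plugin version that supports this parameter -->
    <ignoreIdentifiers>true</ignoreIdentifiers>
  </configuration>
</plugin>
```

Short duplicated blocks can fall under the default token threshold, so a lower minimumTokens may surface them, at the cost of more noise. Running mvn pmd:cpd on its own and checking the generated report in the target directory is a quick way to confirm that CPD is actually scanning your sources.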

You could try using the exec-maven-plugin to invoke your Simian installation. You can pass parameters to the execution and output the results to Maven's target directory.

Note: This won't produce a nicely integrated report unless Simian does so, and any report produced won't by default be integrated into the Maven site, but at least you'd see the results.

Here's an example configuration for the exec plugin:

<plugins>
  <plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>exec-maven-plugin</artifactId>
    <version>1.1</version>
    <executions>
      <execution>
        ...
        <goals>
          <goal>exec</goal>
        </goals>
      </execution>
    </executions>
    <configuration>
      <executable>[simian executable]</executable>
      <!-- optional -->
      <workingDirectory>/tmp</workingDirectory>
      <arguments>
        <argument>[argument 1]</argument>
        <argument>[argument 2]</argument>
        ...
      </arguments>
    </configuration>
  </plugin>
</plugins>
Rich Seller
Well, one example is a bunch of Exception classes (each with about 5 constructors) that are basically duplicates of each other except for the name (yeah, the horror...). Of course, I don't know if Simian will pick that up either, but CPD doesn't. But thanks for the tip, I'll give it a try.
aberrant80
It's probably worth doing a side-by-side comparison of the Simian and CPD results (outside of Maven) on a number of projects to see how the results differ. That should give you more confidence about whether it's worth trying to integrate Simian or better to stick with CPD.
Rich Seller
Yeah, I did. Simian didn't produce any results either :D Oh well, we'll just stick with CPD and see how it goes. Thanks.
aberrant80
+1  A: 

Our CloneDR finds duplicate code, both exact copies and near-misses, across large source systems, parameterized by language syntax. It supports Java, C#, COBOL, C++, PHP and many other languages.

EDIT: OP notes that neither Simian nor CPD can find clones of Exception classes that differ only in their names. To do this, one needs a clone detector that can find near-miss clones, i.e., those in which parts of the clones have been modified. I don't know much about Simian or CPD. However, CloneDR has been finding such clones reliably since it was built in 1999.

See an example of detected parameterized clones.

Ira Baxter