We all agree that duplication is evil and should be avoided (the Don't Repeat Yourself principle). To enforce that, a static analysis tool such as Simian (multi-language) or Clone Detective (a Visual Studio add-in) should be used.
I just read Ayende's post about Kobe, where he says:
8.5% of Kobe is copy & pasted code. And that is with the sensitivity dialed high, if we set the threshold to 3, which is what I commonly do, is goes up to 12.5%.
I think that 3 is a very low threshold. At my company we offer code quality analysis as a service. Our default threshold for duplication is set to 20, and even then we find a lot of duplication. I can't imagine setting it to 3: there would be so many findings that our customers couldn't even think about fixing them.
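To make the effect of the threshold concrete, here is a minimal sketch in Python. It assumes the threshold means "minimum number of consecutive identical lines" and only compares whitespace-stripped lines. The function name `duplicated_fraction` and the sample snippet are made up for illustration; this is not how Simian or Clone Detective are implemented.

```python
from collections import defaultdict


def duplicated_fraction(lines, threshold):
    """Fraction of non-blank lines that lie inside a block of exactly
    `threshold` consecutive lines whose content appears more than once."""
    # Normalise: strip whitespace and drop blank lines.
    code = [ln.strip() for ln in lines if ln.strip()]
    if threshold < 1 or len(code) < threshold:
        return 0.0

    # Group the start positions of every window of `threshold` lines by content.
    windows = defaultdict(list)
    for start in range(len(code) - threshold + 1):
        windows[tuple(code[start:start + threshold])].append(start)

    # Mark every line covered by a window that occurs at more than one position.
    duplicated = set()
    for starts in windows.values():
        if len(starts) > 1:
            for start in starts:
                duplicated.update(range(start, start + threshold))

    return len(duplicated) / len(code)


# Hypothetical sample: a 3-line block repeated once, followed by differing lines.
sample = """
a = load()
b = clean(a)
save(b)
print("first")
a = load()
b = clean(a)
save(b)
print("second")
""".splitlines()

print(duplicated_fraction(sample, 3))  # 3-line clone is reported
print(duplicated_fraction(sample, 4))  # no 4-line clone exists
```

With a threshold of 3 the repeated block is flagged, while with a threshold of 4 nothing is reported. That is the same direction as Ayende's numbers: the lower the threshold, the more code counts as duplicated.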
I understand Ayende's opinion about Kobe: it's an official sample and is marketed as “intended to guide you with the planning, architecting, and implementing of Web 2.0 applications and services”, so the expectation of quality is high.
But for your own projects, what minimum threshold do you use for duplication detection?
Related question: How fanatically do you eliminate Code Duplication?