What's best practice for reuse of code versus copy/paste?

The problem with reuse can be that changing the reused code will affect many other pieces of functionality.

This is both good and bad: good if the change is a bug fix or a useful enhancement; bad if other code that reuses it unexpectedly breaks because it relied on the old behavior (or because the new version has a bug).

In some cases it would seem that copy/paste is better - each user of the pasted code has a private copy which it can customize without consequences.

Is there a best practice for this problem; does reuse require watertight unit tests?

+10  A: 

Every line of code has a cost.

Studies show that the cost is not linear in the number of lines of code; it's exponential.

Copy/paste programming is the most expensive way to reuse software.

"does reuse require watertight unit tests?"

No.

All code requires adequate unit tests. All code is a candidate for reuse.

S.Lott
+1 for readability and for noting the cost of code.
altCognito
-1 for gross misuse of "exponential". It's in fact not even quadratic.
Michael Borgwardt
COCOMO looks exponential to me: effort = 2.5 Ksloc ** 1.05. Isn't that exponential?
S.Lott
+1  A: 

You should be writing unit tests. Yes, cloned code can give you a sense of security that your change isn't affecting a large number of other routines, but it is probably a false sense of security. Basically, that sense of security comes from not knowing how the code is used. (Ignorance here isn't a pejorative; it simply results from not being able to know everything about the codebase.) Get used to using your IDE to learn where the code is being used, and get used to reading code to learn how it is being used.

altCognito
+2  A: 

Copying and pasting is never good practice. Sometimes it might seem better as a short-term fix in a pretty poor codebase, but a well-designed codebase will have the following properties, which make reuse easy:

  • encapsulation
  • well defined interfaces
  • loose-coupling between objects (few dependencies)

If your codebase exhibits these properties, copying and pasting will never look like the better option. And as S.Lott says, there is a huge cost to unnecessarily increasing the size of your codebase.

DanSingerman
+3  A: 

It seems to me that a piece of code that is used in multiple places, but that might need to change for one place and not for another, isn't following proper rules of scope. If the "same" method/class is needed by two different things to do two different jobs, then that method/class should be split up.

Don't copy/paste. If it does turn out that you need to modify the code for one place, then you can extend it, possibly through inheritance, overloading, or if you must, copying and pasting. But don't start out by copy-pasting similar segments.
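A minimal sketch of extending instead of copying, in Python. The class names `ReportFormatter` and `CsvReportFormatter` are hypothetical and only illustrate the idea: the shared logic stays in one place, and the one caller that differs overrides only the part it needs.

```python
class ReportFormatter:
    """Shared behavior used by every caller (hypothetical example class)."""

    separator = " | "

    def format_row(self, values):
        # Common logic: stringify and strip each value, then join.
        return self.separator.join(str(v).strip() for v in values)


class CsvReportFormatter(ReportFormatter):
    """One caller needs a different separator, so it overrides only that."""

    separator = ","


print(ReportFormatter().format_row([1, " a ", 2.5]))     # 1 | a | 2.5
print(CsvReportFormatter().format_row([1, " a ", 2.5]))  # 1,a,2.5
```

A bug fix to `format_row` now reaches both callers, while the CSV variant keeps its own customization.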

Sean Nyman
+3  A: 

Using copy and paste is almost always a bad idea. As you said, you can have tests that tell you when you break something.

The point is, when you call a method, you shouldn't really care about how it works, but about what it does. If a change alters what the method does, then either make it a new method or check every place the method is called.

On the other hand, if the change doesn't modify WHAT the method does (only how), then you shouldn't have a problem elsewhere. If you do, you've done something wrong...
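To illustrate the what/how distinction, here is a small Python sketch. The function names and the contract are made up for the example: two implementations differ in how they compute the result, and one set of checks describes what callers rely on.

```python
def total_price(prices, tax_rate):
    """Return the sum of prices plus tax, rounded to 2 decimals."""
    # Original "how": accumulate in an explicit loop.
    total = 0.0
    for p in prices:
        total += p
    return round(total * (1 + tax_rate), 2)


def total_price_v2(prices, tax_rate):
    """Same contract as total_price; only the implementation changed."""
    return round(sum(prices) * (1 + tax_rate), 2)


def check_contract(fn):
    # These checks describe WHAT the function does, not HOW it does it.
    assert fn([], 0.2) == 0
    assert fn([10, 20], 0.0) == 30
    assert fn([100], 0.25) == 125.0


check_contract(total_price)
check_contract(total_price_v2)
```

If both versions pass the same checks, callers should not notice the swap. A change that forces the checks themselves to change is a change to what the method does, and deserves a new name or a review of every call site.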

Samuel Carrijo
+2  A: 

Is there a best practice for this problem; does reuse require watertight unit tests?

Yes and sort of yes. Rewriting code you have already written correctly once is never a good idea. If you never reuse code and just rewrite it, you are doubling your bug surface. As with many best-practice questions, Code Complete changed the way I do my work. Yes, unit test to the best of your ability; yes, reuse code; and get a copy of Code Complete and you will be all set.

Copas
+1  A: 

Where you write:

The problem with reuse can be that changing the reused code will affect many other pieces of functionality. ... In some cases it would seem that copy/paste is better - each user of the pasted code has a private copy which it can customize without consequences.

I think you've reversed the concerns related to copy-paste. If you copy code to 10 places and then need to make a slight modification to behavior, will you remember to change it in all 10 places?

I've worked on an unfortunately large number of big, sloppy codebases, and generally what you'll see is the result of this: 20 versions of the same 4 lines of code. Some (usually small) subset of them has 1 minor change, and some other small (and only partially intersecting) subset has some other minor change, not because the variations are correct but because the code was copied and pasted 20 times and changes were applied almost, but not quite, consistently.

When it gets to that point it's nearly impossible to tell which of those variations are there for a reason and which are there because of a mistake (and since it's more often a mistake of omission - forgetting to apply a patch rather than altering something - there's not likely to be any evidence or comments).

If you need different functionality, call a different function. If you need the same functionality, please avoid copy/paste for the sanity of those who will follow you.
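A tiny Python sketch of that drift, with hypothetical function names. Two pasted copies started out identical; a fix (stripping whitespace) was applied to one and forgotten in the other, and nothing in the code says whether the difference is intentional.

```python
# Two pasted copies of the same normalization (hypothetical names).
def normalize_username_signup(name):
    return name.strip().lower()  # the whitespace fix was applied here...


def normalize_username_login(name):
    return name.lower()          # ...but forgotten in this copy


# The shared alternative: one function, so a fix lands everywhere at once.
def normalize_username(name):
    return name.strip().lower()


# The copies now disagree on the same input.
print(repr(normalize_username_signup(" Bob ")))  # 'bob'
print(repr(normalize_username_login(" Bob ")))   # ' bob '
```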

Steve B.
+1  A: 

There are metrics that can be used to measure your code, and it's up to you (or your development team) to decide on an adequate threshold. Ruby on Rails has the "Metric-Fu" gem, which incorporates many tools that can help you refactor your code and keep it in tip-top shape.

I'm not sure what tools are available for other languages, but I believe there is one for .NET.

Mike Trpcic
+2  A: 

So the consumer (reuser) code is dependent on the reused code, that's right.

You have to manage this dependency.

This is true for binary reuse (e.g. a DLL) and for code reuse (e.g. a script library) as well.

  • Consumer should depend on a certain (known) version of the reused code/binary.

  • Consumer should keep a copy of the reused code/binary, but never directly modify it, only update to a newer version when it is safe.

  • Think carefully when you modify a reused codebase. Branch for breaking changes.

  • If a Consumer wants to update the reused code/binary, then it first has to test to see if it's safe. If the tests fail, the Consumer can always fall back to the last known (and kept) good version.

So you can benefit from reuse (e.g. you only have to fix a bug in one place), and still you're in control of changes. But nothing saves you from testing whenever you update the reused code/binary.
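The update rule in the list above can be sketched in a few lines of Python. This is only an illustration under assumptions: `choose_version`, `fake_suite`, and the version strings are hypothetical, and a real project would run its actual test suite against the candidate version.

```python
def choose_version(pinned, candidate, passes_tests):
    """Return the version of the reused code the consumer should depend on."""
    # Only move to the candidate if the consumer's own tests pass against it.
    if passes_tests(candidate):
        return candidate
    # Otherwise fall back to the last known good (and kept) version.
    return pinned


def fake_suite(version):
    # Stand-in for the consumer's test run; pretend 2.0.0 breaks us.
    return version != "2.0.0"


print(choose_version("1.4.2", "1.5.0", fake_suite))  # 1.5.0
print(choose_version("1.4.2", "2.0.0", fake_suite))  # 1.4.2
```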

Vizu
+1  A: 

Copy/paste leads to divergent functionality. The code may start out the same, but over time, changes in one copy don't get reflected in all the other copies where they should.

Also, copy/paste may seem "OK" in very simple cases, but it also starts putting programmers into a mindset where copy/paste is fine. That's the "slippery slope". Programmers start using copy/paste when refactoring would be the right approach. You always have to be careful about setting precedent and what signals that sends to future developers.

There's even a quote about this from someone with more experience than I:

"If you use copy and paste while you're coding, you're probably committing a design error."
-- David Parnas

Mark
+2  A: 

One very appropriate use of copy and paste is Triangulation. Write code for one case, see a second application that has some variation, copy & paste into the new context - but you're not done. If you stop at that point, you get into trouble. Having this code duplicated, perhaps with minor variation, exposes some common functionality that your code needs. Once it's in both places, tested, and working in both places, you should extract that commonality into a single place, call it from the two original places, and (of course) re-test.
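The steps might look like this in Python. It is a sketch with made-up names (`export_users_csv`, `export_orders_csv`, `to_csv`), not code from any particular project: a first case, a pasted copy with a small variation, then the extracted commonality, followed by a re-test against the originals.

```python
# Steps 1 and 2: the first case, then a pasted copy with a small variation.
def export_users_csv(users):
    lines = ["name,email"]
    for u in users:
        lines.append(f"{u['name']},{u['email']}")
    return "\n".join(lines)


def export_orders_csv(orders):
    lines = ["id,total"]
    for o in orders:
        lines.append(f"{o['id']},{o['total']}")
    return "\n".join(lines)


# Step 3: the duplication exposes the common part, so extract it.
def to_csv(rows, columns):
    lines = [",".join(columns)]
    for row in rows:
        lines.append(",".join(str(row[c]) for c in columns))
    return "\n".join(lines)


def export_users_csv_v2(users):
    return to_csv(users, ["name", "email"])


def export_orders_csv_v2(orders):
    return to_csv(orders, ["id", "total"])


# Step 4: re-test. The refactored versions must match the originals.
users = [{"name": "ann", "email": "ann@example.com"}]
orders = [{"id": 7, "total": 9.5}]
assert export_users_csv_v2(users) == export_users_csv(users)
assert export_orders_csv_v2(orders) == export_orders_csv(orders)
```

Once the re-test passes, the pasted originals can be deleted, leaving a single implementation to maintain.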

If you have concerns that code which is called from multiple places is introducing risk of fragility, your functions are probably not fine-grained enough. Excessively coarse-grained functions, functions that do too much, are hard to reuse, hard to name, hard to debug. Find the atomic bits of functionality, name them, and reuse them.

Carl Manaster
A: 

In general, copy and paste is a bad idea. However, like any rule, this has exceptions. Since the exceptions are less well-known than the rule I'll highlight what IMHO are some important ones:

  1. You have a very simple design for something that you do not want to make more complicated with design patterns and OO stuff. You have two or three cases that vary in about a zillion subtle ways, i.e. a line here, a line there. You know from the nature of the problem that you won't likely ever have more than 2 or 3 cases. Sometimes it can be the lesser of two evils to just cut and paste than to engineer the hell out of the thing to solve a relatively simple problem like this. Code volume has its costs, but so does conceptual complexity.

  2. You have some code that's very similar for now, but the project is rapidly evolving and you anticipate that the two instances will diverge significantly over time, to the point where trying to even identify reasonably large, factorable chunks of functionality that will stay common, let alone refactor these into reusable components, would be more trouble than it's worth. This applies when you believe that the probability of a divergent change to one instance is much greater than that of a change to common functionality.

dsimcha