I come across this a lot. Some of my co-workers prefer very simple, easy-to-read classes even if that means some code duplication, whereas I do everything in my power to avoid code duplication, even if it means building a more complicated architecture. What is the best practice? I work exclusively in Java.

+13  A: 

I always favor the solution with no duplication of code. Even if the more complicated architecture is harder to understand at first, the benefits to maintenance more than outweigh the learning curve.

rtperson
And it's in the programmer's hands to create clean code so that even a complicated architecture becomes easy to understand. My opinion: your co-workers are lazy :-P
Andreas_D
+9  A: 

While both are good goals, I feel that readability is the absolute first requirement to have a maintainable codebase. I would always prefer simple, readable, and maintainable to a complete elimination of code duplication.

Reed Copsey
I agree. The question here would be why is the non-duplicated code harder to read/maintain?
PP
+3  A: 

I don't see how "generics" is related to your question. It's trivially, obviously wrong to have separate classes to represent a CollectionOfFoo and a CollectionOfBar, so that can't be what you're asking.

You'll probably have to provide an example for each point of view, but you'll still probably get closed for being subjective.

Jonathan Feinberg
Generics isn't just about collections; if I have multiple methods that do similar things but differ by parameter type, I prefer to replace them with a single method that takes a generic type. Some developers seem to find that confusing.
JacobM
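A minimal sketch of the idea in the comment above, assuming a hypothetical helper that was previously written once per element type (for example `lastString` and `lastInteger`). The names `Lists` and `lastOrDefault` are invented for illustration and are not from the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Lists {
    // One generic method instead of one near-identical method per element type.
    static <T> T lastOrDefault(List<T> items, T fallback) {
        return items.isEmpty() ? fallback : items.get(items.size() - 1);
    }
}

class ListsDemo {
    public static void main(String[] args) {
        // The compiler infers T at each call site, so callers stay type-safe.
        Integer lastNumber = Lists.lastOrDefault(Arrays.asList(1, 2, 3), 0);
        String lastName = Lists.lastOrDefault(new ArrayList<String>(), "none");
        System.out.println(lastNumber); // 3
        System.out.println(lastName);   // none
    }
}
```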
You shouldn't have to lower the quality of your output for developers who are not as knowledgeable as yourself. This is coding for the lowest common denominator. Don't lower yourself like that, instead educate them on the concepts that they are confused by.
matt b
Couldn't agree more.
JacobM
+4  A: 

The main reason to avoid code duplication is maintainability. If a segment of code appears in multiple places, when it comes time to update you have to remember to change it everywhere. Forgetting to change one instance can cause big problems, which you may not notice immediately.
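
As a rough illustration, assume a made-up discount rule that two code paths both need. All names here (`Pricing`, `applyDiscount`, `invoiceTotal`, `quoteTotal`) and the rule itself are hypothetical. With the rule in one method, a change to the threshold or percentage happens in exactly one place:

```java
class Pricing {
    // Single home for the rule. Assumed for illustration only:
    // orders of 10000 cents or more get 10% off.
    static long applyDiscount(long cents) {
        return cents >= 10_000 ? cents * 90 / 100 : cents;
    }

    // Both callers reuse the rule instead of carrying their own copy of it.
    static long invoiceTotal(long cents, long shippingCents) {
        return applyDiscount(cents) + shippingCents;
    }

    static long quoteTotal(long cents) {
        return applyDiscount(cents);
    }
}

class PricingDemo {
    public static void main(String[] args) {
        System.out.println(Pricing.invoiceTotal(20_000, 500)); // 18500
        System.out.println(Pricing.quoteTotal(5_000));         // 5000
    }
}
```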

ThisSuitIsBlackNot
A: 

Before giving my answer I'd like to see some example code to see what the question really is.

While waiting for that, I believe that if your code is built on sound OO principles (such as each class doing one thing and one thing only), there shouldn't be any code duplication around. It certainly is possible to go nuts with abstractions etc., which ends up creating a huge pile of useless classes, but I don't think that's the issue here.

Esko
+1  A: 

No. Neither of those situations is acceptable. Write it with generics, but only as complex as it needs to be.

Don't duplicate code; you will have to double-fix bugs, double-add enhancements, double-write comments, double-write tests. Every line of code you create is a small burden you will have to carry for as long as you work on that codebase; minimize your burden.

Alex Feinman
+2  A: 

This is a judgement call. Most programmers duplicate code too much, and I think that leads to the attitude among passionate developers that stamping out duplication is an absolute good, but it is not. Making your code easy to read should be the priority, and eliminating duplicate code is usually a good thing for readability, but not always.

Also, I wouldn't use commercially valuable code as a place to use unfamiliar language features for the purpose of learning them. Create separate learning projects for that purpose. You don't want to end up getting called into work on off-hours to fix bugs caused by getting too fancy with generics, or any other feature.

Parappa
+4  A: 

There are extreme cases where you prevent code duplication by complicated metaprogramming (not so much an issue for Java) or excessive use of reflection, and in those few cases I'd favor permitting the duplication. This is rare. So long as the code remains understandable by a reasonably skilled developer who isn't you, I'd go for eliminating the duplication.

I have run across situations where a team includes one or two skilled developers and a bunch of newbies, where the newbies try to prevent the use of coding approaches that they don't understand at a glance. This must be resisted.

JacobM
+4  A: 

Best practice: if the code is short, tolerate two copies of it, but never more.

So, if you have very similar snippets of code copy/pasted in 3 different places, consider refactoring.

Keep in mind, refactoring doesn't automatically mean making code more complicated. Consider the following:

class IntStack
{
    public int value;
    public IntStack next;
}

class StringStack
{
    public String value;
    public StringStack next;
}

class PersonStack
{
    public Person value;
    public PersonStack next;
}

Every time you want a stack for a new datatype, you need to write a new class. Duplicating code works fine, but let's say you want to add a new method, maybe a "Push" method which returns a new stack? Alright, now you're forced to add it in a bajillion different places. Or you could use a generic Object stack, but then you'd lose type-safety. Generics will simplify the architecture:

class Stack<T>
{
    public T value;
    public Stack<T> next;
}

Cool!
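
To make the "Push" point concrete, here is one possible sketch of the generic stack with a push method that returns a new stack. The constructor, `push`, and `size` are assumptions added for illustration; the original snippet only declares the two fields.

```java
// Sketch of an immutable generic stack; push is written once for all element types.
final class Stack<T> {
    final T value;        // element at the top of this stack
    final Stack<T> next;  // rest of the stack, or null at the bottom

    Stack(T value, Stack<T> next) {
        this.value = value;
        this.next = next;
    }

    // Returns a new stack with newValue on top; this stack is left unchanged.
    Stack<T> push(T newValue) {
        return new Stack<>(newValue, this);
    }

    // Counts the elements by walking the chain of next references.
    int size() {
        int count = 0;
        for (Stack<T> s = this; s != null; s = s.next) {
            count++;
        }
        return count;
    }
}

class StackDemo {
    public static void main(String[] args) {
        Stack<String> names = new Stack<>("a", null).push("b");
        Stack<Integer> numbers = new Stack<>(1, null).push(2).push(3);
        System.out.println(names.value);    // b
        System.out.println(numbers.size()); // 3
        // names.push(42); // would not compile, so type safety is preserved
    }
}
```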

Alright, how about this example:

class Logger
{
    int logtype;
    public Logger(int logtype) { ... }

    // Every method repeats the same check on logtype.
    public void Log(String text)
    {
        if (logtype == FILE) { ... }
        else if (logtype == DATABASE) { ... }
        else if (logtype == CONSOLE) { ... }
    }

    public void Clear()
    {
        if (logtype == FILE) { ... }
        else if (logtype == DATABASE) { ... }
        else if (logtype == CONSOLE) { ... }
    }

    public void Truncate(int messagesToTruncate)
    {
        if (logtype == FILE) { ... }
        else if (logtype == DATABASE) { ... }
        else if (logtype == CONSOLE) { ... }
    }
}

Alright, so each time you add a method, you have to check what kind of logger you're using. Painful, and prone to bugs. Normally, you'd factor out an interface (probably with the methods Log, Clear, and Truncate), then create three classes (FileLogger, DatabaseLogger, ConsoleLogger).
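
A rough sketch of that refactoring, under some assumptions: method names are lower-cased per Java convention, `BufferLogger` is a hypothetical in-memory stand-in for FileLogger and DatabaseLogger so the example runs without a file or database, and "truncate" is assumed to mean dropping the oldest messages.

```java
import java.util.ArrayList;
import java.util.List;

// The interface factored out of the if/else chains.
interface Logger {
    void log(String text);
    void clear();
    void truncate(int messagesToTruncate);
}

// Hypothetical stand-in for FileLogger/DatabaseLogger: keeps messages in memory.
class BufferLogger implements Logger {
    final List<String> messages = new ArrayList<>();

    public void log(String text) { messages.add(text); }

    public void clear() { messages.clear(); }

    // Assumption: truncating drops the oldest messages first.
    public void truncate(int messagesToTruncate) {
        int drop = Math.min(Math.max(messagesToTruncate, 0), messages.size());
        messages.subList(0, drop).clear();
    }
}

class ConsoleLogger implements Logger {
    public void log(String text) { System.out.println(text); }

    public void clear() { /* nothing is stored, so nothing to clear */ }

    public void truncate(int messagesToTruncate) { /* nothing is stored */ }
}

class LoggerDemo {
    // Callers depend only on the interface; no logtype checks remain.
    static void report(Logger logger) {
        logger.log("started");
        logger.log("finished");
    }

    public static void main(String[] args) {
        report(new ConsoleLogger());          // prints "started" then "finished"
        BufferLogger buffer = new BufferLogger();
        report(buffer);
        buffer.truncate(1);
        System.out.println(buffer.messages);  // [finished]
    }
}
```

Adding a fourth kind of logger now means adding one class rather than editing every method.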

More classes = more architecture. Is this easier or harder to maintain in the longer run? For this example, I'd say the code is now easier to maintain, but YMMV.

Juliet
Voted up for some useful examples.
JacobM
+1  A: 

Depends on a number of factors:

  • How much code is being duplicated? It's not really a problem if the same five lines appear twice, provided there is some justifiable reason for it. Avoid over-architecting code; it may actually reduce maintainability in the long run, because the next person working on the code may not appreciate all of the subtlety in your architecture and may bend it severely out of shape.
  • How many copies of the same code? Two isn't bad, but 10 (decimal) is not so good.
  • Why is the code duplicated? I have run into a number of "duplications" that once all the requirements were built, turned out not to be duplications at all, just somewhat similar.

So my answer is maybe...

NealB
5 lines can be terrible though if it's crazy date logic.
Nathan Feger
+2  A: 
snoopy
A: 

The primary reason generics are hard to read is that they are unfamiliar to inexperienced programmers. This means great care must be taken with naming and documentation so that the intent comes through clearly.

It is extremely important that such core classes are well-named. The designer may want to discuss this thoroughly with peers before choosing the names.

Thorbjørn Ravn Andersen
A: 

Excellent question; generic answers may not fit your situation. Many factors decide your choice:

  1. Current code quality
  2. Your product/project stage, early/growth/mature
  3. Developers' experience and skill
  4. Your project/product time to market

These are some of the crucial factors that decide your choice. In my experience, duplicated business logic (middle tier) code is not good practice, but the presentation layer can tolerate some duplicated code.

Writing this, I am reminded of the article "Does Slow Growth Equal Death". Writing quality, non-duplicated code might take time, but that shouldn't become a bottleneck for your business.

Venkat
A: 

Usually the only reasons to duplicate code are to overcome a language's weak template/macro/generics system or for optimization, where there are different low-level function names that handle different types.

In C++, which you noted you don't work in, the template system allows zero-overhead type and function generation, with specialization available to "customize" the generated code for certain type parameters. Java has nothing of the sort, so in Java you cannot sidestep the choice between general and specific code the way you can in C++.

In Common Lisp, one can employ macros and compiler macros to produce code similar to C++ templates, where basic cases can be customized for specific types, yielding different code expansions.

In Java, the only time I'd argue against generic code is for inner loops using numeric types. Paying the boxing and unboxing costs for numeric types in generic code is unacceptable in those cases, unless your profiler can convince you that the compiler or HotSpot was clever enough to forgo the boxing.
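
A small sketch of the two shapes being compared; `SumDemo`, `sumBoxed`, and `sumPrimitive` are names made up for illustration. Both return the same result, but the generic version works on boxed `Integer` objects while the primitive version works on `int` directly. Whether the difference matters in your inner loop is something only a profiler can tell you.

```java
import java.util.Arrays;
import java.util.List;

class SumDemo {
    // Generic version: elements are boxed objects, converted to long on each iteration.
    static <T extends Number> long sumBoxed(List<T> values) {
        long total = 0;
        for (T v : values) {
            total += v.longValue();
        }
        return total;
    }

    // Primitive version: no boxing, but it only works for int.
    static long sumPrimitive(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(Arrays.asList(1, 2, 3)));      // 6
        System.out.println(sumPrimitive(new int[] {1, 2, 3}));     // 6
    }
}
```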

As for readability, well, make your code the model and the challenge to which your coworkers have to rise. Offer to educate those that have trouble reading it. Encourage them to review it for defects. If they find any, you can then demonstrate the benefit of only having to fix the one and only version.

seh
A: 

Compare it to normalization in relational databases - you probably want one place where some kind of function or data lives, not many places. It makes a lot of difference in terms of maintainability and the ability to reason about your code.

yawn