ansaurus

Question

Managing highly repetitive code and documentation in Java

Answer 1

+1 A:

A lot of this kind of repetition can now be avoided thanks to generics. They're a godsend when writing the same code where only the types change.

Sadly though, I think generic arrays are still not very well supported. For now at least, use containers that allow you to take advantage of generics. Polymorphism is also a useful tool to reduce this kind of code duplication.

To answer your question about how to handle code that absolutely must be duplicated... Tag each instance with easily searchable comments. There are some Java preprocessors out there that add C-style macros. I think I remember NetBeans having one.

patros 2010-02-25 20:08:29
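
As a rough illustration of the point about generics, here is a minimal sketch of one method body serving every comparable reference type. The class and method names (GenericMax, max) are made up for this example and do not come from the question. Note that it only works on objects, so primitives get boxed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; the name is an assumption, not from the thread.
final class GenericMax {

    // One implementation covers Integer, Double, String, and any other
    // type that can be compared with itself.
    static <T extends Comparable<? super T>> T max(List<T> items) {
        T best = items.get(0); // assumes a non-empty list
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(max(Arrays.asList(3, 9, 4)));
        System.out.println(max(Arrays.asList("pear", "apple")));
    }
}
```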

"Sadly though, I think generic arrays are still not very well supported." -- I'm not sure how you can support generic arrays in Java with type-erasure. I think it's impossible.

polygenelubricants 2010-02-25 20:18:54

I've seen workarounds: casting an array of Object, or using reflection. Neither one is pretty, but they apparently work.

patros 2010-02-25 20:26:31
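
The two workarounds mentioned above could look roughly like the sketch below. FixedStack and GenericArrays are hypothetical names chosen for illustration. The Object[] cast is only safe while the array stays private to the generic class, and the reflective version needs the caller to pass a Class token.

```java
import java.lang.reflect.Array;

// Workaround 1: allocate Object[] and cast. After erasure the field really
// is an Object[], so the array must never be handed out as a T[].
final class FixedStack<T> {
    private final T[] items;
    private int size;

    @SuppressWarnings("unchecked")
    FixedStack(int capacity) {
        items = (T[]) new Object[capacity];
    }

    void push(T item) {
        items[size++] = item;
    }

    T pop() {
        return items[--size];
    }
}

// Workaround 2: let reflection build an array of the real component type.
final class GenericArrays {
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> componentType, int length) {
        return (T[]) Array.newInstance(componentType, length);
    }
}
```

The reflective version returns a genuine String[] when given String.class, which is why it can be exposed to callers while the cast version cannot.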

Generally, it's best to avoid using arrays in Java. ArrayLists provide a lot more functionality and usually have a negligible performance cost compared to an array.

Jared Levy 2010-04-28 03:33:23

Answer 2

+13 A:

If you absolutely must duplicate code, follow the great examples you've given and group all of that code in one place where it's easy to find and fix when you have to make a change. Document the duplication and, more importantly, the reason for the duplication so that everyone who comes after you is aware of both.

Bill the Lizard 2010-02-25 20:10:52

+1 Increasing the length of the duplicated documentation by documenting the duplication seems like it might be a bad idea at first, but it's really much worse to have duplicated stuff that needs to be modified and no documentation about the duplication.

Tanzelax 2010-02-25 20:32:22

Answer 3

+6 A:

From Wikipedia, on Don't Repeat Yourself (DRY), also known as Duplication is Evil (DIE):

In some contexts, the effort required to enforce the DRY philosophy may be greater than the effort to maintain separate copies of the data. In some other contexts, duplicated information is immutable or kept under a control tight enough to make DRY not required.

There is probably no single answer or technique that prevents problems like this.

stacker 2010-02-25 20:20:25

Answer 4

+1 A:

I get that Sun has to document like this for the Java SE library code and maybe other 3rd party library writers do as well.

However, I think it is an utter waste to copy and paste documentation throughout a file like this in code that is only used in-house. I know many people will disagree because it will make their in-house JavaDocs look less clean. However, the trade-off is that it makes their code cleaner, which, in my opinion, is more important.

jerry 2010-02-25 20:23:25

Answer 5

+3 A:

Java primitive types screw you, especially when it comes to arrays. If you're specifically asking about code involving primitive types, then I would say just try to avoid them. The Object[] method is sufficient if you use the boxed types.

In general, you need lots of unit tests and there really isn't anything else to be done, other than resorting to reflection. Like you said, it's another subject entirely, but don't be too afraid of reflection. Write the DRYest code you can first, then profile it and determine if the reflection performance hit is really bad enough to warrant writing out and maintaining the extra code.

noah 2010-02-25 20:26:36
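
To make the reflection option concrete, here is a small sketch that handles every numeric primitive array with a single method by going through java.lang.reflect.Array. The class name ReflectiveArrays is an assumption for this example. It trades speed (and, for large long values, precision) for having one copy of the loop, so it is the kind of code you would profile before replacing.

```java
import java.lang.reflect.Array;

// Hypothetical helper; one loop instead of one overload per primitive type.
final class ReflectiveArrays {

    // Accepts int[], long[], float[], double[], short[], byte[] or char[].
    // Array.getDouble widens each element to double; it throws
    // IllegalArgumentException for boolean[] or arrays of objects.
    static double sum(Object primitiveArray) {
        int length = Array.getLength(primitiveArray);
        double total = 0.0;
        for (int i = 0; i < length; i++) {
            total += Array.getDouble(primitiveArray, i);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3}));
        System.out.println(sum(new double[] {0.5, 1.5}));
    }
}
```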

Answer 6

+2 A:

You could use a code generator to construct variations of the code using a template. In that case, the java source is a product of the generator and the real code is the template.

msalib 2010-02-25 20:55:31
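
A very small version of the template idea might look like the sketch below. Everything in it is an assumption made for illustration: the class name PrimitiveTemplateGenerator, the #T# and #N# placeholders, and the generated max method. A real setup would read the template from a file and write the results during the build.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical generator: the template is the "real" code, and the
// per-type Java sources are just its output.
final class PrimitiveTemplateGenerator {

    // #T# stands for the primitive type, #N# for the class-name prefix.
    static final String TEMPLATE = String.join("\n",
        "// GENERATED FILE: edit the template, not this file.",
        "final class #N#Arrays {",
        "    static #T# max(#T#[] values) {",
        "        #T# best = values[0];",
        "        for (#T# v : values) {",
        "            if (v > best) best = v;",
        "        }",
        "        return best;",
        "    }",
        "}",
        "");

    // Plain text substitution; no parsing of the Java code is involved.
    static String instantiate(String primitiveType, String namePrefix) {
        return TEMPLATE.replace("#T#", primitiveType).replace("#N#", namePrefix);
    }

    // Maps each output file name to its generated source text.
    static Map<String, String> generateAll() {
        String[][] variants = { {"int", "Int"}, {"long", "Long"}, {"double", "Double"} };
        Map<String, String> sources = new LinkedHashMap<>();
        for (String[] variant : variants) {
            sources.put(variant[1] + "Arrays.java", instantiate(variant[0], variant[1]));
        }
        return sources;
    }

    public static void main(String[] args) {
        generateAll().forEach((file, source) -> System.out.println(file + "\n" + source));
    }
}
```

Because the substitution is purely textual, the generator cannot check or optimize what it emits; the generated files still have to go through the compiler like hand-written code.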

Yes, this is what I was alluding to when I said that perhaps Sun has its own preprocessor, etc.

polygenelubricants 2010-02-25 21:13:36

The officially sanctioned way to do this would be to use an annotation and an annotation processor so that when you compiled the code, javac would call your annotation processor, which would in turn generate source code on the fly to be compiled by the compiler. The unofficial way to do it is to have your annotation processor modify internal compiler data structures when it is called. The only free Java source generation library I've found is CodeModel.

msalib 2010-02-26 21:09:52

This seems reasonable for larger snippets, but a little duplication can be the lesser of two evils compared to adding yet another layer of complexity to the build process.

dsimcha 2010-03-06 19:14:55

Answer 7

+10 A:

For people that absolutely need performance, boxing and unboxing and generified collections and whatnot are big no-no's.

The same problem happens in performance computing, where you need the same complex code to work for both float and double (say, some of the methods shown in Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic" paper).

There's a reason why Trove's TIntIntHashMap runs circles around Java's HashMap<Integer,Integer> when working with a similar amount of data.

Now, how is the source code of Trove's collections written?

By using source code instrumentation of course :)

There are several Java libraries for higher performance (much higher than the default Java ones) that use code generators to create the repeated source code.

We all know that "source code instrumentation" is evil and that code generation is crap, but still that's how people who really know what they're doing (i.e. the kind of people that write stuff like Trove) do it :)

For what it is worth we generate source code that contains big warnings like:

/*
 * This .java source file has been auto-generated from the template xxxxx
 * 
 * DO NOT MODIFY THIS FILE FOR IT SHALL GET OVERWRITTEN
 * 
 */

Webinator 2010-02-26 04:06:51

Can you provide more details on what code generators they use, etc? I'm not familiar with Trove.

polygenelubricants 2010-02-26 04:10:38

It's explained in the Trove FAQ; basically they have an Ant target that calls a script that does the modification (if I remember correctly): http://trove4j.sourceforge.net/html/faq.html (I'm into Java high performance computing and I've seen the technique used several times... We use it here, we have our own Java proprietary code generating more Java code :)

Webinator 2010-02-26 04:14:57

@polygenelubricants: btw Trove is a wonderful replacement for the default Java API if you need to work with primitives. For regular collections, you'll want to look into Javolution or the Google collections etc. The default Java collections are really pretty bad from a lot of standpoints. They work for simple projects, but they show their limits quite fast once you start to manipulate large amounts of data.

Webinator 2010-02-26 04:17:54

I happen to like code generation... it needn't be nasty at all. But I would be generating byte code rather than Java source. What if you need to generate at run time, are you going to force end-users to install the JDK?

CurtainDog 2010-02-26 04:49:14

@CurtainDog: there's a reason why projects like Trove generate source code and not bytecode. There are cases where bytecode instrumentation is fine and cases where source code instrumentation is better. For what it's worth, in the current project I'm working on we do both, so... Another option if you *really want* source code instrumentation at runtime (instead of bytecode) is simply to generate the .java server side, compile it, and send it down the wire. I'm not saying you should do this in that latter case: I'm saying that not only do both have their uses, but both are commonly used.

Webinator 2010-02-26 06:41:31

Answer 8

A:

The best way to manage repetition is to avoid architectures that require it. Keep the code clean, simple and OO.

With respect to documentation the best documentation is the method signature. If you make use of appropriate types then documentation is not nearly as much of a problem.

CurtainDog 2010-02-26 04:59:13

Answer 9

+1 A:

Given two code fragments that are claimed to be similar, most languages have limited facilities for constructing abstractions that unify the code fragments into a monolith. To abstract when your language can't do it, you have to step outside the language :-{

The most general "abstraction" mechanism is a full macro processor which can apply arbitrary computations to the "macro body" while instantiating it (think Post or string-rewriting system, which is Turing capable). M4 and GPM are quintessential examples. The C preprocessor isn't one of these.

If you have such a macro processor, you can construct an "abstraction" as a macro, and run the macro processor on your "abstracted" source text to produce the actual source code you compile and run.

You can also use more limited versions of the ideas, often called "code generators". These are usually not Turing capable, but in many cases they work well enough. It depends on how sophisticated your "macro instantiation" needs to be. (The reason people are enamored with the C++ template mechanism is that, despite its ugliness, it is Turing capable, and so people can do truly ugly but astonishing code generation tasks with it.) Another answer here mentions Trove, which is apparently in the more limited but still very useful category.

Really general macro processors (like M4) manipulate just text; that makes them powerful, but they don't handle the structure of a programming language well, and it is really awkward to write a generator in such a macro processor that can not only produce code, but also optimize the generated result. Most code generators that I encounter are "plug this string into this string template" and so cannot do any optimization of a generated result. If you want generation of arbitrary code and high performance to boot, you need something that is Turing capable but understands the structure of the generated code so it can easily manipulate (e.g., optimize) it.

Such a tool is called a Program Transformation System. Such a tool parses the source text just like a compiler does, and then carries out analyses/transformations on it to achieve a desired effect. If you can put markers in the source text of your program (e.g., structured comments, or annotations in languages that have them) directing the program transformation tool what to do, then you can use it to carry out such abstraction instantiation, code generation, and/or code optimization. (One poster's suggestion of hooking into the Java compiler is a variation on this idea.) Using a general purpose transformation system (such as the DMS Software Reengineering Toolkit) means you can do this for essentially any language.

Ira Baxter 2010-03-06 16:30:28

Answer 10

A:

Dude, I am not a Java developer, but I know there is a concept in C++ called templates that solves this type of problem. Try searching for a similar concept in Java. I believe Java supports this concept; if it does, it will solve your problem.

Otherwise, you may refer to C++ to get an idea of how templates work.

Hope this will help you.

RAHUL PRASAD 2010-03-08 19:36:15

I recommend you read all the answers to the question, as the use of generics will not fix this problem.

Adam Gent 2010-04-28 23:09:24

Answer 11

+1 A:

Even fancy pants languages like Haskell have repetitive code (see my post on Haskell and serialization).

It seems there are three choices for this problem:

Use reflection and lose performance
Use preprocessing like Template Haskell or the Camlp4 equivalent for your language and live with the nastiness
Or, my personal favorite, use macros if your language supports them (Scheme and Lisp)

I consider macros different from preprocessing because macros are usually written in the same language as the target, whereas preprocessing uses a different language.

I think Lisp/Scheme macros would solve many of these problems.

Adam Gent 2010-04-28 01:02:00
