Suppose I have this (outrageously) simplified XML schema:

<xsd:complexType name="Person">
  <xsd:sequence>
    <xsd:element ref="FirstName"/>
    <xsd:element ref="FamilyName"/>
  </xsd:sequence>
</xsd:complexType>

If I generate a Java class from it I get something like this:

public class Person {
  protected FirstName firstName;
  protected FamilyName familyName;

  // and the usual getters and setters
}

This class smells awfully like a Data Class, and I'd like to add behavior to it. Extending it seems the most obvious solution to me, but can I always count on such generated Java classes being safely extensible?

A related question: how would you name the augmented class? Would you give it the same name as the original class but in another package? Would you call it something like MyPerson?

+4  A: 

Mixing auto-generated with hand-crafted code always smells of trouble. If the schema is altered and the class regenerated, your own custom class may break.

I would avoid extending the auto-generated class. If you do need to add functionality to it, prefer composition over inheritance: create a MyPerson class that holds a Person as a field. If the XML schema is modified and the Person class regenerated, the MyPerson class may still break, but:

  • With careful design, the breaking changes won't affect code outside the MyPerson class. If you opt for inheritance and a method changes its name, you would need to change every caller of your class.
  • It will be easier to fix the breaking changes. The compiler will give you clear descriptions of the missing methods.
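
A minimal sketch of the composition approach, under stated assumptions: the Person class below is a hand-written stand-in for the generated one, its fields are simplified to String instead of the FirstName and FamilyName types, and the fullName() and unwrap() methods are hypothetical names chosen for illustration.

```java
import java.util.Objects;

// Stand-in for the generated class; the real one comes from the XSD.
// Assumption: name fields are simplified to String for this sketch.
class Person {
    protected String firstName;
    protected String familyName;

    public String getFirstName() { return firstName; }
    public void setFirstName(String value) { this.firstName = value; }
    public String getFamilyName() { return familyName; }
    public void setFamilyName(String value) { this.familyName = value; }
}

// Hand-written wrapper: the only class that touches the generated API.
class MyPerson {
    private final Person person;

    MyPerson(Person person) {
        this.person = Objects.requireNonNull(person, "person");
    }

    // Behaviour layered on top of the generated data (hypothetical method).
    String fullName() {
        return person.getFirstName() + " " + person.getFamilyName();
    }

    // Hands the wrapped object back, e.g. for marshalling to XML.
    Person unwrap() {
        return person;
    }
}
```

If a regenerated Person renames getFamilyName(), only MyPerson.fullName() needs fixing; callers of MyPerson keep compiling.
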
kgiannakakis
A: 

If you want to do this kind of thing, I suggest that you take a good look at the Eclipse Modeling Framework (EMF).

The EMF toolset can take an XSD, extract an EMF model from it, and then generate Java classes. You can modify the generated classes and, provided you follow a simple rule, your modifications will not be lost when you change the XSD / Model and regenerate the classes.

Each modifiable member declaration in the generated code is preceded by a Java comment that marks the member as generated. If you want to modify a member, you remove this comment and make your changes. The next time you regenerate, the generator performs a member-by-member comparison of the old and new versions of each class, looking for members whose signatures match, and updates them depending on the presence of the "generated" marker comments in the "old" version. This works surprisingly well. You occasionally have to do a bit of manual tidying up (e.g. removing imports that are no longer required), but provided you remembered to delete the marker comments, you won't lose your changes. (It is a good idea to check in the generated code anyway, if only to version control your changes!)
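
As a rough illustration of the marker rule, here is a sketch of what such a class can look like after one member has been hand-edited. It assumes the marker is a `@generated` Javadoc tag; the exact comment layout may differ from what your EMF version emits, and PersonImpl with its String field is a hypothetical stand-in, not real generator output.

```java
// Hypothetical stand-in for an EMF-generated implementation class.
class PersonImpl {
    protected String firstName;

    /**
     * Untouched member: the marker is still present, so the generator
     * is free to overwrite this method on the next regeneration.
     *
     * @generated
     */
    public String getFirstName() {
        return firstName;
    }

    /**
     * Hand-edited member: the marker tag has been removed, so the
     * generator should leave this body alone when it merges.
     */
    public void setFirstName(String newFirstName) {
        // Custom behaviour added by hand: normalise surrounding whitespace.
        firstName = (newFirstName == null) ? null : newFirstName.trim();
    }
}
```
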

If you don't like the code that EMF generates, there are many code generation options you can tweak in your Model's associated GenModel, or you can modify or replace the JET templates that make up the EMF source code generator.

In addition to generating the classes that represent your XML in memory, EMF gives you an XML serializer / deserializer and an extensible / tailorable GUI-based editor for your data structures. Related EMF projects include facilities for persisting your data into databases and for augmenting your Model with validation rules, transactions, queries and comparisons. And there is much more in the related Eclipse Modeling projects.

There is a whole stack of white papers, tutorials and other documentation on the EMF documents page.

Stephen C
A: 

Regardless of whether the class was autogenerated or not: the smell is not awful but sweet. I'd never add behavior to a data class; I'd create one or more separate classes for the behavior. The purpose of this class is to provide the XML-encoded information as a Java object, and it shouldn't do anything else. That keeps the code clear and understandable.
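
A minimal sketch of that separation, with the behavior living in its own stateless class. Person here is a hand-written stand-in for the generated class with String fields (an assumption), and PersonNames with its two methods is a hypothetical name invented for this example.

```java
// Stand-in for the generated data class; it only carries data.
// Assumption: name fields are simplified to String for this sketch.
class Person {
    protected String firstName;
    protected String familyName;

    public String getFirstName() { return firstName; }
    public void setFirstName(String value) { this.firstName = value; }
    public String getFamilyName() { return familyName; }
    public void setFamilyName(String value) { this.familyName = value; }
}

// Separate, stateless class holding the behaviour (hypothetical name).
final class PersonNames {
    private PersonNames() { }  // no instances; static helpers only

    // "Ada Lovelace" style display name.
    static String fullName(Person p) {
        return p.getFirstName() + " " + p.getFamilyName();
    }

    // "Lovelace, Ada" style key, e.g. for sorting by family name.
    static String sortKey(Person p) {
        return p.getFamilyName() + ", " + p.getFirstName();
    }
}
```
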

You could rename the 'Person' class to 'FullPersonName' (or something else that describes the real content of the data class) and reserve the 'Person' name for another class that describes 'a Person' with 'a FullPersonName' (= composition).

Edit:

This might be controversial. Gene Garcia adds Data Classes to his 'smells' list and suggests moving behavior into them. The Clean Code authors encourage readers to keep concerns separated and not to create hybrid classes. Me, I like clean code.

Andreas_D
A: 

For the record, I just found that the Unofficial JAXB Guide does indeed recommend subclassing a JAXB-generated class when new behaviour is required. They do, however, caution that "Adding behaviors to the generated code is one area that still needs improvement."
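
For comparison, a bare sketch of the subclassing route. Person is again a hand-written stand-in with String fields (an assumption), and PersonEx and getFullName() are hypothetical names. Making the unmarshaller instantiate the subclass instead of the generated class needs extra JAXB configuration that is not shown here.

```java
// Stand-in for the JAXB-generated class.
// Assumption: name fields are simplified to String for this sketch.
class Person {
    protected String firstName;
    protected String familyName;

    public String getFirstName() { return firstName; }
    public void setFirstName(String value) { this.firstName = value; }
    public String getFamilyName() { return familyName; }
    public void setFamilyName(String value) { this.familyName = value; }
}

// Hand-written subclass that adds behaviour (hypothetical name).
class PersonEx extends Person {
    // Goes through the inherited getters rather than the protected fields,
    // so a regenerated Person with renamed fields still compiles here.
    public String getFullName() {
        return getFirstName() + " " + getFamilyName();
    }
}
```
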

lindelof