I have a sentence which is analyzed in different phases. First, I get some attributes (say, X, Y, Z):

public class AnalyzedSentence {
    private String X;
    private String Y;
    private String Z;

    public AnalyzedSentence(String sentence) {
        extractX();
        extractY();
        extractZ();
    }

    // getters, setters
}

Then, I use these attributes to further analyze the sentence to get another attribute, say, "XYZ", after which I create the following class:

public class FinalSentence {

    private AnalyzedSentence data;

    private String XYZ;

    public FinalSentence(String XYZ, AnalyzedSentence data) {
        this.data = data;
        this.XYZ = XYZ;
    }

    // getters, setters
}

The workflow goes like this:

public class SentenceAnalyzer {
    /// ...
    public FinalSentence analyze(String sentence) {
        AnalyzedSentence as = new AnalyzedSentence(sentence);  // every attribute of "as" can be calculated beforehand
        String XYZ = SpecialClass.extractXYZ(sentence, as); // extract XYZ (needs a special class), based on as
        return new FinalSentence(XYZ, as);
    }
}

Alternatively, I could have just a single class holding all the information, filling the attributes as they were extracted, which could result in some null results. It'd be like so:

public class Sentence {

    private String X;
    private String Y;
    private String Z;    
    private String XYZ;

    public Sentence(String sentence) {
        extractX();
        extractY();
        extractZ();
    }

    public String getXYZ() {
        // With this design, this method can be called even if XYZ has not been extracted yet.
        // Remember that XYZ cannot be extracted in the constructor the way X, Y and Z are.
        return XYZ; // may still be null
    }

    public void setXYZ(...) {...}

    // getters, setters
}

My question is: which design is preferred, and why? If there's a better way to accomplish what I'm trying to do here, I'd like to hear that too.

+2  A: 

Personally, I prefer the first design, the one with two classes. The distinction between analysis and results appeals to me. I like to think of classes as collections of responsibilities more than collections of data, and using two distinct classes makes the responsibility of each clearer.

Darryl
Although, the accepted definition of an object is a set of data and operations on that data. See http://en.wikipedia.org/wiki/Object_%28computer_science%29 (for lack of having a better reference to hand)
chrisbunney
That's the most common definition by far, but not the only correct one, and not one that informs design choices like the one proposed in this question. It's *a* definition, not *the* definition.
Darryl
A: 

I would go for the single class, because I would treat extracting XYZ as a single logical operation with two steps.

In the constructor you can extract X, Y and Z, then use SpecialClass to get XYZ, so the object is already the final sentence by the time the constructor returns.
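A minimal sketch of that idea, under stated assumptions: the extraction rules (first word, last word, word count) are placeholders rather than the asker's real logic, and the four-argument `extractXYZ` is a hypothetical variant of the `SpecialClass.extractXYZ(sentence, as)` call shown in the question.

```java
// Sketch only: every extraction rule here is a placeholder.
final class Sentence {
    private final String x;
    private final String y;
    private final String z;
    private final String xyz;

    Sentence(String sentence) {
        // Step 1: attributes that depend only on the raw sentence.
        String[] words = sentence.trim().split("\\s+");
        this.x = words[0];
        this.y = words[words.length - 1];
        this.z = String.valueOf(words.length);
        // Step 2: the derived attribute, computed before the constructor returns,
        // so no caller can ever observe a null XYZ.
        this.xyz = SpecialClass.extractXYZ(sentence, x, y, z);
    }

    String getX() { return x; }
    String getY() { return y; }
    String getZ() { return z; }
    String getXYZ() { return xyz; }
}

// Hypothetical stand-in for the asker's special class.
final class SpecialClass {
    private SpecialClass() {}

    static String extractXYZ(String sentence, String x, String y, String z) {
        return x + "/" + y + "/" + z; // placeholder combination
    }
}
```

With all fields final and no setters, the "getXYZ() called too early" problem cannot occur. The cost is that the constructor now depends on SpecialClass.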

chikak
A: 

Why not just a single method that returns the analyzed data based on a single string parameter?

String analyze(String input) {...}

Less clutter, no side effects (everything is done on the stack), and less chance of someone misunderstanding the API and doing something strange with it.
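As a rough sketch of what that might look like (the class name `SentenceAnalysis` and the extraction rules are assumptions for illustration, not anything from the question):

```java
// Sketch only: a stateless helper whose intermediate results never leave the method.
final class SentenceAnalysis {
    private SentenceAnalysis() {}

    static String analyze(String input) {
        // X, Y and Z exist only as local variables (placeholder rules).
        String[] words = input.trim().split("\\s+");
        String x = words[0];
        String y = words[words.length - 1];
        String z = String.valueOf(words.length);
        // Only the final attribute is returned (placeholder combination).
        return x + "/" + y + "/" + z;
    }
}
```

Note that this only works if callers need nothing but the final value. If they also need X, Y and Z, the method would have to return some result object, which brings back the original design question.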

Gregory Mostizky
+2  A: 

What you need to consider is whether, in your problem domain, AnalyzedSentence and FinalSentence are distinct enough to stay split, or similar enough to be merged.

It's clear they're working with similar data and cooperating closely in order to achieve the goal.

To me, analysed and final are just states that a Sentence could be in, although that is based on my limited knowledge of the problem you're working on, so I would look to combine them in some way.

Edit: Based on the further information, I think I would design it something like this:

The Sentence class encapsulates the original sentence, the tags, and the extracted category (or whatever it is you're extracting, I'm assuming it's a category based on your description), and the operations to set, get, and extract that information.

The Sentence class stores a TagList that contains all the tags, the original string, and the extracted category. It also encapsulates the extraction of the data by creating an Extractor and passing it the TagList when the data needs extraction. (I've put it in the constructor, but it could go in a method; where it gets called depends on when you need to extract the data.)

So in this way, everything required to manipulate the original sentence is in the Sentence class. Of course, you may know something that I don't that makes this approach unsuitable, but here's some code to illustrate what I mean:

public class Sentence {

    private TagList tags;
    private Category category;  // declared as Category so the assignment below compiles
    private String sentence;

    public Sentence(String newSentence) {
        sentence = newSentence;
        // Extractor and Category are helper classes that are not shown here.
        Extractor<TagList> e = new Extractor<TagList>();
        tags = e.extractTags(sentence);
        category = new Category(tags);
    }

    public String getXYZ() {
        // ...
    }

    public void setXYZ(...) {...}

    private TagList extractTags(String s) { ... }

    // getters, setters
}


public class TagList{

    private List<String> tags;

    ....
    //rest of class definition

}
chrisbunney
+1  A: 

I am all for making relatively small classes, as I have to struggle at work with monster classes of over 8,000 lines!
That said, the FinalSentence class seems a bit too small, like an empty shell, a simplistic facade for AnalyzedSentence, and its usefulness isn't obvious.

PhiLho
It's not just a facade for AnalyzedSentence, since it has more data, but I get your point, and it makes sense. However, the problem is that I need to pass the AnalyzedSentence around to get the attribute XYZ that is used to construct the FinalSentence. I could store the XYZ attribute in the AnalyzedSentence, but then the getXYZ() method would only work correctly _after_ I managed to extract the XYZ attribute. This is a problem in terms of API, since any client could instantiate an AnalyzedSentence and call getXYZ(), even though it would be null at the beginning.
JG
Couldn't you code the getXYZ method in such a way that, if XYZ is null when the method is called, it either triggers the extraction of XYZ or throws an exception?
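Something along these lines, as a sketch only. The class names `LazySentence` and `StrictSentence` and the placeholder extraction rule are made up for illustration; the real extraction would go through whatever the special class does.

```java
import java.util.Objects;

// Option 1: getXYZ() triggers the extraction the first time it is called.
final class LazySentence {
    private final String text;
    private String xyz; // null until first requested

    LazySentence(String text) {
        this.text = text;
    }

    String getXYZ() {
        if (xyz == null) {
            xyz = extractXYZ(text);
        }
        return xyz;
    }

    // Placeholder for the real extraction logic.
    private static String extractXYZ(String text) {
        return "xyz(" + text + ")";
    }
}

// Option 2: getXYZ() fails loudly instead of returning null.
final class StrictSentence {
    private String xyz;

    void setXYZ(String xyz) {
        this.xyz = Objects.requireNonNull(xyz, "xyz");
    }

    String getXYZ() {
        if (xyz == null) {
            throw new IllegalStateException("XYZ has not been extracted yet");
        }
        return xyz;
    }
}
```

The lazy version is not thread-safe as written, and the strict version only turns a silent null into a runtime error; neither stops a client from calling getXYZ() too early at compile time.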
chrisbunney
+1  A: 

The answer to your question depends on how you expect those classes to be modified in the future, as the requirements of the program expand or change.

A good guideline for figuring out when to split or join a class is the Single Responsibility Principle: "There should never be more than one reason for a class to change."

The Open Closed Principle also helps in deciding how to organize classes so that you can change the behaviour of the system by combining existing classes with new ones, instead of modifying the existing classes: "Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification."

http://butunclebob.com/ArticleS.UncleBob.PrinciplesOfOod
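One way these two principles might apply here, as a hedged sketch: `AttributeExtractor`, `PipelineAnalyzer` and the two example extractors are hypothetical names, and the extraction rules are placeholders. Each extractor has a single reason to change, and adding a new attribute means adding a class rather than editing the analyzer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction: one small class per extracted attribute.
interface AttributeExtractor {
    String name();

    // 'earlier' holds the attributes produced by extractors that ran before this one.
    String extract(String sentence, Map<String, String> earlier);
}

// Placeholder rule for X: the first word of the sentence.
final class FirstWordExtractor implements AttributeExtractor {
    public String name() { return "X"; }

    public String extract(String sentence, Map<String, String> earlier) {
        return sentence.trim().split("\\s+")[0];
    }
}

// Placeholder rule for XYZ: derived from an attribute extracted earlier.
final class XyzExtractor implements AttributeExtractor {
    public String name() { return "XYZ"; }

    public String extract(String sentence, Map<String, String> earlier) {
        return "xyz(" + earlier.get("X") + ")";
    }
}

// Runs the extractors in order; it stays unchanged when a new extractor is added.
final class PipelineAnalyzer {
    private final List<AttributeExtractor> extractors;

    PipelineAnalyzer(List<AttributeExtractor> extractors) {
        this.extractors = new ArrayList<>(extractors);
    }

    Map<String, String> analyze(String sentence) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (AttributeExtractor extractor : extractors) {
            // Extractors get a read-only view of what has been computed so far.
            String value = extractor.extract(sentence, Collections.unmodifiableMap(attributes));
            attributes.put(extractor.name(), value);
        }
        return attributes;
    }
}
```

Whether this is worth it depends on how often new attributes are expected; for a fixed set of X, Y, Z and XYZ it is probably more machinery than the problem needs.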

Esko Luontola