views:

116

answers:

4

Where do you draw the line when moving functions which operate on data into the class which contains that data? For example, imagine you have a simple class which stores a description of the weather, with variables for temperature, humidity, wind speed and direction, and the time at which the measurement was taken. Now imagine you have an object of this class and you want to transmit it to someone else - another process, another machine, whatever. Do you put the code to transmit the object into the object itself - for example, by adding a Send(destination-type) method to the simple data class? Or do you keep this kind of feature in separate classes that can send and receive anything over the medium - whether it's networking, file i/o, interprocess comms, or anything similar?

My gut instinct is to keep my data classes simple and wrap them up when I want to transmit them - in classes which serialise them and present the sender and receiver classes with a simple interface they understand. The alternative seems to be to put everything including the kitchen sink into the simple data classes - every function which might ever operate on that data, however indirectly. In short, network error handling code doesn't seem to me to belong in a simple data class.

This seems obvious to me, but I keep seeing developers put Send() methods on their classes. They even tell message classes to Send() themselves, which seems highly counterintuitive to me; if I write a letter on a piece of paper, I don't tell the paper to send itself. I wrap the letter in an envelope and hand it to the postman, because he has a van and a map. What do people think?

+4  A: 

I've gone back and forth on this sort of design question several times in my career. Right now, I'm where you seem to be, mostly because I do a whole lot of SOA in my current life, and I end up writing a LOT of classes that exist only to be serialized into and out of various over-the-wire formats, mostly involving XML and JSON.

When you move into the "services" world, classes are often just representations of data that get sent back and forth. I generally, therefore, split my classes into two logical buckets: "classes that hold data" and "classes that do stuff". I don't know if I'm the only one doing this sort of thing, but that's where I'm at.

Mark
+5  A: 

There's logic to do with the payload item itself: what's the wind speed now?

There's logic to do with interpreting that data: can we dock this ship now?

There's logic to do with deciding to send the payload somewhere: oh here's a new weather value, somebody over there cares.

Then the actual network stuff.

The payload probably needs to be able to serialize and deserialise itself. I don't see that any of the rest is the payload's concern. There must be a better place for Send(). Not least because you might choose to send several payload objects at once, and they can't all send each other.
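A minimal sketch of that split, in Python purely for illustration (the thread doesn't name a language). `WeatherReading`, `Serialisable`, `BatchSender` and the `transport` callable are all made-up names, not from any library. The payload only knows how to turn itself into bytes and back; a separate sender owns the transport and can ship several payloads in one call.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Iterable, Protocol


@dataclass(frozen=True)
class WeatherReading:
    """Hypothetical payload: holds data and can (de)serialise itself, nothing more."""
    temperature_c: float
    humidity_pct: float
    wind_speed_kmh: float
    wind_direction_deg: float
    measured_at: str  # ISO 8601 timestamp, kept as text for simplicity

    def to_bytes(self) -> bytes:
        # JSON is an assumption here; any wire format would do.
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @classmethod
    def from_bytes(cls, raw: bytes) -> "WeatherReading":
        return cls(**json.loads(raw.decode("utf-8")))


class Serialisable(Protocol):
    """Anything the sender can ship: it only has to produce bytes."""
    def to_bytes(self) -> bytes: ...


class BatchSender:
    """Hypothetical sender: owns the transport, knows nothing about weather."""

    def __init__(self, transport: Callable[[bytes], None]) -> None:
        # `transport` stands in for a socket write, file write, queue put, etc.
        self._transport = transport

    def send_all(self, payloads: Iterable[Serialisable]) -> int:
        """Send every payload in order and return how many were sent."""
        count = 0
        for payload in payloads:
            self._transport(payload.to_bytes())
            count += 1
        return count
```

Passing `some_list.append` as the transport is enough to try it out without a network. Error handling and retries would live in `BatchSender` (or whatever wraps the real transport), so the payload class never has to change when the medium does.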

djna
+5  A: 

It's not a simple question. I have done projects using both approaches, and overall I've been happier working with the "smart model" approach, where the model knows a lot about how to do things with its own data.

One of the principles that leads to good encapsulation is "tell, don't ask" -- if you're telling the class to do something to itself, then no-one except the class itself needs to know the details of the class's internal representation. That's definitely a good thing. Also, I find that putting logic into the class itself often leads to easier code reuse -- when someone else goes to use that class, they will quickly discover that it already knows how to perform a given operation.

However, I don't want that to lead to breaking the boundaries between my application layers. I don't want a business object to know how to represent itself as HTML -- that's a presentation concern. So in this case, I would say that the class should know how to represent itself in some canonical way, but it shouldn't know about network stuff. The actual sending function should belong to a service.
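Roughly what I have in mind, sketched in Python just for illustration. Every name here (`WeatherReading`, `to_canonical`, `exceeds_wind_limit`, `as_json`, `as_xml`, `DeliveryService`) is invented for the example. The model answers questions about its own data and exposes one canonical form; the JSON/XML shaping and the sending sit outside it.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict
from xml.etree.ElementTree import Element, SubElement, tostring


@dataclass
class WeatherReading:
    temperature_c: float
    wind_speed_kmh: float

    def exceeds_wind_limit(self, limit_kmh: float) -> bool:
        """Tell, don't ask: callers never touch wind_speed_kmh directly."""
        return self.wind_speed_kmh > limit_kmh

    def to_canonical(self) -> Dict[str, float]:
        """The one representation the model itself owns: field names to values."""
        return asdict(self)


# Presentation concerns: free functions that only see the canonical form.
def as_json(canonical: Dict[str, float]) -> str:
    return json.dumps(canonical, sort_keys=True)


def as_xml(canonical: Dict[str, float], root_tag: str = "reading") -> str:
    root = Element(root_tag)
    for name, value in sorted(canonical.items()):
        SubElement(root, name).text = str(value)
    return tostring(root, encoding="unicode")


class DeliveryService:
    """Hypothetical service: combines a presenter with a transport."""

    def __init__(
        self,
        transport: Callable[[str], None],
        present: Callable[[Dict[str, float]], str],
    ) -> None:
        self._transport = transport  # stand-in for the real network call
        self._present = present

    def send(self, reading: WeatherReading) -> None:
        self._transport(self._present(reading.to_canonical()))
```

If one consumer needs a slightly different XML shape, that becomes another presenter function passed to `DeliveryService`; `WeatherReading` stays as it is.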

JacobM
great post! +1.
Atømix
When you say, "the class should know how to represent itself in some canonical way", do you mean some OTHER canonical way than its properties?
Mark
@Mark -- I'm not sure. It might make sense for a class to know how to serialize itself to XML, for example, but in the real world I've often found that I need slightly different XML representations for different situations, and that seems more like a presentation concern.
JacobM
@JacobM: Presentation indeed shouldn't be, in general, the concern of the class holding the data. If your XML-serialization is presentation-related, I'd consider putting it elsewhere...
Mark
A: 

The short answer is it depends on what downstream impacts you want to deal with.

In general, when there are two ways of doing something, it usually means that both ways have their merits. Several examples spring to mind (SQL vs. NoSQL, Automatic vs. Manual Transmission, Alternating vs. Direct Current, Client Side vs. Server Side, etc.). The result is that you're bound to get lots of people on both sides with opinions that have merit.

So the question you raise is: when can an object manipulate its own data, and when do I need to separate that out?

Personally I prefer to keep my data structures simple, with their primary responsibility being to keep the data consistent. Manipulating or using this data is the responsibility of other classes. This tends to help me separate policy from implementation. For example, if I want to implement a caching policy, I only have to visit the layer that gets the data and not the objects that store or manipulate the data.
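A rough sketch of what I mean, in Python for illustration only. `WeatherReading`, `CachingReadingSource` and the `fetch` callable are hypothetical names. The data class does nothing except keep itself consistent, and the caching policy lives entirely in the layer that gets the data.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class WeatherReading:
    """Simple data structure: its only job is to stay consistent."""
    station_id: str
    humidity_pct: float

    def __post_init__(self) -> None:
        # Consistency rule (an assumption for the example): humidity is a percentage.
        if not 0.0 <= self.humidity_pct <= 100.0:
            raise ValueError("humidity_pct must be between 0 and 100")


class CachingReadingSource:
    """Hypothetical data-access layer: the only place that knows about caching."""

    def __init__(self, fetch: Callable[[str], WeatherReading]) -> None:
        # `fetch` stands in for the slow call (database, network, file).
        self._fetch = fetch
        self._cache: Dict[str, WeatherReading] = {}

    def get(self, station_id: str) -> WeatherReading:
        # Policy: fetch once per station, then serve from memory.
        if station_id not in self._cache:
            self._cache[station_id] = self._fetch(station_id)
        return self._cache[station_id]
```

Changing the policy (expiry, size limits, no caching at all) means editing `CachingReadingSource` only; `WeatherReading` and anything that consumes it are untouched.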

On the other hand, this does make the API harder to use, since it is not always obvious where stuff is. It also creates the likelihood that the same policy is created in multiple locations (and every layer ends up implementing some caching).

For example, if String methods like Split, Join and Substring weren't easy to find in the String class and were instead someplace else, say a hypothetical Parse class, it's likely that before I found this hypothetical Parse class I would have written multiple crappy versions of those methods. A real-life example of this is when people write methods identical to those in the Math class because they don't know about it.

In the end, if you don't want to deal with the downstream impact of changing how the Send method works, which may mean visiting a lot of classes, then move it outside of the classes.

If you don't want to deal with people accidentally implementing their own Send method, and you don't want to reinforce it all the time, then it's better to put it inside the class.

Conrad Frix