tags:

views:

231

answers:

10

Hi All

In another community, i have people suggesting that you should 'NEVER' have attributes in your base class. Instead it should have only pure virtual functions and let the derived class take responsibility to provide definition for all these methods.

My take on it is that 'It is not a rule of thumb'. I gave following trivial example:

class Shape
{
    protected:
        double Area;
        double Perimeter ;
    public:
        Shape():Area(0.0), Perimeter(0.0){}
        double getArea(){return Area;}
        double getPerimeter (){return Perimeter;}
        virtual void setArea () = 0 ;
        virtual void setPerimeter () = 0 ;
};

class Rectangle : public Shape
{
    private :
        double length ;
        double breadth ;
    public:
        Rectangle ():length(0.0),breadth(0.0){}
        Rectangle (int _length, int _breadth):length(_length),breadth(_breadth){}
        void setArea ()
        {
            Area = length * breadth ;
        }
        void setPerimeter ()
        {
            Perimeter = 2 * (length + breadth) ;
        }
} ; 

With the example given above, i feel that any shape has following attributes 'Area' and 'Perimeter' and hence if we do not provide these attributes in the class, the Shape class would not be a appropriate representation of 'Real world shape'

Please let me know your opinion on this.

EDIT:

The first thing which i would say is indeed my code is a naive one and doesn't make much sense w.r.t. the design. I thought earlier that i would add a line saying that this was just an example and the whole point of this post is to know if 'Base class should NEVER have attributes'. But then i thought that i'll get some good design suggestions too and i indeed have got many :)

Coming to the question, with all the posts below, i understand that its not a rule of thumb and it is design choice (which is exactly i wanted to emphasize). Having said that i also acknowledge that it would be desirable to NOT have attributes in the base class if these attributes can be calculated (or derived) as Area = Length * Breadth. Thanks a lot for all your answers (i wish i had option to accept more than one answer).

+9  A: 

It's more than acceptable to have variables in a base class. A lot of languages don't allow this, though, so a base class can only have virtual functions. This is probably why you've heard that.

There are a lot of different views on the issue, but in C++, I think it's fairly common for base classes to store data.

dauphic
yes i think the people suggesting that are confusing interfaces with base classes.
tenfour
@dauphic: +1 Thanks for the precise answer.
hype
+6  A: 

Not having data in a base class would require unnecessary duplication of identical code in the derived classes. Preventing code duplication was one of the concepts that Object-oriented design was built in.

James Curran
A: 

To answer your question.... yes, use attributes in your base class where they are applicable across all subclasses. For example, if you know that a shape (POLYGON!) always has length and width, use these as attributes of the base class.

I don't see why you would want to recreate these for every implementation of a shape (POLYGON) you create...

...and I suppose if you're OK with calling the circumference of your circle its perimeter, you can use the shape class.

baultista
Circumference _is_ a perimeter. It is absolutely appropriate for a `Shape` base class to use "perimeter". Perimeter is a path around an area. Circumference is a circular perimeter. Similarly, a Square is a Rectangle with equal dimensions. `Circumference extends Perimeter` :P
Brian S
+1  A: 

I think the recomondation comes from stylistic issues. It is considered cleaner to have a kind of protocol (interface) class as base that only publishes the methods of a class. The implementation should be hidden from the classes clients as much as possible. If you also store data in the base class the implementing classes are not free to chose the best way to store the data. That way the base class restricts the implementation posiblities. Maybe one implementation would like to calculate an allready stored value so the space would be not needed but there is no way for the interitors to free a (not dynamic) allocated memory.

So as a resume I would say it depends if you "just code" a thing or if you would like to create a huge library for 10 years of maintenance that requires a cleaner approach.

schoetbi
+13  A: 

I disagree with your example. While of course a Rectangle has an area and a perimeter, it doesn't make much sense to store them separately from from the length and breadth, when you could just calculate them on the fly in the getters. Why force people to call these "Set" functions in order to get accurate results, when you can provide a much nicer interface?

I would make Shape a pure interface:

struct Shape
{
    virtual ~Shape(); // probably need this
    virtual double getArea () = 0 ;
    virtual double getPerimeter () = 0 ;
};

class Rectangle : public Shape
{
    double length ;
    double breadth ;
public:
    Rectangle ():length(0.0),breadth(0.0){}
    Rectangle (int _length, int _breadth):length(_length),breadth(_breadth){}
    double getArea ()
    {
        return length * breadth ;
    }
    double getPerimeter ()
    {
        return 2 * (length + breadth) ;
    }
};

i feel that any shape has following attributes 'Area' and 'Perimeter' and hence if we do not provide these attributes in the class, the Shape class would not be a appropriate representation of 'Real world shape'

I don't agree with this either. The Shape class should have a public API which provides access to its area and perimeter. Whether or not it has fields containing those values is an implementation detail, and has no bearing on whether it represents a shape to the outside world.

Providing the outside world with a nice interface is, say, 85% of class design. The other 15% is making sure that the interface that suits the caller is actually feasible to implement. Making the internal implementation details follow the same pattern as the external interface, to the extent that every property of the object returned by a function necessarily is stored in a field, isn't just low priority, it's pointless.

If you had a circle, you might conceivably have getRadius and getDiameter functions, but you wouldn't store them in different fields, despite the fact that of course circles have a radius and also a diameter.

Now, if this weren't a Rectangle, but some very complicated shape where calculating area and perimeter is slow, then you might want to cache the area and perimeter in the object, (and either recalculate them or mark the cache stale whenever the defining parameters change). But that's a particular property of complicated shapes, nothing to do with simple shapes like Rectangles, and so doesn't really belong in the base class Shape. Base classes should only contain things which are common to all derived classes, or at least common to enough of them that you're willing to act as though it's common to all (and override or ignore those things in the subclasses they don't apply to). In your example, the ability to report area and perimeter are common to all shapes. The fact of storing area and perimeter in data members need not be common to all shapes, and in the case of Rectangle occupies extra memory and requires extra code for pretty much no benefit.

you should 'NEVER' have attributes in your base class

I don't agree with this either, although I think I appreciate where they might be going with it.

Fields which a Shape base class might reasonably have include position co-ordinates and orientation (if we're talking about actual rectangles etc to be displayed on screen, rather than rectangles in the abstract), colour, tags (if we're firmly committed to crowd-sourced taxonomies, and probably inherited from another base class rather than defined in Shape itself), and that kind of thing, that will be handled the same way irrespective of what derived class is involved.

That said, inheritance of implementation is over-used (because it often appears misleadingly easy), but personally I think it's sometimes worthwhile. In the case of shapes, you can get into all kinds of trouble trying to define a class hierarchy of geometric figures.

It's not too bad if you stick to immutable objects. A Square is a Rectangle (whose width and breadth happen to be equal), so it can be a subclass of Rectangle. But then a month later you want to introduce generic Quadrilaterals. Where do you have to add the new class? Oops - at the top of the class hierarchy, where it potentially disturbs both your existing classes. Adding quadrilaterals has made you re-think squares. All else being equal you'd like that kind of wide-ranging change to be optional, not force it on yourself.

What about mutable objects, though? A mutable Square is not a mutable Rectangle, because mutable rectangles should provide the ability to set the width to 2 and the height to 3. Squares should not provide this ability. Neither is a mutable Rectangle a mutable Square (since mutable Squares should have the property that if you change the width, the height changes. Mutable Rectangles should not have this property). So mutable rectangles whose width and height happen to be equal, aren't the same thing as mutable squares at all. The classes can't be related by inheritance. Now, should Rectangle(2,2) == Square(2) be true or false? Whichever we decide, we're going to surprise somebody. So we find we have to leave them as unrelated, incomparable classes. That is to say, don't use inheritance.

So pure interfaces are good where possible, and some people have decided that these kinds of problem are so common that other kinds of inheritance just aren't worth it. The Go programming language does not have inheritance. As an aside, many people have also decided to avoid mutable objects, and done quite well out of it, but that's not the question....

I think there are plenty of cases where, as an implementation detail, it is convenient to have common functionality in base classes. Just don't think this common behaviour is truly intrinsic to your base class. It's implementation, not interface, and some day you might define a derived class that doesn't want to use any of it, because some other implementation is more appropriate. Where you can conveniently put the common behaviour in embedded objects rather than base classes, do so.

Steve Jessop
+1 ! Inheritance does not follow the Single Responsibility property since it's used both for implementation and interface... that's why it's messed up I think and I rather like the Go idea there (and their duck-typing). Haskell's pretty nice on that point too.
Matthieu M.
Accepting this answer for the extra bit of information and explanation from different perspectives.
hype
+1  A: 

The users of a class can extend it in ways you never thought about. For example, imagine a rectangle with no top (just the other 3 sides). I'd call that a shape, even though it'd make no sense to talk about its area, and even its perimeter might be ill-defined. Why should i have to include code to set properties that make no sense? And what's more, what happens to code that expects those properties to mean something?

cHao
In that case, I'd make a `Curve` base class. While it's true that curves and lines are technically shapes, most people only refer to closed spaces as 'shape'.
Brian S
"Most" isn't all, and like you said, i'm not wrong. A base class should only have stuff in it that makes sense for all subclasses, not just the ones you think there should be. Til a Shape is *defined* as being a closed space, "perimeter" and "area" don't make sense for them and shouldn't need to be there.
cHao
+1  A: 

You need to separate interface from implementation. Yes, a shape has both area and perimeter, and yes you should provide a way to query those values, but that's it. You should not tell a shape how to to do this.

Note in your example you've broken invariants; a rectangle of a certain length and width can have its area publically changed from under its feet. (Because it has not yet been recalculated) A mess! Only define the interface to the attributes:

class shape
{
    public:
        ~shape(){}

        virtual double area(void) const = 0;
        virtual double perimeter(void) const = 0;
};

And let the shapes return that information in their own way:

class square : public shape
{
public:
    square(double pSide) : mSide
    {}

    double area(void) const
    {
        return mSide * mSide;
    }

    double perimeter(void) const
    {
        return 4 * mSide;
    }

private:
    double mSide;
};

However, let's say those things were indeed expensive to calculate, and you wanted to cache those attributes after a shape-specific property changed. You might introduce a mixin, which is just a helper base class. This particular once uses the CRTP:

template <typename Shape>
class shape_attribute_cache : public shape
{
public:
    // provide shape interface
    double area(void) const
    {
        return cached_area();
    }

    double perimeter(void) const
    {
        return cached_perimeter();
    }

protected:
    shape_attribute_cache() : mArea(0), mPerim(0), mChanged(false) {} 
    ~shape_attribute_cache(){} // not to be used publically 

    double cached_area(void) const { update_area(); return mArea; }
    void cached_area(double pArea) { mArea = pArea; }

    double cached_perimeter(void) const { update_perimeter(); return mPerim; }
    void cached_perimeter(double pPerim) { mPerim = pPerim; }

    void changed(void) { mAreaChanged = true; mPerimChanged = true; }

private:
    void update_area(void)
    {
        if (!mAreaChanged) return;

        // delegate to derived shapes update method
        static_cast<Shape*>(this)->update_area();
        mAreaChanged = false;
    }

   void update_perimeter(void)
    {
        if (!mPerimChanged) return;

        // delegate to derived shapes update method
        static_cast<Shape*>(this)->update_perimeter();
        mPerimChanged = false;
    }

    double mArea;
    double mPerim;
    bool mAreaChanged;
    bool mPerimChanged;
};

// use helper implementation
class expensive_shape : public shape_attribute_cache<expensive_shape>
{
public:
    // ...

    void some_attribute(double pX)
    {
        mX = pX;
        changed(); // flag cache is bad, no longer callers responsibility
    }

private:
    // ...

    void update_area(void)
    {
        double newArea = /* complicated formula */;
        cached_area(newArea);
    }

    void update_perimeter(void)
    {
        double newPerim = /* complicated formula */;
        cached_perimeter(newPerim);
    }
}; 

A shape remains purely an interface, but you've provided a helper shape interface which may be used but you are not forcing this implementation upon other shapes.

This is all in this instance, perhaps it was just a poor example. You should do what you need to get a clean design, and if that means introducing variables into the base class, then do so. (Standard library streams do this, and then derived classes change virtual functions without changing the "real" functions; I do this in some of my code.)

GMan
+1  A: 

It's fine to have data in the base class. In the interests of data-hiding and OO-design though it should almost never be public or protected (which is just a fancy way of saying "anyone can mess up my data"). For example I'd rework your sample as something like this:

class Shape
{
    private:
        double Area;
        double Perimeter ;
    public:
        Shape(double the_area, double the_perimeter): Area(the_area), Perimeter(the_perimeter){}
        double getArea() const {return Area;}
        double getPerimeter () const {return Perimeter;}
};

class Rectangle : public Shape
{
    private :
        double length ;
        double breadth ;
    public:
        Rectangle (int _length, int _breadth):Shape(length * breadth, 2 * (length + breadth)), length(_length),breadth(_breadth){}
} ; 
Mark B
A: 

The people in that other community of yours seem to be confusing interfaces with base classes or abstract classes. An interface is what they describe as a list of pure virtual functions, however, this isn't the same as an interface.

You could say that an interface defines what a class should be able to do (i.e. which publically accessible functions it has to have), whereas a 'base class' is a partial implementation of that, usually in cases where its subclasses have the same implementation of a certain function or need to store the same information.

Take your example. The Shape class could be an interface, defining just methods to get the area and perimeter which all Shapes (= classes that implement Shape) have to have.

After that, you need to define two shapes: a Rectangle and a Square. A Square has just one property, width, which is also its height. A Rectangle has both a Width and a Height. Now, you could implement your Rectangle as a separate class that has a Height and Width, but then you'd have two classes with a Width - Square and Rectangle. Instead of that, you can subclass Square, add the Height, and you have a Rectangle.

Probably not the best of examples, but eh.

Cthulhu
+2  A: 

Generally speaking, whenever someone tells you to NEVER do something, smart money says he's either green or oversimplifying the issue.

Your purpose as a programmer is to find the simplest way to solve a problem (this often is not the same as the easiest way!). If you have to take big structural pains to satisfy a "best practice" design principle, there's a good chance it's not best practice in that particular case. Moreover, performance considerations may give implementations that a CS professor would scoff at an edge, although these should be well documented and applied judiciously.

As for the particular advice...software design largely concerns itself with interfaces. The way your class is used by other code should be as clean as possible, and it should be designed so that implementation details can be changed without requiring changes in the client code.

Read that sentence twice to let it sink in, it's very, very important. It means, among other things, that whoever uses your code should not have to care whether your base class has attributes or not. Public data members are usually a bad idea because they can't be swapped with calculations if future requirements demand it, but if your class has a member function to set/get the value, it makes no difference to client code whether it just sets/gets a data member or does something more sophisticated.

So, if your use case demands data members in the base class, go ahead. The only really important thing is to make the interface flexible enough to handle future additions that make these members a silly idea. Also, if that happens, you can move the original base class down a level in the inheritance tree and keep the common code for old leaf classes. For example,

class base {
public:
  base(int x);
  virtual ~base();

  inline int x() const { return x_; }

  // virtual interface declared here

private:
  int x_;
};

class derived : public base { ... };

may become

class base {
public:
  virtual ~base();
  virtual int x() const = 0;

  // ...
};

class base_with_static_x : public base {
public:
  base_with_static_x(int);
  virtual ~base_with_static_x();

  inline int x() const { return x_; }

  // ...

private:
  int x_;
};

class derived : public base_with_static_x { ... };

There's really very little sense in writing the same code over and over again.

Anonymous Coward