What are the purposes of having private/protected members of a class/structure in object-oriented programming? What's the harm in having all members be public?

+18  A: 

Encapsulation. I.e. hiding the implementation of your class data. This allows you to change it later, without breaking all client code. E.g. if you have

class MyClass {
    public int foo;
}

your clients may write code like

MyClass bar = new MyClass();
bar.foo++;

Now, if you realize that foo should actually be a double rather than an int, you change it:

class MyClass {
    public double foo;
}

and any client code that relied on foo being an int (e.g. int count = bar.foo;) fails to compile :-(

With a well designed interface, the change of the internals (private parts) may even include turning a member variable into a calculation or vice versa:

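class Person {
    // Private state: free to change without affecting callers
    private String name, streetAddress, zipCode, countryCode;

    public String getName() { return name; }
    public String getStreetAddress() { return streetAddress; }
    public String getZipCode() { return zipCode; }
    public String getCountryCode() { return countryCode; }
    @Override public int hashCode() { return java.util.Objects.hash(name, streetAddress, zipCode, countryCode); }
}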

(Using String properties for the sake of simplicity; in a real-world design some of these would probably deserve their own type.)

With this design, you are free to e.g. introduce an Address property internally, which would contain street address, zip code and country code, and rewrite your accessors to use the fields of this private member instead, without your clients noticing anything.
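A minimal sketch of that refactoring, assuming a hypothetical nested Address class and a four-argument constructor (neither appears in the interface above). The public getters keep their signatures, so callers should not notice the change:

```java
// Sketch only: Address and the constructor are illustrative assumptions.
class Person {
    // Internal detail: client code never sees this type.
    private static final class Address {
        final String street;
        final String zipCode;
        final String countryCode;

        Address(String street, String zipCode, String countryCode) {
            this.street = street;
            this.zipCode = zipCode;
            this.countryCode = countryCode;
        }
    }

    private final String name;
    private final Address address;

    Person(String name, String street, String zipCode, String countryCode) {
        this.name = name;
        this.address = new Address(street, zipCode, countryCode);
    }

    // Same public accessors as before; only their bodies changed.
    public String getName() { return name; }
    public String getStreetAddress() { return address.street; }
    public String getZipCode() { return address.zipCode; }
    public String getCountryCode() { return address.countryCode; }
}
```

Client code such as person.getZipCode() compiles and behaves the same before and after the private fields are regrouped.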

You could also decide freely whether to calculate the hash code every time, or to cache it in a private variable to improve performance. If that cache field were public, however, anyone could change it, which could ruin hash map behaviour and introduce subtle bugs.

So encapsulation is key to guaranteeing the consistency of your object's internal state. E.g. in the above example, your setters can easily validate the zip code and country code, to prevent setting invalid values. You can even ensure that the zip code format is valid for the actual country, that is, enforce a validity criterion spanning multiple properties. With a well designed interface, you can enforce this binding by e.g. providing only a setter to set both properties at the same time:

    public void setCountryCodeAndZip(String countryCode, String zipCode);

However, with public fields you simply don't have these choices.
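As a rough sketch of such a combined setter: the class name PostalAddress, the default values and the two zip patterns below are assumptions made for illustration, and real postal rules are more involved.

```java
import java.util.regex.Pattern;

// Sketch only: class name, defaults and formats are illustrative assumptions.
class PostalAddress {
    private static final Pattern US_ZIP = Pattern.compile("\\d{5}(-\\d{4})?");
    private static final Pattern DE_ZIP = Pattern.compile("\\d{5}");

    private String countryCode = "US";
    private String zipCode = "00000";

    public String getCountryCode() { return countryCode; }
    public String getZipCode() { return zipCode; }

    // Both values change together, so the stored pair is always consistent.
    public void setCountryCodeAndZip(String countryCode, String zipCode) {
        if (!isValid(countryCode, zipCode)) {
            throw new IllegalArgumentException(
                "zip code " + zipCode + " is not valid for country " + countryCode);
        }
        this.countryCode = countryCode;
        this.zipCode = zipCode;
    }

    private static boolean isValid(String countryCode, String zipCode) {
        if (countryCode == null || zipCode == null) {
            return false;
        }
        switch (countryCode) {
            case "US": return US_ZIP.matcher(zipCode).matches();
            case "DE": return DE_ZIP.matcher(zipCode).matches();
            default:   return false; // unknown country: rejected in this sketch
        }
    }
}
```

Because the fields are private and validation happens before assignment, a rejected call leaves the previous valid pair untouched.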

A special use case for private fields is immutable objects; this is very common in e.g. Java, where examples are String and BigDecimal. These classes have no public setters at all, which guarantees that their objects, once created, will not change their state. This enables a lot of performance optimizations and makes them easier to use in e.g. multithreaded programs, ORM etc.
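A small immutable class might look like the sketch below. Money and its methods are hypothetical names chosen for illustration; they are not part of the JDK.

```java
// Sketch only: Money is a hypothetical example class.
final class Money {
    // private + final: assigned once in the constructor, never changed afterwards
    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    public long getCents() { return cents; }
    public String getCurrency() { return currency; }

    // "Modification" returns a new object and leaves this one untouched.
    public Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(cents + other.cents, currency);
    }
}
```

Since no method can alter an existing Money, instances can be shared between threads without synchronization.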

Péter Török
How might it break the code?
wrongusername
@wrongusername - changing the arguments or return type of a public method would break any code that might call that method, or any child class that overrides it. With private methods you know you can change them at will without having to worry that any other classes might be using them.
Eric Petroelje
@wrongusername: see my answer, I answered your comment while you were writing it
Lord Torgamus
Thanks for adding more content to your answer Peter!
wrongusername
+6  A: 

You may want to read the Information Hiding topic on wikipedia.

Essentially, private members allow a class to hide its implementation details from external consumers. This allows a class to better control how its data and behavior will be expressed, and allows the consumer to be ignorant of details that are not relevant to the primary purpose of the class.

Hiding implementation details improves the maintainability of a program by preventing code external to the class from establishing dependencies on those details. This allows the implementation to change independently of external consumers, with a reduced risk of breaking existing behavior. When private implementation details become public, they cannot be changed without the possibility of breaking consumers of the class that depend on those details.

Private members also allow a class to protect its implementation from external abuse. Typically, the state of a class has internal dependencies which define when the state is valid and when it is not. We can consider the rules that govern the validity of state information to be invariants, meaning that the class always expects them to be true. Exposing private details allows external code to modify this state in a way that may violate the invariants, and therefore compromise the validity (and behavior) of the class.
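To make the idea of an invariant concrete, here is a hedged sketch using a hypothetical IntRange class whose invariant is low <= high. The class and its methods are illustrative assumptions only.

```java
// Sketch only: IntRange is a hypothetical class with the invariant low <= high.
class IntRange {
    private int low;
    private int high;

    IntRange(int low, int high) {
        setBounds(low, high);
    }

    // The only way to change the state; it refuses to break the invariant.
    public void setBounds(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.low = low;
        this.high = high;
    }

    // Relies on the invariant: the result is never negative.
    public long length() {
        return (long) high - low;
    }

    public boolean contains(int value) {
        return low <= value && value <= high;
    }
}
```

If low and high were public, any caller could assign low = 9 and high = 1 directly, and length() would start returning negative values.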

An additional benefit of information hiding is that it reduces the surface area that consumers of the class have to understand in order to properly interact with the class. Simplification is generally a good thing. It allows consumers to focus on understanding the public interface, and not how the class achieves its functionality.

LBushkin
+5  A: 

Nicely explained in Section 7.4: Protect your Private Parts of this online C++ tutorial.

Why bother with this stuff?

Specifiers allow a class to be very complex, with many member functions and data members, while having a simple public interface that other classes can use. A class which has two hundred data members and one hundred member functions can be very complicated to write; but if there are only three or four public member functions, and the rest are all private, it can be easy for someone to learn how to use the class. He only needs to understand how to use a small handful of public functions, and doesn't need to bother with the two hundred data members, because he's not allowed to access this data. He can only access the private data through the class' public interface. Without a doubt, in a small program, using these specifiers may seem unnecessary. However, they are worth understanding if you plan to do any program of reasonable size (more than a couple hundred lines). In general, it is good practice to make data members private. Member functions which must be called from outside the class should be public, and member functions which are only called from within the class (also known as "helper functions") should probably be private. These specifiers are especially useful in a large program involving more than one programmer.

The quote above explains how using private eases the learning curve. Here's an example which illustrates the "code breaking" aspect:

Here's a class ParameterIO which reads and writes a vector of integer parameters:

class ParameterIO
{
public:
    // Main member
    vector<int> *Params;
    string param_path;

    // Generate path
    void GeneratePath()
    {       
        char szPath[MAX_PATH];
        // %zu matches size_t; snprintf bounds the write to the buffer size
        snprintf(szPath,sizeof(szPath),"params_%zu.dat",Params->size());
        param_path = szPath;
    }

    // Write to file
    void WriteParams()
    {
        assert_this(!Params->empty(),"Parameter vector is empty!");
        ofstream fout(param_path.c_str());
        assert_this(!fout.fail(),"Unable to open file for writing ...");
        copy(Params->begin(),Params->end(),ostream_iterator<int>(fout,"\n"));
        fout.close();
    }

    // Read parameters
    void ReadParams(const size_t Param_Size)
    {
        // Get the path
        Params->resize(Param_Size);
        GeneratePath();
        // Read
        ifstream fin(param_path.c_str());
        assert_this(!fin.fail(),"Unable to open file for reading ...");
        // Read until the vector is full or extraction fails (e.g. at end of file)
        for(size_t i = 0; i < Params->size() && (fin>>(*Params)[i]); ++i) {}
        fin.close();
    }

    // Constructor
    ParameterIO(vector<int> * params):Params(params)
    {
        GeneratePath();
    }

    // Destructor
    ~ParameterIO()
    {
    }      

    // Assert
    void assert_this(const bool assertion, string msg)
    {
        if(assertion == false) 
        {
            cout<<msg<<endl;
            exit(1);
        }
    }
};

The following code breaks this class:

const size_t len = 20;
vector<int> dummy(len);
for(size_t i = 0; i < len; ++i) dummy[i] = static_cast<int>(i);
ParameterIO writer(&dummy);

// ParameterIO breaks here!
// param_path should be private because
// the design of ParameterIO requires a standardized path
writer.param_path = "my_cool_path.dat";
// Write parameters to custom path
writer.WriteParams();

vector<int> dunce;
ParameterIO reader(&dunce);
// There is no such file!
reader.ReadParams(len);
Jacob
@Jacob: nice answer, but could you edit it to specify what it is section 7.4 of?
Lord Torgamus
@Torgamus: Thanks for the feedback! Fixed that and threw in an example
Jacob
Good quote. But this made me laugh/cry: "A class which has two hundred data members and one hundred member functions...." Data hiding is the *least* problem of such a porcine class.
Wayne Conrad
+3  A: 

It really depends on your ideology. The idea is to hide information that shouldn't be exposed for some reason.

If you have a library you wish to publish online, lots of people will download it and some may use it in their code. If you keep your public API to a minimum and hide the implementation details, you'll have an easier time updating it when you encounter bugs or want to improve the code.

Also, in Java, for example, you have no way to restrict access to a member variable without changing its visibility, so you often find yourself prematurely creating getters and setters and making the variable itself private or protected. In Python, for example, that problem doesn't exist because you can make getters and setters behave like variables for direct access (they're called properties there).

Lastly, sometimes you need to have methods which require a consistent state to be useful and would lead to problems if accessed directly.

A rule of thumb is: if you expose something, someone will use it. And most often they'll use it for the wrong reasons (i.e. not how you intended it to be used). In this case Information Hiding is the equivalent of child locks on weapon cabinets.

Alan
+3  A: 

To add to Peter's answer, say your class stores a name, and you want to change it from using a single name string to a first name string and a surname string. If your members were public, other classes might read (or write) the name variable directly, and would break when that variable disappeared.

Not to mention that you might not want other classes to have the ability to edit your members at all.
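A sketch of that change, assuming a hypothetical Customer class: the single name field has already been replaced by two private fields, while getName() is kept so existing callers still compile.

```java
// Sketch only: Customer and its accessors are illustrative assumptions.
class Customer {
    // Previously: private String name;
    // The field was private, so no outside code could reference it directly.
    private String firstName;
    private String surname;

    Customer(String firstName, String surname) {
        this.firstName = firstName;
        this.surname = surname;
    }

    // Existing accessor keeps working by computing the old value.
    public String getName() {
        return firstName + " " + surname;
    }

    public String getFirstName() { return firstName; }
    public String getSurname() { return surname; }
}
```

Had name been a public field, every customer.name in other classes would have stopped compiling once the field was removed.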

Lord Torgamus
Thank you! I understand his and Eric's answers better now.
wrongusername
+3  A: 

Metaphorically, exposing private members as public is like having an option on your car's dashboard that allows you to tweak the engine oil pressure.

The car should manage that internally (privately), and the user should be shielded from messing with it directly (encapsulation), for obvious reasons.

Daniel Vassallo
+1  A: 

Sometimes you don't want to reveal private information to everybody. E.g. you don't want your age to be known to the public, but you may want to tell people whether you are above 25.
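In code, that could look like the following sketch, where Member and isOlderThan are hypothetical names used for illustration:

```java
// Sketch only: Member and isOlderThan are illustrative assumptions.
class Member {
    // The exact age is never exposed: there is no getter for it.
    private final int age;

    Member(int age) {
        if (age < 0) {
            throw new IllegalArgumentException("age must not be negative");
        }
        this.age = age;
    }

    // Answers a yes/no question without revealing the underlying value.
    public boolean isOlderThan(int years) {
        return age > years;
    }
}
```

A caller can ask member.isOlderThan(25) but has no single call that returns the stored number.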

fastcodejava
A: 

A short example: you may need to enforce certain conditions on a value. In this case, setting it directly may break such a condition.

Many people argue along the lines of "you may not want everybody to read it", but I think the constraint on setting a value is a more useful example.

Johann Philipp Strathausen
A: 

What are the purposes of having inner organs in the human body? What's the harm in having all organs outside?

Exactly!

The short answer would be: you need them (you can't live without them), but you can't expose them for everyone to modify and play around with, because that could kill you (cause your class not to function properly).

Leo Jweda
A: 

No harm at all depending on the audience and consumption of the class. Let me repeat that one more time so it sinks in.

No harm at all depending on the audience and consumption of the class.

For many small projects of one or two people that take a month or so, getting all of the private and public definitions down perfectly might increase the workload substantially. On large projects, however, where there may be multiple teams and the teams are not located together, getting the design of all of the public interfaces right at the start can greatly increase the likelihood of success of the project overall.

So you really have to look at how a class is going to be consumed and by whom before you can even begin to answer this question. Likewise, how long is the software development life-cycle going to be? Is it months? Years? Decades? Are there going to be other people besides you using the class?

The more "public" the class (i.e. the more people that will be consuming and using the class), the more important it is to nail down a solid public interface and stick to it.

Steven Noyes