Apologies if this has been asked before; I'm not quite sure of the terminology or how to ask the question.

I'm wondering if there are libraries or best practices for implementing object models in C++. If I have a set of classes where instances of these classes can have relations to each other and can be accessed from each other via various methods, I want to pick a good set of underlying data structures to manage these instances and their inter-relationships. This is easy in Java since it handles the memory allocation and garbage collection for me, but in C++ I have to do that myself.

HTML's Document Object Model (DOM) is one example; as another (contrived) example, suppose I have these classes:

  • Entity
  • Person (subclass of Entity)
  • Couple (subclass of Entity)
  • Property
  • House (subclass of Property)
  • Pet (subclass of Property)
  • Car (subclass of Property)

and these relationships:

  • Entity
    • has 1 home of class House
    • has 0 or more pets of class Pet
    • has 0 or more cars of class Car
    • has 0 or more children of class Person
  • Person
    • has 0 or 1 spouse of class Person
    • has 0 or 1 marriage of class Couple
    • has 0 or 1 parents of class Entity (in this model parents don't exist if they're not alive!)
  • Couple
    • has 2 members of class Person
  • Property
    • has 1 owner of class Entity

Now that I've thought out these objects and their relationships, I want to start making data structures and methods and fields to handle them, and here's where I get lost, since I have to deal with memory allocation and lifetime management and all that stuff. You can run into problems like the following: I might want to put an object into a std::map or a std::vector, but if I do that, I can't safely store pointers to those objects. A std::vector can relocate its elements when it grows, and although std::map elements stay put while they are in the map, a pointer to one still dangles as soon as that element is erased.

One approach I used when I was working with COM a lot is to have a hidden collection that contains everything. Each object in the collection has a unique ID (either a number or a name) by which it can be looked up, and each object has a pointer to the collection. That way, if an object wants to point to another object, instead of literally holding a pointer it stores the other object's ID and looks it up via the hidden collection. I can use reference counting to deal with lifetime issues automatically (except for disjoint cycles, and sometimes those aren't a problem).
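
A minimal sketch of that hidden-collection idea, assuming C++14. The names ObjectId, Registry and Person are made up for illustration, and the sketch leaves out the reference counting and the per-object pointer back to the collection:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical names throughout; this is not a real library API.
using ObjectId = std::uint64_t;

struct Person {
    ObjectId id = 0;
    std::string name;
    std::vector<ObjectId> children;  // relations are stored as IDs, not pointers
};

class Registry {
public:
    // Creates a Person owned by the registry and returns its immutable ID.
    ObjectId create(const std::string& name) {
        ObjectId id = next_id_++;
        assert(objects_.find(id) == objects_.end());  // IDs are never reused
        auto p = std::make_unique<Person>();
        p->id = id;
        p->name = name;
        objects_[id] = std::move(p);
        return id;
    }

    // Returns nullptr if the ID no longer refers to a live object.
    Person* find(ObjectId id) const {
        auto it = objects_.find(id);
        if (it == objects_.end()) return nullptr;
        return it->second.get();
    }

    // Returns true if an object with that ID existed and was destroyed.
    bool destroy(ObjectId id) { return objects_.erase(id) > 0; }

private:
    ObjectId next_id_ = 1;
    std::map<ObjectId, std::unique_ptr<Person>> objects_;
};
```

A stale ID then shows up as a null result from find() rather than as a dangling pointer, at the cost of one lookup per access.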

Are there other approaches? Or are there libraries to make this kind of stuff easier in C++?

edit: then you have other issues: the relationships between objects are likely to be mutable in many cases, and you have to think ahead about how references to objects should be stored, and what methods should be provided for accessing objects from each other. For example, if I have a handle to a Person X, and I want to represent the concept of "find X's child named George", then I have to store the name "George" rather than a child number: the children may be stored in a vector, and I may be able to call X.getChildCount() and X.getChild(0), but "George" may not always be child number 0, since other children may be inserted before "George" in the child vector. Or X may have two or three or four other children also named "George". Or "George" may change his name to "Anthony" or "Georgina". In all these cases it is probably better to use some kind of unique immutable ID.

edit 2: (and I'll clean up my question a bit once I get this straightened out) I can deal with the choice of methods and property names, I can deal with whether to use a map or a list or a vector. That's fairly straightforward. The problems I'm trying to deal with specifically are:

  • how to have one object store a reference to another object, when those objects may be part of data structures that are reallocated
  • how to deal with object lifetime management, when there are reciprocal relationships between objects
+1  A: 

There are about a million (at a conservative estimate) approaches to this. You are really asking "how do I design software in C++". And the answer is, I'm afraid, "what is your software going to do?" - simply knowing that you want to deal with Persons and Houses is not enough.

anon
You have a point. Like I said, I'm groping around trying to find the right question to ask. My main concern is with memory management.
Jason S
The approach you take to memory management is intimately affected by the constraints on your software - i.e. what it does.
anon
A: 

Isn't this the whole point of OOP? What you are asking about is an implementation detail that you hide behind the public interface of these classes, so you do not have to worry about it: you can change it without changing the interface. So go ahead, try it the way that you suggest. Then if there is a performance, memory or other problem, you can fix the implementation without breaking the rest of your code.

It looks to me that storing your data in a database, and using some sort of object-relational mapping, might be another option to look at.

bobmcn
A: 

Jason, my favorite source on this is the C++ FAQ Book. The problem is you're effectively asking "how can I use C++ for object oriented programming?"

The best I can say in an SO answer is this:

All these things are going to be classes in C++, and the relationships etc. will look a lot like those in the garbage-collected languages you're used to: if you need a relationship between a Person and his child named "George", you pick a data structure that can store Persons or children indexed by name.

Memory management is actually easier than in straight C, if you follow some rules: make sure all the objects that need them have destructors, make sure the destructors clean up everything the object owns, and then make sure that you always put these dynamically constructed objects in contexts where they go out of scope when they're no longer needed. Those won't cover all cases, but they will save you from probably 80 percent of memory allocation mistakes.
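
As a rough sketch of those rules, assuming C++11: the Owner and Pet classes below are hypothetical, and the live_count counter exists only to make the cleanup observable. The destructor of Owner deletes everything it owns, and the Owner itself lives in a scope that ends when it is no longer needed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical classes, for illustration only.
class Pet {
public:
    explicit Pet(const std::string& name) : name_(name) { ++live_count; }
    ~Pet() { --live_count; }
    Pet(const Pet&) = delete;
    Pet& operator=(const Pet&) = delete;

    const std::string& name() const { return name_; }
    static int live_count;  // number of Pet objects currently alive
private:
    std::string name_;
};
int Pet::live_count = 0;

class Owner {
public:
    Owner() = default;
    // The destructor cleans up everything this object owns.
    ~Owner() { for (Pet* p : pets_) delete p; }
    Owner(const Owner&) = delete;             // copying would lead to double deletes
    Owner& operator=(const Owner&) = delete;

    void adopt(const std::string& name) {
        std::unique_ptr<Pet> pet(new Pet(name));  // guards the Pet if push_back throws
        pets_.push_back(pet.get());
        pet.release();                            // the Owner is responsible for it now
    }
    std::size_t petCount() const { return pets_.size(); }
private:
    std::vector<Pet*> pets_;
};

// Usage sketch: the pets are destroyed when the owner goes out of scope.
int petsAliveAfterScope() {
    {
        Owner owner;
        owner.adopt("Rex");
        owner.adopt("Tom");
        assert(Pet::live_count == 2);
    }
    return Pet::live_count;
}
```

Holding std::unique_ptr<Pet> elements in the vector would give the same cleanup without a hand-written destructor; the explicit version is shown only because it matches the rules as stated.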

Charlie Martin
A: 

You could use boost::shared_ptr to address the memory issues. You can then freely copy the shared_ptr around, return it from functions, use it as a local variable, etc.

A Person could then have a std::map< string, boost::shared_ptr<Person> >, so X.getChild("George") would simply look up the child in the map and return the pointer. I think you get the concept, so I'll leave the rest as an exercise to you ;)
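
A hedged sketch of what that could look like. It assumes C++11 and uses std::shared_ptr in place of boost::shared_ptr (the calls used here have the same shape in both); addChild is a made-up helper. It stores only parent-to-child links, because a child holding a shared_ptr back to its parent would form a reference cycle that is never freed.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

class Person {
public:
    explicit Person(const std::string& name) : name_(name) {}
    const std::string& name() const { return name_; }

    // Hypothetical helper: registers a child under its current name.
    void addChild(const std::shared_ptr<Person>& child) {
        assert(child);  // precondition: not an empty pointer
        children_[child->name()] = child;
    }

    // Returns an empty pointer when no child has that name.
    std::shared_ptr<Person> getChild(const std::string& name) const {
        auto it = children_.find(name);
        if (it == children_.end()) return std::shared_ptr<Person>();
        return it->second;
    }

private:
    std::string name_;
    std::map<std::string, std::shared_ptr<Person>> children_;
};
```

Keying the map by name keeps the lookup simple, but it inherits the limits raised in the question: two children named "George" cannot coexist in it, and a rename has to update the key.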

tstenner
Using shared_ptr is almost guaranteed to create cycles. Parent points to child, child points to parent is a classic case.
Mark Ransom
exactly: that's the problem I'm trying to solve.
Jason S
+2  A: 

You wrote about storing objects from your object model inside std::vector etc. and problems with using pointers to them. That reminds me that it's good to divide your C++ classes into two categories (I'm not sure about terminology here):

  1. Entity classes, which represent objects that are part of your object model. They are usually polymorphic or potentially will be in the future. They are created on the heap and are always referenced by pointers or smart pointers. You never create them directly on the stack or as class/struct members, and you never put them directly in containers like std::vector. They have neither a copy constructor nor operator= (you can make a new copy with some Clone method). You can compare them (their states) if it's meaningful to you, but they are not interchangeable because they have identity. Any two objects are distinct.

  2. Value classes, which implement primitive user-defined types (like strings, complex numbers, big numbers, smart pointers, handle wrappers, etc.). They are created directly on the stack or as class/struct members. They are copyable with a copy constructor and operator=. They are not polymorphic (polymorphism and operator= don't work well together). You often put copies of them inside STL containers. You rarely store pointers to them in independent locations. They are interchangeable. When two instances have the same value, you can treat them as the same. (The variables that contain them are different things though.)
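
The two categories might be sketched like this, assuming C++11. Property and House come from the question; Address, clone() and duplicate() are hypothetical names chosen for the example:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Value class: copyable, non-polymorphic, compared by value.
struct Address {
    std::string street;
    int number;
};
inline bool operator==(const Address& a, const Address& b) {
    return a.street == b.street && a.number == b.number;
}

// Entity class: polymorphic, heap-allocated, non-copyable, has identity.
class Property {
public:
    explicit Property(const Address& address) : address_(address) {}
    virtual ~Property() = default;
    Property(const Property&) = delete;
    Property& operator=(const Property&) = delete;

    virtual std::string kind() const = 0;
    virtual std::unique_ptr<Property> clone() const = 0;  // explicit copy only
    const Address& address() const { return address_; }
private:
    Address address_;  // a value class stored directly as a member
};

class House : public Property {
public:
    using Property::Property;  // reuse the base class constructor
    std::string kind() const override { return "House"; }
    std::unique_ptr<Property> clone() const override {
        return std::unique_ptr<Property>(new House(address()));
    }
};

// Copies an entity explicitly; the copy has equal state but its own identity.
std::unique_ptr<Property> duplicate(const Property& original) {
    std::unique_ptr<Property> copy = original.clone();
    assert(copy.get() != &original);
    return copy;
}
```

Here an Address can be copied and compared freely, while a House can only be duplicated on purpose through clone(), and the duplicate is a separate object.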

There are many very good reasons to break the above rules. But I have observed that ignoring them from the start leads to programs that are unreadable, unreliable (especially when it comes to memory management) and hard to maintain.


Now back to your question.

If you want to store a data model with complex relationships and an easy way to do queries like "find X's child named George", why not consider an in-memory relational database?

Notice that if you want to efficiently implement a) more complex bidirectional relationships and b) queries based on different object properties, then you will probably need to create indexed data structures that are very similar to what a relational database uses internally. Are your implementations (as there will be many in a single project) really going to be more efficient and robust?

The same goes for "collections of everything" and object IDs. You'll need to track relationships between objects anyway, to avoid IDs without objects. How is that different from pointers? Other than getting meaningful errors instead of going crazy all over the memory, that is ;-)


Some ideas for memory management:

  • Strong ownership: when you can declare that some entity lives only as long as its owner and there is no possibility of independently existing pointers to it, you can just delete it in the owner's destructor (or with scoped_ptr).

  • Someone already proposed shared_ptr. Smart pointers of this kind are great and can be used with STL containers. They are based on reference counting, though, so you must not create cycles :-(. I am not aware of any widely used C++ automatic pointers that can handle cycles.

  • Maybe there is some top-level object which owns all other objects. E.g. often you can say that all pieces belong to a document or an algorithm or a transaction. They can be created in the context of the top-level object and then deleted automatically when their top-level object is deleted (when you remove the document from memory or finish executing the algorithm). Of course you cannot share pieces between top-level objects.
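
A small sketch of the strong-ownership and top-level-owner ideas, assuming C++11. Document and Piece are hypothetical classes named after the example above, and std::unique_ptr is used where scoped_ptr was suggested:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

class Document;  // hypothetical top-level owner

class Piece {
public:
    Piece(Document& owner, const std::string& text) : owner_(&owner), text_(text) {}
    Document& owner() const { return *owner_; }
    const std::string& text() const { return text_; }
private:
    Document* owner_;   // non-owning back-pointer: the Document outlives its pieces
    std::string text_;
};

class Document {
public:
    Document() = default;
    Document(const Document&) = delete;             // pieces point back at this object,
    Document& operator=(const Document&) = delete;  // so it must not be copied or moved

    // Pieces are only ever created in the context of their document.
    Piece& addPiece(const std::string& text) {
        pieces_.push_back(std::unique_ptr<Piece>(new Piece(*this, text)));
        return *pieces_.back();
    }
    std::size_t pieceCount() const { return pieces_.size(); }
private:
    std::vector<std::unique_ptr<Piece>> pieces_;  // all destroyed with the Document
};

// Usage sketch: a reference to a piece stays valid while the vector grows,
// because only the unique_ptr handles are relocated, not the Piece objects.
bool pieceReferenceSurvivesGrowth() {
    Document doc;
    Piece& first = doc.addPiece("first");
    for (int i = 0; i < 100; ++i) doc.addPiece("filler");
    assert(doc.pieceCount() == 101);
    return first.text() == "first" && &first.owner() == &doc;
}
```

The reciprocal link (piece to document) is a plain pointer, so there is no ownership cycle: the document owns the pieces, and the pieces merely refer back to it.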

Tomek Szpakowicz