views:

44

answers:

3

My application loads lots of data from a database into a complex data structure. The in-memory data structure ressembles the structure of the database, which means that if the database contains the following tables:

  • table A, key is A1
  • table B, key is B1, one of the columns is a foreign key to [the key of] table A
  • table C, key is C1, one of the columns is a foreign key to [the key of] table B

Then I have classes A, B and C, and:

  • a data member of B (B::m_a) is a pointer to A
  • a data member of C (C::m_b) is a pointer to B

This implies that if I load the database, that I have to load it in the correct order. If I first load C, then it will complain that it cannot set the value C::m_b because the instance where it should point to was not loaded.

Problem is when there is also a column in A that is a foreign key to one of the other tables, let's say C.

I could solve the problem by loading all foreign keys as strings, and then perform a lookup after all the data has been loaded, but since I sometimes have to load millions of records, I can't afford to spend memory on these (albeit temporary) strings.

Having read about good design (e.g. the book "Large Scale C++ Software Design") it seems to me that it's a bad idea to have circular references at all. E.g. if file X.H includes Y.H, but Y.H also includes X.H you probably have a bad design; if class X depends on class Y and vice versa you probably have a bad design, which should be solved by extracting this dependency and introducing a third class Z, which depends on X and Y (X and Y won't depend on eachother anymore).

Is it a good idea to also extend this design-rule to database design? In other words: preventing circular references in foreign keys.

+1  A: 

Yes, cyclical dependencies in databases are a good excuse to rethink the design.

Nathon
+2  A: 

The only time you should need a circular reference is when you are creating a hierarchical structure, such as an organizational tree.

Table Employees
   EmployeeID   <----------|
   SupervisorEmployeeID ---|
Robert Harvey
A: 

From a data modelling perspective there is nothing fundamentally "wrong" with a circualr dependency. It doesn't mean the model is wrong.

Unfortunately most SQL DBMSs cannot effectively implement such constraints because they don't support multiple table updates. Usually the only way around this is to suspend one or more constraints temporarily (for instance using a "deferrable" foreign key or similar features) or by altering the model to make some part of the constraint optional (putting one of the referencing columns into a new table). This is just a workaround for a nasty limitation of SQL however, it doesn't mean you did anything wrong to start with.

dportas