views:

255

answers:

4

I am trying to store a dependency tree in a PostgreSQL database. There are about 20,000 software items, each item can depend on several other items.

There are several types of dependencies (some are run-time dependencies, some are build-time dependencies and some are test-dependencies).

The dependency is recursive and each item only knows about the things it immediately depends on.

I'll need to list all the dependencies of an item and display them both as a tree and as a flattened list. I'll also need to answer "what depends on this item?"

What would be a recommended way to store this information to make fetching relatively easy?

A: 

I'd implement a simple many-to-many auto relationship.

Something like this:

 Software                Dependency
+------------+          +-----------------------+
| SoftwareId |          | SoftwareId            |
+------------+         /| DependsUponSoftwareId |
| Name       |--------|-+-----------------------+
| ...        |         \| ...                   |
+------------+          +-----------------------+ 
Paulo Santos
nice ascii-art ;-) but it shows one-to-many rather than many-to-many.
WildWezyr
The design will also require a one-to-one relationship from Dependency.DependsUponSoftwareId to Software.SoftwareId to ensure they only depend on existing rows.
Tony
A: 

I would use an ORM, construct the object graph in memory and then let the ORM to persist it :P.

Aggelos Mpimpoudis
have you considered performance issues with such a construction? how to model classes, why haven't you given examples?
WildWezyr
I think your comment is useless, instead, I have given food for thought at least and I didn't know that giving examples is always the -must-way. If you don't know how to persist a graph then I am really sorry. If you don't know how to model a tree then open another Question. If these are issues for you then you shouldn't comment at all.
Aggelos Mpimpoudis
+2  A: 

It might be worth picking up a copy of Joe Celko's "Trees and Hierarchies in SQL for Smarties". It has a explanations and examples of the different options available for this sort of thing.

Tony
I'd say that it's *definitely* worth getting this book. If you are implementing trees, it's worth reading. Period.
Nic Gibson
A: 

I'd store the data in something like

CREATE TABLE software (
   id SERIAL PRIMARY KEY,
   ...
);

CREATE TABLE software_dependency (
   dependent int NOT NULL REFERENCES software(id),
   dependee int NOT NULL REFERENCES software(id),
   deptype int, -- or whatever you want
   CONSTRAINT pk_software_dependency PRIMARY KEY  (dependent, dependee)
);

You'll be able to get a list of all the dependencies with something like:

WITH RECURSIVE t(id,type) AS (
 SELECT dependee,deptype FROM software_dependency WHERE dependent=3
UNION ALL
 SELECT d.dependee,deptype FROM software_dependency d INNER JOIN t ON t.id=d.dependent
)
SELECT * FROM t;

Edit: to get a tree, a good way is to accumulate the dependencies in ARRAY format. For each step in the recursion use the array append operator (||) to add the new ID to an array, and get it out at the end.

Magnus Hagander