views:

34

answers:

1

i'm not sure of the term for this problem i'm having, or if it's even an issue to worry about at all. let's say i have a hypothetical situation like this:

http://i.imgur.com/o1zyq.png

it seems as though having the link from remix objects back to the the original objects makes for a somewhat complicated structure, especially if i start to add more objects into the structure.

if i remove the links from remix song and remix album to the original, i can use some sort of ID and traverse the structure to still figure out the original versions, but this would require me to write some code to ensure the integrity of the data, like the remix album is not pointing to an original album that no longer exist.

question: is having a structure like this something to worry about? if so, how to fix such a structure aside from the solution i proposed above which requires writing code to ensure the integrity of the data.

+2  A: 

I don't know what programming language you're working with, but it looks to me like you're describing a directed acyclic graph, which, in simple terms, is a collection of points with arrows connecting them, but there aren't any cycles.

This is a very common structure. For instance, it describes dependencies of software packages in operating systems with automated software installation (such as many Linux distributions). It describes citations in research papers, where a paper can cite many other papers and a paper can be cited by many other papers, but it doesn't make sense for two papers to cite each other.

The best way to represent this data structure depends on the programming language, and on what you need to do with it. The simplest way to do it in most programming languages is to simply have each object link to the other objects by reference, something like:

struct Song {
    std::string name;
    std::vector<struct Foo*> originals;
};

It's a simple matter to find every "original" of a given song, but it's more expensive to find every "remix". You could augment the structure with remix links and ensure consistency, but in both cases, you have to ensure there are no cycles.

In a SQL database, you could describe the relationship like so:

CREATE TABLE songs (
    id SERIAL PRIMARY KEY,
    name TEXT
);

CREATE TABLE is_remix_of (
    remix    INT REFERENCES songs(id),
    original INT REFERENCES songs(id)
);

CREATE INDEX remix_to_original ON is_remix_of(remix);
CREATE INDEX original_to_remix ON is_remix_of(original);

Again, you would have to find a way to guard against cycles.

Joey Adams