views:

230

answers:

4

Is there a tool that can let one browse relational data as a graph of connected nodes?

For example, i'm faced with trying to cleanse some anomolous data. i can start with two offending rows. In this particular example, the TransactionID should, by business rules, be unique to the table, but i find a transaction that violates that rule:

SELECT * FROM LCTTrans
WHERE TransactionID = 1075048

LCTID      TransactionID
=========  =============
4358       1075048
4359       1075048

2 row(s) affected

But really what i want to begin to hunt down all the related data, to try to see which is right. So this hypothetical software would start by showing me these two rows:

alt text

Next, i want to see that transaction that is linked into this table:

alt text

Now that transaction points to an MAL, so show me that:

alt text

Now lets add those two LCTs, that the transaction is "on". A transaction can be on only one LCT, yet this one is pointing to two:

alt text

Okay computer, both of those LCTs point to an MAL and the transaction that created them, show me those:

alt text

Those last two transactions, they also point at an MAL, and they themselves point to an LCT, show me those:

alt text

Okay, now are there any entries in LCTTrans that point to LCTs 4358 or 4359?...

And so on, and so on.

Now i did all this manually, running single selects, copying and pasting uniqueidentifier keys and converting them into friendly id numbers so i could easily see the relationships.


Is there software that can do this?

A: 

I can think of a few for relational data (RDF, Topic Map and conceptual graph browsers), but none off-hand for SQL. You could try and translate your queries to a relational language the browsers understand. You also might be able to build something on top of skyrails. Most of the visualisations I've tagged on delicious are for graph or relational data, but again tend to be schema free rather than SQL.

Pete Kirkham
A: 

Basically you write a dedup tool where you show both records onthe screen side by side with the ability to pick the record you wan to keep but to check individual data from the other record to keep as well. Since deduping is very differnt from database to database and highly dependant on the specific table structure and business rules you have (as well as knowledge about which things must be looked at for the type of deduping you are doing as they typically only show the most important relationship tables on screen), I have never seen one that wasn't built in house.

But if you want a quick look at all the data write a query that left joins to all the child tables and shows all the fields for both transactionids. Then read through your results.

More importantly, how did you end up with a dup if you hav ea business rule that requires the transactionid to be uninique. Did you forget that all of these types of rules must be enfoced through the datbase and not the application? Why was there no unique index on that field?

HLGEM
i could write a tool that can be used to explore these two duplicates - but that doesn't help me in other situations. Situations, say, where i'm investigating a terrorist, finding everything he did. Or perhaps exploring a possible relationship between two customers.
Ian Boyd
A: 

I've looked for open source software that can do this sort of link analysis, without much success. If you have enough of a budget to go proprietary, you might consider talking to Palantir Technologies, Centrifuge Systems, i2, etc. about analytics platforms and visualization technologies.

estanford
+1  A: 

Ok, well I liked this idea so much that I've written it.

It's not released yet, but when it is it will be free.

Edit ->

Ok, it's now released. Free relational database exploring goodness @ http://www.atlantis-interactive.co.uk/products/datasurf/default.aspx

Matt Whitfield