tags:

views: 514
answers: 4

I like these a lot and would like to use them for everything. Why would using RDF triple stores for everyday programming entities/tables, such as Contacts, Customers, Companies, etc., be a bad idea?

Are there shortfalls in the technology that I have not come across? I was concerned about retrieving data, but I think this is covered by SPARQL.

+2  A: 

There is no one-size-fits-all tool. Triple stores are appropriate and usable today for some kinds of tasks and not for others.

A similar question was asked on semanticoverflow.com and the common answer was the same: "use whatever is appropriate".

Pēteris Caune
how many overflows are there? I think we need some buckets :)
WeNeedAnswers
No one size fits all. I tend to agree with that, although the attitude in the RDBMS world is that there is only one way to store data and retrieve it efficiently. RDBMSs tend to be inflexible due to the accidental constraints placed on them from the coding perspective. RDF would never be constrained in the same manner if used correctly, although speed might be a problem on retrievals.
WeNeedAnswers
+1  A: 

Further to Peteris's answer, there are some key differences between how you model data for a triple store and for other techniques like OOP, relational databases, and XML (rows, classes, properties, etc.).

Whether they are appropriate very much depends on what you want to do, and on whether you can find one with the right performance characteristics for your application.

People have a tendency to characterise triple stores as schema-less databases, but realistically, unless you are using some form of schema/ontology, they aren't particularly useful. If you want to use SPARQL to get data out, then there need to be some schema patterns in the store that you can write queries against.

Personally I would still use relational databases for a lot of things, and still do. While I'm using RDF and triple stores for an increasing amount of work, that doesn't mean I'm ready to throw out what works well.

As a final point, even if you go with a relational database for the time being, there are technologies like DB2RDF which can convert relational databases to RDF, so you can stick with a DB for now and export your database to RDF in the future as desired.

RobV
"Throwing out what works" - but isn't that what people are doing anyway? I mean the rise of the ORM, and domains taking the place of ERDs and entities. Why not look at the problem of the impedance mismatch, grab it by the horns, and choose option 3? I would always do a schema/domain model for anything of a certain size. A lot of the RDF groundwork has been done already, in such works as Dublin Core and other well-defined schemas. You could come up with your own, though; nothing is stopping you. I probably would for a domain solution.
WeNeedAnswers
I wouldn't characterise the ORM as a replacement for relational data models; it seems to be used primarily to provide a high-level abstraction layer for developers, which appears to be a more general trend in modern development.
RobV
The ORM, though, gets embedded into the development language, and then all your great work on entities gets lost in the code. The ORM is certainly not modern; it has been around a long, long time. The ORM is used today as a panacea for the database-object problem, not for the higher goal of abstraction. I would offer up triples as an alternative approach, as you can still use an ORM on top of a triple store.
WeNeedAnswers
point taken, linq2rdf being an example of an ORM on top of a triplestore
RobV
+1  A: 

Query times tend to be much slower than for conventional DBs, even for simple queries. Also, many RDF stores don't support standard DB features like transactions, crash recovery, and so on.

Carsten
Does that also go for the ones built on top of an RDBMS?
WeNeedAnswers
Regarding query speed, in my experience: yes. DBs are very good at exploiting well-designed schemas for queries. Unless an RDF store does a very clever, continuing analysis of the triples and maps them to a clever schema on the DB layer, it will never come close. Jena SDB, for example, does some clever caching of strings, but basically puts everything into a few simple tables. I'd expect them to be better at crash recovery, and I think I remember some of them supporting transactions.
Carsten
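The "few simple tables" point can be sketched with stdlib sqlite3 (a deliberately naive layout, not Jena SDB's actual schema): with one generic triple table, every additional property you fetch costs a self-join, whereas a conventional wide Contacts table would need none.

```python
import sqlite3

# A generic triple table, roughly how naive RDBMS-backed stores lay data out
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
con.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("alice", "type",  "Contact"),
    ("alice", "name",  "Alice"),
    ("alice", "email", "alice@example.org"),
])

# Fetching two properties of a Contact already needs two self-joins
rows = con.execute("""
    SELECT t1.o, t2.o
    FROM triples t0
    JOIN triples t1 ON t1.s = t0.s AND t1.p = 'name'
    JOIN triples t2 ON t2.s = t0.s AND t2.p = 'email'
    WHERE t0.p = 'type' AND t0.o = 'Contact'
""").fetchall()
print(rows)  # [('Alice', 'alice@example.org')]
```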
Thanks for your help. Would you use one then for "everyday joe" programming, or only on real cases where they would come into their own?
WeNeedAnswers
Unless you have queries which are much easier in SPARQL than SQL, I would not bother. If your main motivation is that you are not sure about what schema to use, have you considered a NoSQL DB? I personally have no experience with them, but they seem to be fashionable at the moment ...
Carsten
+1  A: 

One of the shortcomings we have come across in using RDF triple stores for general programming is that most engines don't support aggregation in queries (min, max, group by).

A checklist we use to decide between an RDBMS and RDF is the following.

RDBMS if

  • static schema
  • very large amount of data
  • no RDF export needed
  • Lucene support needed (easy via Hibernate Search for example)
  • strong data consistency requirements (money involved etc)

RDF if

  • no fixed schema, or a dynamic schema
  • small to large amount of data
  • RDF export needed
  • loose data consistency requirements

Refactoring RDBMS schemas for ongoing projects can be quite an overhead if you don't have the right tools.

Lucene support is provided by some RDF engines as well, but is not as well documented and supported as in the case of Hibernate Search.

Scalability of RDF engines is also improving steadily, with ideas from the NoSQL side being incorporated into RDF engines, but if you go with the standard engines, Jena and Sesame, this division is still quite valid.

Timo Westkämper
A reasonable checklist, but I'd add two amendments: (i) many RDF stores support Lucene indexes as well, and (ii) scalability of RDF stores is improving steadily, so I'd characterise the division as between "large" and "very large", not between "small" and "lots". Finally, though it's still in development, the next release of SPARQL will include aggregate functions. This in turn will drive the provision of aggregate functions in the RDF stores (noting that some already do). See http://www.w3.org/TR/sparql11-query/#aggregateFunctions.
Ian Dickinson
Thanks for the pointers, I updated the answer
Timo Westkämper