I want to run some experiments with Linked Open Data sets, particularly those published by governments.

I have an RDBMS (specifically MySQL). I designed it with semantic web ideas in mind, i.e. information is stored as objects, predicates, and the classes that define objects. All objects are related to each other through statements of the form subject --> predicate --> object (where the subjects come from the objects table).

I want to be able to query other RDF triple stores from my application and let other triple stores query my data. Is it possible to "set something up" so that this is possible?

I have looked at Jena. Using Jena seems to mean I have to use it as the storage layer instead of MySQL. The only problem with this is that I include a new concept called a category (which I don't think is part of the semantic web languages). I will use categories to help with displaying information (they don't have any other meaning), but using Jena seems to mean that I can't organise predicates under categories for more convenient viewing.

I am using Java, so a Java API is preferred.

It's also possible I misunderstood the purpose of Jena, and maybe that can be of use, but I am not sure how.

I am sure four days from now this question will seem rather silly, but at the moment I am somewhat confused about how to proceed.

+2  A: 

Why didn't you just use a triple store to store all of your data? If you use a triple store with SPARQL endpoint capability then you would have a SPARQL-accessible web api. Similarly, many other data sets on the web are exposed as SPARQL endpoints and accessible via HTTP.
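As a rough illustration of what "accessible via HTTP" means here: under the SPARQL Protocol, a query can be sent as a GET request with the query text URL-encoded in a `query` parameter. The sketch below only builds that request URI using the Java standard library. The endpoint `http://example.org/sparql` and the class name `SparqlRequest` are placeholders, not a real service or an existing API.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SparqlRequest {
    // Builds the GET URI for a SPARQL Protocol query:
    // <endpoint>?query=<URL-encoded query text>
    static URI queryUri(String endpoint, String sparql) {
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?query=" + encoded);
    }

    public static void main(String[] args) {
        // Hypothetical endpoint; substitute the endpoint of the store you use.
        String endpoint = "http://example.org/sparql";
        String sparql = "SELECT ?s WHERE { "
            + "?s a <http://education.data.gov.uk/def/school/School> } LIMIT 5";
        System.out.println(queryUri(endpoint, sparql));
    }
}
```

The resulting URI can be fetched with any HTTP client (for example `java.net.http.HttpClient`), typically with an `Accept` header such as `application/sparql-results+json`. A library like Jena wraps this step for you, so hand-building the URI is only worthwhile for seeing what goes over the wire.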

There are many triple stores available with persistent storage both in a db and otherwise (Jena + SDB, Mulgara, Virtuoso, Oracle, etc). You could certainly extend Mulgara through their resolvers to support queries against your custom db but I think that's probably a lot of work for not too much real value.

I'm sure you could use existing concepts to handle your notion of categories in RDF or perhaps by layering something over Jena.

Alex Miller
Yes, I am leaning towards Jena and SDB - not sure what will happen to categories however.
Ankur
+1  A: 

I'm not sure what you mean by "a new concept called category". Perhaps you can give an example?

If you mean that you want to add additional metadata, perhaps as a way of organizing information in the user interface, there is no need to extend the semantic web languages or storage systems - they can already do what you want.

Suppose you have data for a school from the UK Government schools dataset (using Turtle encoding for brevity):

@prefix sch-ont:  <http://education.data.gov.uk/def/school/>.
<http://education.data.gov.uk/id/school/135412>
a sch-ont:School;
sch-ont:establishmentStatus 
    <http://education.data.gov.uk/def/school/EstablishmentStatus_Open>;
sch-ont:MSOA <http://statistics.data.gov.uk/id/msoa/E02000001>;
sch-ont:establishmentName "Guildhall School of Music and Drama";
...

You can query that data directly from the SPARQL endpoint, or you can download the data and store it locally in your own triple store. Either way, you're perfectly at liberty to add extra information that's useful to your users. For example:

@prefix ankurs-app: <http://ankur.org/example/app/vocab/display#>.
<http://education.data.gov.uk/id/school/135412> 
        ankurs-app:category ankurs-app:wkdCool.

You can store this new triple in the same graph as the downloaded data, or you can store it in a separate named graph to indicate that it has a different provenance from the source data. Either way, it's then simple to query it, either programmatically from Jena or via a SPARQL query.
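To make the named-graph option concrete, here is a small sketch that assembles a SPARQL query joining the school data with the category triple. It assumes the source data sits in the default graph and the display metadata sits in a named graph with the made-up IRI `http://ankur.org/example/app/graph/display`. The class and method names are hypothetical too.

```java
class CategoryQuery {
    // Hypothetical IRI of the named graph holding app-specific display metadata.
    static final String DISPLAY_GRAPH = "http://ankur.org/example/app/graph/display";

    // Returns a SPARQL SELECT query listing schools tagged with the given category.
    // The category local name is inserted as-is, so it must be a valid prefixed-name
    // local part (for example "wkdCool"); no validation is done in this sketch.
    static String schoolsInCategory(String categoryLocalName) {
        return String.join("\n",
            "PREFIX sch-ont: <http://education.data.gov.uk/def/school/>",
            "PREFIX ankurs-app: <http://ankur.org/example/app/vocab/display#>",
            "SELECT ?school ?name WHERE {",
            "  ?school sch-ont:establishmentName ?name .",
            "  GRAPH <" + DISPLAY_GRAPH + "> {",
            "    ?school ankurs-app:category ankurs-app:" + categoryLocalName + " .",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(schoolsInCategory("wkdCool"));
    }
}
```

The query string can then be handed to whatever executes SPARQL in your setup, such as Jena's query API or a remote endpoint. Keeping the category triples inside a `GRAPH` block is what lets you tell them apart from the government-published data.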

Designing a table layout for efficiently querying schemaless, triple-centric data is a well-studied and hard problem. Most of the RDF platforms, including Jena, have well-optimised code for querying and updating triples from their own database schemes. You would have to have very good reasons for embarking on your own relational table layout :)

If you really do need to take an existing relational table scheme and map it to a Jena RDF model, look at D2RQ.
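For a feel of what such a mapping produces, the sketch below turns one subject/predicate/object row into an N-Triples line. This is not D2RQ's API or its mapping language, only an illustration of the kind of output a relational-to-RDF mapping yields. The class name and the idea that each row carries full IRIs are assumptions about your schema.

```java
class RowToNTriples {
    // Escapes the characters that must not appear raw inside an N-Triples literal.
    static String escapeLiteral(String value) {
        return value.replace("\\", "\\\\")
                    .replace("\"", "\\\"")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    // Serializes one row as an N-Triples statement. The object is written as an
    // IRI when objectIsIri is true, otherwise as a plain string literal.
    static String toNTriple(String subjectIri, String predicateIri,
                            String object, boolean objectIsIri) {
        String objectTerm = objectIsIri
            ? "<" + object + ">"
            : "\"" + escapeLiteral(object) + "\"";
        return "<" + subjectIri + "> <" + predicateIri + "> " + objectTerm + " .";
    }

    public static void main(String[] args) {
        // Values as they might come from one row of a statements table.
        System.out.println(toNTriple(
            "http://education.data.gov.uk/id/school/135412",
            "http://education.data.gov.uk/def/school/establishmentName",
            "Guildhall School of Music and Drama",
            false));
    }
}
```

A dump produced this way could be loaded into a triple store, whereas a tool like D2RQ aims to expose the relational data as RDF without the export step.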

Ian Dickinson
A: 

Note that you can use Virtuoso to map relational data directly to RDF and make it instantly accessible as Linked Data, as detailed in:

Automated Generation of RDF Views over Relational Data Sources

Virtuoso Linked Data Views for the MySQL sample Database

hwilliams