I'm interested in understanding why people choose to leverage technology built on the W3C Semantic Web standards. How did you make the decision to use the technology you chose and what does it enable you to do that you identify as being unique to the Semantic Web effort?

If you have specifically made the decision not to use Semantic Web technologies what are your reasons for that decision? What would have to change to cause you to revisit your decision?

Note: I know there are many different types of semantic computing technologies besides those specified by the W3C but please keep answers to this subjective question focused on the W3C technologies (RDF, OWL, SPARQL).

Disclaimer: I focus on semantic computing incubations at Microsoft.

+3  A: 

I am presently working on a project involving large structured data sets such as DBPedia, Freebase, IMDB, and Musicbrainz. I need to manage all that data in a transparent manner, merge the different data sets where there are duplicates, and query the data in a logical fashion.

I originally attempted to devise my own implementation using standard SQL-compatible databases, but I struggled to build a robust schema that also provided the response times I needed. I eventually gave up and started using semantic web tools, and I haven't looked back. My reasons are as follows:

  • Mature frameworks. Projects such as Jena have been in development for several years and thus have proven themselves in the field.
  • Open data sets. Two of the sets I listed above (DBPedia and Musicbrainz) publish their data as RDF triples. Most other structured data sets can easily be converted to RDF. This makes data management trivially simple, and no complex parsing code is needed.
  • Solid query language. SPARQL is a powerful query mechanism for RDF data, and it is better tailored for more open data than SQL is.
  • Community. There is a substantial community of supporters for the semantic web, which is continually driving the effort forward. For example, the DBPedia project I've mentioned is attempting to interlink its data with other data sources out there, making the previously disconnected databases interoperable.
toluju
Thanks for the thoughtful response. In your project are you aggregating the data together and then querying it or are you federating queries across all of the data sets?
spoon16
Currently I'm federating queries across the data sets, but the longer-term goal is to aggregate the data together. As I mentioned, some of that work is already done for me thanks to DBPedia's integration efforts.
toluju
Without strong keys, I think you'll find that aggregation is probably the hardest step of all.
Peter Burns
+1  A: 

The W3C collects use cases from organizations that have successfully built and deployed applications backed by semantic web technologies. You can find a listing of them at the use case page; there is a lot of good content there on a variety of systems using various amounts of semantic goodness.

The use cases do a good job of describing what each application is, how much semweb technology it uses, the motivation for building it in the first place, and what the semweb technologies added to the equation.

Michael Grove
+2  A: 

I'm currently involved in a project where we try to leverage semantic technologies and data (mainly Umbel).

So far I have found the current SPARQL, RDF, RDFS, OWL and SKOS stack to be cumbersome at best, and I don't know how this approach will be able to break through. Criticism of RDF is not new. I think the entire approach is significantly bloated, even when considering its lightweight N3 notation. To work with Umbel, say, I have to read up not only on RDF but also on its schema language (RDFS), then on OWL, and on top of that on SKOS. Only after that do I arrive at what I'm actually interested in, Umbel, which has its own definitions on top of the entire stack I mentioned.

So I have an sc:Project, which is of type umbel:SubjectConcept, which is of type skos:Concept, which is an instance of owl:Class, which is a subclass of rdfs:Class.

It has properties called skos:broaderTransitive, which has a subproperty called skos:broader; all of them are instances of owl:ObjectProperty, which is a subclass of rdf:Property, which, hell, is an instance of rdfs:Class,

etc. These are the simpler examples...
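
To make the layering concrete, here is a rough sketch that writes the statements described above as plain Python tuples and walks the chain. The triples are transcribed from this post rather than checked against the actual Umbel, SKOS, or OWL vocabularies, and the `chain` helper is hypothetical.

```python
# Statements transcribed from the description above; not checked against
# the published Umbel, SKOS, or OWL vocabularies.
triples = [
    ("sc:Project", "rdf:type", "umbel:SubjectConcept"),
    ("umbel:SubjectConcept", "rdf:type", "skos:Concept"),
    ("skos:Concept", "rdf:type", "owl:Class"),
    ("owl:Class", "rdfs:subClassOf", "rdfs:Class"),
]

def chain(start, triples):
    """Follow rdf:type / rdfs:subClassOf links from start until none remain."""
    links = {s: o for s, p, o in triples}
    path = [start]
    seen = {start}
    while path[-1] in links:
        nxt = links[path[-1]]
        if nxt in seen:  # guard against cycles
            break
        seen.add(nxt)
        path.append(nxt)
    return path

path = chain("sc:Project", triples)
print(" -> ".join(path))
```

Four vocabularies (sc, umbel, skos, owl/rdfs) show up in a five-step path for a single concept, which is the overhead being described.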

kitsune
+1  A: 

Semantic technologies, like any other technologies, have good and bad sides. What I find rather odd is that their good sides are rarely stated explicitly, and are instead often replaced with bloated, rose-tinted, unfulfillable dreams.

1) I personally find RDF more profitable than XML in data integration scenarios. The main benefit comes from the fact that RDF documents are syntax independent. Consider the following integration task: merge two (or more) data documents. In the case of XML, you create a common schema and write XSLT that accomplishes both syntactic and semantic integration. For RDF documents, you just concatenate the RDF triples and syntactic integration is already done. For semantic integration, you will still need rules and a rule engine, or adapted SPARQL queries.

What if the business changes? In the case of XML, you adapt your XSLTs and, if needed, your common schema (watch out for backward compatibility). In the case of RDF, you still just concatenate the triples.
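
A minimal sketch of the "just concatenate the triples" point, assuming triples are modeled as plain Python tuples rather than parsed with a real RDF library. The `ex:` and `foaf:` values and the `merge` helper are made up for illustration.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# All identifiers below are hypothetical examples.
doc_a = {
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
}
doc_b = {
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:alice", "foaf:name", "Alice"),  # same statement as in doc_a
}

def merge(*documents):
    """Syntactic integration: the merged graph is the union of all triples."""
    merged = set()
    for doc in documents:
        merged |= doc
    return merged

merged = merge(doc_a, doc_b)
print(len(merged))
```

The union needs no common schema, and the duplicate statement collapses on its own. Deciding that two different identifiers denote the same thing is the semantic step, and it is not covered here.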

2) How would you handle different versions of persistent business objects? SQL databases are schema dependent. If your business objects change, you need to update the db schema. What do you do with your legacy (but high value) data?

  • hire an army of monkeys to convert from the old version to the new version
  • write a script for conversion (if possible)
  • extend your SQL tables with new columns, filling them with NULLs
  • maintain dual versions

But what if a third, fourth, fifth version comes along? How long can you stretch?

RDF databases are schema independent. That means that you can store and query any type of business object without changing anything (that is cheap!), though your SPARQL queries will probably need to be updated, depending on which solution you choose.
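
As a rough illustration, assume a toy in-memory store that is nothing more than a list of tuples (the `ex:` names and the `match` helper are hypothetical). Two versions of a business object can sit side by side with no migration:

```python
# Hypothetical store: a list of (subject, predicate, object) tuples.
store = [
    # "version 1" customer: name only
    ("ex:c1", "ex:name", "Acme"),
    # "version 2" customer: gains a property; ex:c1 is left untouched
    ("ex:c2", "ex:name", "Globex"),
    ("ex:c2", "ex:vatNumber", "GB123"),
]

def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# An old query keeps working across both versions.
names = sorted(o for _, _, o in match(store, p="ex:name"))
# A new query simply finds nothing for old objects: no NULL columns.
vat = match(store, p="ex:vatNumber")
```

Only queries that care about the new property have to change, which is the caveat about updating queries.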

Facilitating CHANGE (e.g., in integration scenarios) can be significantly CHEAPER with RDF than with XML/SQL.

Downsides of RDF: these are easy to find and well documented, but here are a few:

  • Terrible RDF/XML syntax
  • No buy-in from the biggest players like Sun/Microsoft
  • No comparable alternative to the power of XSLT in the semantic world. SWRL does not even get close.
  • Hard, steep learning curve for developers
  • Poorly understood open world assumption model
  • Insufficient educational material, books, and articles
  • There used to be a shortage of developer tools, but it is getting better now. I still do not know of a good library for, e.g., Python.
  • No native support for ordered collections (arrays); you need to use a linked list.
  • Semantic web got a bad reputation from the bloated boastful promises. No, it won't solve world hunger.
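
To illustrate the linked-list point from the list above: RDF collections are built from `rdf:first`, `rdf:rest`, and `rdf:nil`. The sketch below models triples as plain Python tuples; the blank node labels and both helper functions are hypothetical.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

def to_rdf_list(items):
    """Encode a Python list as rdf:first/rdf:rest triples.

    Returns (head_node, triples). Blank node labels (_:b0, ...) are
    generated here purely for illustration.
    """
    if not items:
        return RDF_NIL, []
    nodes = [f"_:b{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF_NIL
        triples.append((nodes[i], RDF_FIRST, item))
        triples.append((nodes[i], RDF_REST, rest))
    return nodes[0], triples

def from_rdf_list(head, triples):
    """Walk the linked list back into a Python list."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items = []
    while head != RDF_NIL:
        items.append(first[head])
        head = rest[head]
    return items

head, triples = to_rdf_list(["a", "b", "c"])
print(len(triples))
```

Each element costs two triples, and reaching the n-th element means following n `rdf:rest` links.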
ROWLEX Admin