I need to keep a couple of Jena Models (OntModels, specifically) synchronized across a socket, and I'd like to do this one change at a time (for various reasons, one being that each Statement added to or removed from the OntModels also adapts a JESS rule base). I am able to listen to the add/remove events on the OntModels and then create simple event instances that wrap the added/removed Statements along with a ChangeType that indicates whether the Statement was added or removed, but serializing the Statement has proven to be a problem.

Unfortunately, all of the Jena serialization documentation that I've found relates to serializing an entire model to XML / RDF / N3 / etc. Since statements are simply triples of Strings (at one level, anyway), it seems like it should be trivial to serialize the data at the Statement level. However, Jena doesn't seem to provide an API for creating Statements with plain strings that "does the right thing". Problems arise with typed literals, e.g.:

I can create the statement:

<http://someuri/myont#foo> <http://someuri/myont#weight> "50.7"^^www.w3.org/2001/XMLSchema#double

but the string version that I can get out looks like this:

"http://someuri/myont#foo" "http://someuri/myont#weight" "50.7^^www.w3.org/2001/XMLSchema#double"

(note the absence of a " before the ^^)

This wouldn't be that much of a problem, since the literal can still be parsed out with a regex, but I've been unable to create a Statement with the proper literal. The obvious approach (ModelCon.createStatement(Resource, Property, String)) generates an untyped string literal with the full value of the String passed in.

Does anyone know how I can reliably serialize (and deserialize, of course) individual Jena Statements?

+1  A: 

Not an area I've looked at in great depth, but I recalled Talis were doing some research and was able to follow breadcrumbs to the relevant vocabulary called "Changeset".

http://vocab.org/changeset/schema

I'm surprised you had issues serialising individual statements using Jena, but perhaps if you created a graph according to the changeset schema and serialised the graph you'd have more luck? Alternatively, add the statement to a new graph and serialise a graph of one triple.

Simon Gibbs
This may be the best approach -- if the need arises I'll probably consider keeping a model around just for serialization / deserialization purposes. (each change gets pushed to the model, the model is serialised, and then cleared for the next set of changes.)
rcreswick
+1  A: 

I would serialize the changes out in N-TRIPLES format. Jena has a built-in N-TRIPLES serializer and parser, but the N-TRIPLES syntax is (deliberately) very simple, so it would be easy to generate manually in your code.
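To illustrate the "generate manually" option, here is a minimal sketch that formats one triple as an N-TRIPLES line using only the Java standard library. The class `NTriplesWriter` and its method names are hypothetical (they are not part of Jena), and the sketch assumes that URIs are already absolute and contain no characters that need escaping. Blank nodes and language tags are left out.

```java
import java.util.Locale;

/** Hypothetical helper (not a Jena class): formats one triple as an N-TRIPLES line. */
final class NTriplesWriter {
    private NTriplesWriter() {}

    /** Wraps a URI in angle brackets. Assumes the URI needs no escaping. */
    static String uri(String uri) {
        return "<" + uri + ">";
    }

    /** Quotes a plain (untyped) literal, escaping special characters. */
    static String plainLiteral(String lexical) {
        return "\"" + escape(lexical) + "\"";
    }

    /** Formats a typed literal as "lexical"^^<datatypeUri>. */
    static String typedLiteral(String lexical, String datatypeUri) {
        return plainLiteral(lexical) + "^^" + uri(datatypeUri);
    }

    /** Joins three already-formatted terms into one line terminated by " ." */
    static String line(String subject, String predicate, String object) {
        return subject + " " + predicate + " " + object + " .";
    }

    /** Escapes a literal's lexical form so that the output is pure ASCII. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (cp > 0xFFFF) {
                        // Outside the Basic Multilingual Plane: 8-digit escape.
                        sb.append(String.format(Locale.ROOT, "\\U%08X", cp));
                    } else if (cp < 0x20 || cp > 0x7E) {
                        // Control or non-ASCII character: 4-digit escape.
                        sb.append(String.format(Locale.ROOT, "\\u%04X", cp));
                    } else {
                        sb.append((char) cp);
                    }
            }
        }
        return sb.toString();
    }
}
```

For the statement in the question, `NTriplesWriter.line(uri(subject), uri(predicate), typedLiteral("50.7", "http://www.w3.org/2001/XMLSchema#double"))` keeps the closing quote before the `^^`, so the literal boundary is unambiguous on the receiving side.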

However, it might be even easier to keep a plain mem model around to hold the changes, have the event handlers write changes into that model, then serialize that model over the wire according to your synchronization schedule. Likewise, at the far end I would read the updates from the sync channel into a temporary mem model, then yourOntModel.add( changesModel ) should add in the updates very straightforwardly.

Ian

Ian Dickinson
The N-triples syntax is essentially what I was using (and ended up using); the problem was in detecting literals during deserialization. (You can see my approach here now.) Using models for serialization is probably the most robust, though.
rcreswick
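On detecting literals during deserialization: if the object is written as a full N-TRIPLES term (angle brackets around URIs, quotes around literals), the first character is enough to tell the two apart. The following is a rough sketch under that assumption, using only the Java standard library. `ObjectTerm` and its members are hypothetical names, not Jena classes, and language-tagged literals and blank nodes are deliberately unsupported.

```java
/** Hypothetical value type for a parsed N-TRIPLES object term (not a Jena class). */
final class ObjectTerm {
    enum Kind { URI, PLAIN_LITERAL, TYPED_LITERAL }

    final Kind kind;
    final String value;        // the URI, or the literal's unescaped lexical form
    final String datatypeUri;  // non-null only for TYPED_LITERAL

    private ObjectTerm(Kind kind, String value, String datatypeUri) {
        this.kind = kind;
        this.value = value;
        this.datatypeUri = datatypeUri;
    }

    /** Parses <uri>, "lexical" or "lexical"^^<datatypeUri>. */
    static ObjectTerm parse(String term) {
        String t = term.trim();
        if (t.startsWith("<") && t.endsWith(">")) {
            return new ObjectTerm(Kind.URI, t.substring(1, t.length() - 1), null);
        }
        if (!t.startsWith("\"")) {
            throw new IllegalArgumentException("Unsupported term: " + term);
        }
        int end = closingQuote(t);
        String lexical = unescape(t.substring(1, end));
        String rest = t.substring(end + 1);
        if (rest.isEmpty()) {
            return new ObjectTerm(Kind.PLAIN_LITERAL, lexical, null);
        }
        if (rest.startsWith("^^<") && rest.endsWith(">")) {
            String datatype = rest.substring(3, rest.length() - 1);
            return new ObjectTerm(Kind.TYPED_LITERAL, lexical, datatype);
        }
        throw new IllegalArgumentException("Unsupported literal suffix: " + rest);
    }

    /** Returns the index of the first unescaped quote after the opening one. */
    private static int closingQuote(String t) {
        for (int i = 1; i < t.length(); i++) {
            char c = t.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            } else if (c == '"') {
                return i;
            }
        }
        throw new IllegalArgumentException("Unterminated literal: " + t);
    }

    /** Reverses backslash escapes, including the 4- and 8-digit Unicode forms. */
    private static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\') { sb.append(c); continue; }
            if (i + 1 >= s.length()) throw new IllegalArgumentException("Dangling backslash");
            char e = s.charAt(++i); // the character after the backslash
            switch (e) {
                case 'n': sb.append('\n'); break;
                case 'r': sb.append('\r'); break;
                case 't': sb.append('\t'); break;
                case 'u': sb.appendCodePoint(Integer.parseInt(s.substring(i + 1, i + 5), 16)); i += 4; break;
                case 'U': sb.appendCodePoint(Integer.parseInt(s.substring(i + 1, i + 9), 16)); i += 8; break;
                default:  sb.append(e); // covers \\ and \"
            }
        }
        return sb.toString();
    }
}
```

Because the split happens at the closing quote rather than at the last `^^`, a plain literal whose text happens to contain `^^` is not mistaken for a typed one. Mapping the resulting `kind` onto Jena resources and literals is left to the caller.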
A: 

Maybe you should try to replace the String parameter of createStatement by Model.createLiteral(String)...

Vinze
Unfortunately I'm not *always* serializing literals.
rcreswick
A: 

The solution I ended up with is below. I ended up using a regex approach because of time constraints (I didn't see the other suggestions on this question until very recently).

This is probably not the best approach, but it seems to work well (and I've vetted it with a test suite that exercises the use cases I need to deal with at this point).

The createStatement(...) method is in an OntUtilities helper class.

   /**
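    * Serialization output method. Writes the statement as three strings
    * (subject URI, predicate URI, object) after the default fields.
    * 
    * @param out the stream to write to
    * @throws IOException if writing to the stream fails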
    */
   private void writeObject(final ObjectOutputStream out) throws IOException {
     out.defaultWriteObject();
     out.writeObject(_statement.getSubject().getURI());
     out.writeObject(_statement.getPredicate().getURI());
     out.writeObject(_statement.getObject().toString());
   }

   /**
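    * Deserialization input method. Reads the three strings back and
    * rebuilds the statement from them.
    * 
    * @param in the stream to read from
    * @throws IOException if reading from the stream fails
    * @throws ClassNotFoundException if a serialized class cannot be found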
    */
   private void readObject(final ObjectInputStream in) throws IOException, 
      ClassNotFoundException {
     in.defaultReadObject();

     final String subject = (String)in.readObject();
     final String predicate = (String)in.readObject();
     final String object = (String)in.readObject();

     _statement = OntUtilities.createStatement(subject, predicate, object);
   }

   /**
    * Creates a statement from a triple of strings.  These strings may be fully
    * qualified uris, or shortened "namespace" uris (eg: shai:TST)
    * 
    * @param sub The resource uri (the subject)
    * @param pred The predicate uri (the property)
    * @param ob The object: a uri, or a literal (optionally typed, in the
    *           form value^^datatypeUri).
    * @return A JENA Statement.
    */
   public static Statement createStatement(final String sub, final String pred,
         final String ob) {
      final Model m = ModelFactory.createDefaultModel();

      final String s = OntUtilities.nsUriToUri(sub);
      final String p = OntUtilities.nsUriToUri(pred);
      final String o = OntUtilities.nsUriToUri(ob);

      Statement stmt = null;
      try {
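         // Try parsing o as a uri as a syntax-verification step. A bare word
         // such as "hello" parses as a relative uri, so also require a scheme;
         // otherwise plain string literals would be turned into resources.
         if (!new URI(o).isAbsolute()) {
            throw new URISyntaxException(o, "not an absolute uri");
         }
         // it was valid, so we'll use o as a resource: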
         final Resource obj = m.createResource(o);
         stmt = m.createStatement(m.createResource(s), m.createProperty(p), obj);
      } catch (final URISyntaxException e) { 
         // o was *not* a uri.
         if (o.contains("^^")) {
            final int idx = o.lastIndexOf("^^");

            final String value = o.substring(0, idx);
            final String uri = o.substring(idx+2);

            final Literal lit = m.createTypedLiteral(value, getDataType(uri));

            stmt = m.createStatement(m.createResource(s), m.createProperty(p), lit);
         } else {
            // just use the string as-is:
            stmt = m.createStatement(m.createResource(s), m.createProperty(p), o);
         }
      }
      return stmt; 
   }
rcreswick
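For comparison, the event wrapper described in the question can also carry the triple as plain strings, which lets default Java serialization do the work without custom writeObject / readObject methods. This is only a sketch under that assumption: `TripleChange`, its fields, and the `toBytes` / `fromBytes` helpers are hypothetical names, and the conversion between the strings and Jena Statements is left to the caller.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical change event; the names are illustrative, not taken from Jena. */
final class TripleChange implements Serializable {
    enum ChangeType { ADDED, REMOVED }

    private static final long serialVersionUID = 1L;

    final ChangeType type;
    final String subject;
    final String predicate;
    final String object; // e.g. an N-TRIPLES object term

    TripleChange(ChangeType type, String subject, String predicate, String object) {
        this.type = type;
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    /** Serializes this event with default Java serialization. */
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new IllegalStateException("Could not serialize change", e);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds an event from bytes produced by toBytes(). */
    static TripleChange fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (TripleChange) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not deserialize change", e);
        }
    }
}
```

Over a socket, the same object could be passed directly to an ObjectOutputStream wrapped around the socket's output stream; the byte-array helpers here only make the round trip easy to try in isolation.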