Hi everyone,

I need to load a very large ontology, represented as an N-Triples file (1 GB), into the openrdf Sesame application. I'm using the workbench interface to do that. I know that this file is too big to be loaded in one request. To get around that, I split it into files of 100 MB each. But I still get an error from the openrdf Sesame server:

HTTP ERROR 500

Problem accessing /openrdf-workbench/repositories/business/add. Reason:

    Unbuffered entity enclosing request can not be repeated.
Caused by:

org.apache.commons.httpclient.ProtocolException: Unbuffered entity enclosing request can not be repeated.
 at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:487)

Does anyone have good knowledge of openrdf Sesame, or of another ontology manager that I could use for this task?

Thanks a lot for your input

K.

A: 

I don't know exactly what task you hope to achieve, but you may want to check out here for a list of scalable triple stores with informal (mainly self-claimed) scalability results. In that list, Sesame only reports handling 70M statements, which is not that many and might be the cause of your troubles.

badroit
A: 

@badroit, that list is badly out of date IMHO, and so is the reported number for Sesame. It's capable of handling hundreds of millions of triples (and if you count OWLIM as a Sesame store, billions).

Back to the original question: the workbench is really not the ideal tool for these kinds of tasks - although I would expect it to be able to cope with 100 MB files. It might be that the Tomcat on which you run Sesame has a POST limit set? You could ask around on Sesame's mailing list; there are quite a few knowledgeable people there as well. But here are two possible ideas to get things done:

One way to handle this is to do your upload programmatically, using Sesame's Repository API. Have a look at the Sesame user documentation for code examples.
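
As a rough illustration of the programmatic route, here is a minimal sketch that streams an N-Triples file and hands it over in line-aligned batches, so that no single upload has to carry the whole file. It uses only the Java standard library; the class name `NTriplesBatcher`, the method `forEachBatch` and the batch size are made up for this example and are not part of Sesame. In a real setup, the sink would presumably pass each batch to Sesame's `RepositoryConnection.add(Reader, String, RDFFormat, Resource...)` with `RDFFormat.NTRIPLES`, which is not exercised here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper (not part of Sesame): splits an N-Triples stream
// into batches of whole statements.
final class NTriplesBatcher {

    // Reads statements line by line and passes them to 'sink' in lists of at
    // most 'batchSize' statements. Returns the number of statements read.
    // N-Triples holds one statement per line, so cutting on line boundaries
    // never splits a statement.
    static long forEachBatch(BufferedReader reader, int batchSize,
                             Consumer<List<String>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> batch = new ArrayList<>(batchSize);
        long total = 0;
        Iterator<String> lines = reader.lines().iterator();
        while (lines.hasNext()) {
            String line = lines.next().trim();
            // Skip blank lines and comment lines.
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            batch.add(line);
            total++;
            if (batch.size() == batchSize) {
                sink.accept(batch);
                batch = new ArrayList<>(batchSize);
            }
        }
        // Flush the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the N-Triples file. The sink below only prints
        // batch sizes; an assumed real sink would join the lines and call
        // RepositoryConnection.add(...) with RDFFormat.NTRIPLES.
        try (BufferedReader in = Files.newBufferedReader(
                Paths.get(args[0]), StandardCharsets.UTF_8)) {
            long n = forEachBatch(in, 100_000,
                    batch -> System.out.println("batch of " + batch.size() + " statements"));
            System.out.println(n + " statements read");
        }
    }
}
```

One caveat with this kind of splitting: if the data uses blank nodes, labels shared between two batches may end up as distinct nodes when the batches are parsed separately, so it is probably safest for data that uses URIs and literals only.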

Alternatively, if you are using a Sesame native store, you could do a 'dirty' workaround using Sesame's command line console: create a local native triple store and upload your data to that local store (this should be much quicker because no HTTP communication is necessary). Then shut down your Sesame server, copy the data files of the local native store over the store's data files on your server, and restart.

Jeen Broekstra