Hi, I'm trying to use an RDF library for C# called SemWeb, but I haven't been able to read my RDF file, which is 570 MB. I can't get any of the examples they give to work, and the documentation is a little poor.

Does anyone use this library, or another one, to parse RDF files?

I need it urgently since I'm doing a university assignment that requires an RDF parser.

Thanks

+1  A: 

Reading a file with SemWeb is very simple; just use something like the following:

using System;
using SemWeb;

// Load the whole file into an in-memory store
MemoryStore mem = new MemoryStore();
mem.Import(new N3Reader("file.ttl"));

// Iterate over and print statements
foreach (Statement stmt in mem)
{
    Console.WriteLine(stmt.ToString());
}

If your file is RDF/XML then you'd use the RdfXmlReader class instead.

Alternatively you could use my library dotNetRDF to read your file:

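using System;
using VDS.RDF;
using VDS.RDF.Parsing;

// Load the whole file into an in-memory graph
Graph g = new Graph();
FileLoader.Load(g, "file.ttl");

// Iterate over and print triples
foreach (Triple t in g.Triples)
{
    Console.WriteLine(t.ToString());
}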

The only problem is that if your file is RDF/XML, the parser in my library currently won't handle files of that size. If your file is NTriples/Turtle/N3 then you shouldn't have an issue, but be prepared to wait a couple of minutes (for example, a ~90 MB, 1 million triple dataset from the Berlin SPARQL Benchmark takes ~4 minutes to parse, though this is somewhat dependent on your machine).

This may actually be an issue in general: I'm not sure how the RDF/XML parser in SemWeb is implemented, so it may have similar issues to my own with very large files.

Note

Whether this is the best approach for reading your file depends on what you intend to do with the data once it is parsed. Both SemWeb and dotNetRDF may offer more efficient ways to read and process it for a given task.
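If you only need a single pass over the data (counting, filtering, copying it somewhere else), one option is to avoid building the whole store in memory. The following is a minimal sketch of that idea, not the API of either library: it assumes the file is N-Triples with one triple per line, uses only the .NET base class library, and does not validate syntax. The names NTriplesScan, CountTriples and file.nt are made up for illustration.

```csharp
using System;
using System.IO;

public static class NTriplesScan
{
    // Counts statement lines without holding the file in memory.
    // Assumption: N-Triples input, one triple per line, each ending in '.'.
    // Blank lines and '#' comment lines are skipped. Not a validating parser.
    public static long CountTriples(TextReader reader)
    {
        long count = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string trimmed = line.Trim();
            if (trimmed.Length == 0 || trimmed[0] == '#')
            {
                continue;
            }
            if (trimmed.EndsWith(".", StringComparison.Ordinal))
            {
                count++;
            }
        }
        return count;
    }

    public static void Main(string[] args)
    {
        // Hypothetical default file name; pass a real path as the first argument.
        string path = args.Length > 0 ? args[0] : "file.nt";
        using (StreamReader reader = new StreamReader(path))
        {
            Console.WriteLine(CountTriples(reader));
        }
    }
}
```

Only one line is held at a time, so memory use stays flat regardless of file size. The trade-off is that you get raw lines rather than parsed triples, so anything beyond simple scanning still needs a real parser.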

RobV
Hi, thanks for helping me. I'm going to give dotNetRDF a try, but with SemWeb I don't understand what statements are. And won't using MemoryStore try to load my whole file into memory?
Elias
Statement is just another name for Triple; it's the class name that SemWeb uses for triples. MemoryStore is an in-memory store for your data and is equivalent to the Graph class in dotNetRDF. Either way you'll be loading all your data into memory.
RobV
Hi, I'm trying to understand how to use dotNetRDF but I have a question. What if I don't want to just parse the "text" file but want to put it into a relational database? How can I do that? I saw on your site that you use a Virtuoso server, but I've never worked with it. Can I use SQL Server?
Elias