I'd like to learn more about RDF/SPARQL implementation internals, but most of the frameworks are (necessarily) somewhat complicated by real-world performance and implementation considerations. Is there a "reference" implementation that would be suitable as a low-level teaching tool? Which RDF/SPARQL implementation is the smallest and cleanest from a code standpoint?
I have not seen mention of an official reference implementation.
But maybe this will help... have you taken a look at the "SPARQL Query Language Implementation Report"? It compares 14 SPARQL implementations against a common test suite.
- Make an effort to read the RDF Primer; it is an easy read.
- The SPARQL Query Language is pretty easy to understand.
- Google for "SPARQL endpoint" to play with SPARQL. You will find several public ones.
- I have some resources on my wiki page; you may start there.
- There are good books on the Semantic Web available; ask me for more information.
Don't be afraid to start. Take any RDF engine, define a task and program it! I recommend starting with Sesame.
Finding a small and clean SPARQL implementation is going to be hard, since the language is quite complex and expressive, and most implementations (my own included) add a variety of extensions to the syntax as demanded by customers or perceived usage scenarios.
AFAIK Jena's documentation provides the most comprehensive description of how a SPARQL implementation actually functions but, as you say, it's rather complex.
In terms of just understanding and teaching SPARQL, getting your head around the SPARQL Algebra is very important. If you understand the algebra, you can work out by hand how a query should translate into algebra and then work through executing it by hand - obviously I don't recommend trying this for anything other than relatively simple queries on very small datasets!
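To make the "by hand" exercise concrete, here is a minimal Python sketch of evaluating a basic graph pattern and a projection over a tiny in-memory graph. It is a teaching toy under loud assumptions, not how any real engine works: terms are plain strings rather than IRIs and literals, and the names `match_triple`, `eval_bgp`, `project` and `GRAPH` are all made up for this illustration. It only covers the BGP and Project parts of the algebra.

```python
# Toy evaluator for SPARQL basic graph patterns (BGPs).
# Assumption: terms are plain strings; anything starting with "?" is a variable.

def is_var(term):
    return term.startswith("?")

def match_triple(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.

    Returns the extended binding, or None if the match fails.
    """
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against its existing value.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result

def eval_bgp(patterns, graph):
    """Evaluate triple patterns one at a time, starting from the empty binding."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = match_triple(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions

def project(solutions, variables):
    """Keep only the requested variables in each solution."""
    return [{v: s[v] for v in variables if v in s} for s in solutions]

# Hypothetical data, invented for this example.
GRAPH = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "name", "Alice"),
    ("bob", "name", "Bob"),
]

# SELECT ?name WHERE { alice knows ?x . ?x name ?name }
# As algebra, roughly: Project(BGP(alice knows ?x . ?x name ?name), {?name})
patterns = [("alice", "knows", "?x"), ("?x", "name", "?name")]
result = project(eval_bgp(patterns, GRAPH), ["?name"])
print(result)  # [{'?name': 'Bob'}]
```

Tracing it on paper: the first pattern binds `?x` to `bob`, the second then binds `?name` to `Bob`, and the projection drops `?x`.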
Another key thing to teach is that the language is not procedural: an implementation is free to reorder and adjust the query in any way it sees fit, provided this does not change the actual meaning of the query.
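A small experiment can illustrate the reordering point. The sketch below, again in Python with made-up names (`extend`, `solve`, `canonical`, `GRAPH`) and plain strings standing in for RDF terms, evaluates the same three triple patterns in every possible order. The solutions should come out identical each time, while the number of comparisons the naive evaluator makes differs, which is the kind of freedom a real optimizer exploits. The cost counter is only an assumption-laden stand-in for real cost models.

```python
from itertools import permutations

def extend(pattern, triple, binding):
    """Return `binding` extended so `pattern` matches `triple`, or None."""
    out = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):          # variable: bind or check consistency
            if out.setdefault(p, t) != t:
                return None
        elif p != t:                   # constant: must match exactly
            return None
    return out

def solve(patterns, graph):
    """Evaluate patterns left to right; return (solutions, comparisons made)."""
    solutions, comparisons = [{}], 0
    for pattern in patterns:
        step = []
        for binding in solutions:
            for triple in graph:
                comparisons += 1
                extended = extend(pattern, triple, binding)
                if extended is not None:
                    step.append(extended)
        solutions = step
    return solutions, comparisons

def canonical(solutions):
    """Order-independent form of a solution list, so results can be compared."""
    return tuple(sorted(tuple(sorted(s.items())) for s in solutions))

# Hypothetical data, invented for this example.
GRAPH = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "name", "Alice"),
    ("bob", "name", "Bob"),
]

PATTERNS = [
    ("?x", "name", "?name"),
    ("alice", "knows", "?x"),
    ("?x", "knows", "?y"),
]

answers, costs = set(), []
for order in permutations(PATTERNS):
    solutions, comparisons = solve(list(order), GRAPH)
    answers.add(canonical(solutions))
    costs.append(comparisons)

all_agree = len(answers) == 1
print("every ordering gives the same solutions:", all_agree)
print("comparisons per ordering:", costs)
```

Note that this only demonstrates reordering within a single basic graph pattern; operators such as OPTIONAL and FILTER have their own rules about what may be moved safely.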