What is the difference between DISTINCT and REDUCED in SPARQL?

+2  A: 

REDUCED is like a 'best effort' DISTINCT. Whereas DISTINCT guarantees no duplicated results, REDUCED may eliminate some, all, or no duplicates.

What's the point? Well, DISTINCT can be expensive; REDUCED can do the straightforward de-duplication work (e.g. remove immediately repeated results) without having to remember every row. In many applications that's good enough.
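To make the cost difference concrete, here is a minimal Python sketch (not taken from any real SPARQL engine). The function names `distinct` and `reduced_adjacent` are made up for illustration, and dropping only immediately repeated rows is just one strategy an engine might be permitted to use for REDUCED.

```python
def distinct(rows):
    """DISTINCT-style: must remember every row seen so far."""
    seen = set()
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row


def reduced_adjacent(rows):
    """One possible REDUCED strategy (an assumption, not a standard):
    drop a row only if it equals the row just before it."""
    previous = object()  # sentinel that compares unequal to any real row
    for row in rows:
        if row != previous:
            yield row
        previous = row


rows = ["a", "a", "b", "a", "b", "b"]
distinct_rows = list(distinct(rows))         # every duplicate removed
reduced_rows = list(reduced_adjacent(rows))  # only adjacent repeats removed
```

The first function needs memory proportional to the number of unique rows; the second keeps a single row of state, which is why it can leave some duplicates behind.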

Having said that, I've never used REDUCED, I've never seen anyone use it, and I've never seen it mentioned in a talk or tutorial.

Just found this: http://www.franz.com/agraph/support/documentation/current/twinql-tutorial.html#header3-92 says: "If you do not need duplicates to be removed, but you do not need the redundant entries, either — which would be the case if you are relying on counts to be correct, for example — then you can specify REDUCED instead of DISTINCT. **This allows AllegroGraph to discard duplicate values if it's advantageous to do so.**"
Tomalak
+1  A: 

In my mind (and in my own SPARQL implementation), REDUCED is effectively an optional DISTINCT constraint that is applied only if the engine deems it necessary, i.e. the query engine decides, based on the query, whether or not to eliminate duplicate results.

In my own implementation, a query that uses REDUCED only has its duplicates eliminated if it also uses OFFSET/LIMIT.
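As a rough Python sketch of that kind of policy (this is not the actual implementation; `apply_reduced` and its `has_limit_or_offset` flag are hypothetical names, and treating the de-duplication as a full DISTINCT is an assumption):

```python
def apply_reduced(rows, has_limit_or_offset):
    """Hypothetical REDUCED policy: de-duplicate only when the query
    also carries OFFSET/LIMIT; otherwise pass the rows through."""
    if not has_limit_or_offset:
        return list(rows)  # REDUCED permits leaving duplicates in place
    # dict.fromkeys keeps the first occurrence of each row, in order
    return list(dict.fromkeys(rows))


untouched = apply_reduced(["x", "x", "y"], has_limit_or_offset=False)
deduped = apply_reduced(["x", "x", "y"], has_limit_or_offset=True)
```

Both results are valid answers to a REDUCED query, since the engine is free to remove some, all, or none of the duplicates.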

RobV