+2  A: 

setMaxResults does not work as you might expect with outer join SQL queries. Maybe this is your problem: "Hibernate does not return distinct results for a query with outer join fetching enabled for a collection (even if I use the distinct keyword)?"

Thomas Jung
+3  A: 

What is happening here can be seen very clearly by turning on SQL debugging in Hibernate and comparing the generated queries.

Using a fairly simple SaleItem one-to-many mapping (which is hopefully self-explanatory), a Criteria-based query like this:

Criteria c = sessionFactory.getCurrentSession().createCriteria(Sale.class);
c.createAlias("items", "i");
c.add(Restrictions.eq("i.name", "doll"));
c.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY);
c.setMaxResults(2);

produces SQL like this:

select top ? this_.saleId as saleId1_1_, ... 
from Sale this_ 
inner join Sale_Item items3_ on this_.saleId=items3_.Sale_saleId 
inner join Item items1_ on items3_.items_id=items1_.id 
where items1_.name=?

whereas a Query like this:

Query q = sessionFactory.getCurrentSession().createQuery("select distinct s from Sale s join s.items as i where i.name=:name");
q.setParameter("name", "doll");
q.setMaxResults(2);

produces something like:

select top ? distinct hibernated0_.saleId as saleId1_ 
from Sale hibernated0_ 
inner join Sale_Item items1_ on hibernated0_.saleId=items1_.Sale_saleId 
inner join Item hibernated2_ on items1_.items_id=hibernated2_.id 
where hibernated2_.name=?

Note the difference in the very first line (DISTINCT). A ResultTransformer like DISTINCT_ROOT_ENTITY is a Java class that processes the rows returned by the SQL after the SQL has been executed. Therefore, when you specify a maxResults, it is applied as a row limit on the SQL; the SQL includes a join onto the elements in the Collection, so you're limiting your SQL result to 20 sub-elements. Once the DISTINCT_ROOT_ENTITY transformer is applied, that may leave fewer than 20 root elements, depending purely on which root elements happen to come out first in the 20 joined rows.

DISTINCT in HQL behaves very differently, in that it actually uses the SQL DISTINCT keyword, which is applied before the row limit. Therefore, this behaves as you expect, and that explains the difference between the two.
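To make the ordering concrete, here is a minimal plain-Java sketch of the two behaviours. It does not use Hibernate at all: the class and method names are made up for illustration, and each integer simply stands for the saleId of one joined row.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.stream.Collectors;

class LimitVsDistinct {
    // Models the Criteria case: the row limit is applied in SQL first,
    // then duplicates are removed in Java (as DISTINCT_ROOT_ENTITY does).
    static List<Integer> limitThenDistinct(List<Integer> joinedRows, int maxResults) {
        List<Integer> limited = joinedRows.stream()
                .limit(maxResults)
                .collect(Collectors.toList());
        return new ArrayList<>(new LinkedHashSet<>(limited));
    }

    // Models the HQL case: DISTINCT is applied in SQL before the row limit.
    static List<Integer> distinctThenLimit(List<Integer> joinedRows, int maxResults) {
        return joinedRows.stream()
                .distinct()
                .limit(maxResults)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sale 1 has three matching items; sales 2 and 3 have one each.
        List<Integer> joinedRows = List.of(1, 1, 1, 2, 3);
        System.out.println(limitThenDistinct(joinedRows, 2)); // prints [1]
        System.out.println(distinctThenLimit(joinedRows, 2)); // prints [1, 2]
    }
}
```

With a limit of 2, the first variant returns only one sale because both rows that survive the limit belong to the same sale, while the second returns the two sales you asked for.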

In theory you should be looking at setProjection to apply a projection at the SQL level -- something like c.setProjection(Projections.distinct(Projections.rootEntity())) -- but unfortunately Projections.rootEntity() doesn't exist; I just made it up. Perhaps it should!

Cowan
Is there any known (efficient) workaround for this problem? This clearly makes the Criteria API unusable for us :(
Kim L
Not a good one, that I'm aware of. If you're not dealing with a large amount of raw data, you can always select them all, use `DISTINCT_ROOT_ENTITY`, and then take the first _n_ items of the resulting `List`. No, not scalable, and yes, rather yuck. Unfortunately I don't have any better suggestions -- unless anyone else knows of other possibilities, the Criteria API may not be for you in this particular case.
Cowan
That's the problem: queries with dynamically injected criteria, lots of data... *sigh*
Kim L
For anyone looking for an answer to this question, I made a workaround using two queries. First, I took my normal criteria with all the restrictions and limits I wanted, and added a projection: criteria.setProjection(Projections.distinct(Projections.id())); This criteria gave me a list of ids of the pojos I wanted to fetch. Then I created another criteria with Restrictions.in("id", ids). The second criteria used DISTINCT_ROOT_ENTITY, but not setMaxResults (as that had already been applied in the first query).
Kim L
Nice. Using 2 queries is a bit icky but that's about as good as you're going to get. Good solution.
Cowan
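The two-query workaround described in the comments above can be modelled in plain Java as follows. This is only a sketch of the idea, not Hibernate code: the class and method names are hypothetical, each int[] stands for one joined row as {saleId, itemId}, and the comments note which Criteria call each step is meant to mirror.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TwoQueryPaging {
    // Query 1: mirrors Projections.distinct(Projections.id()) plus setMaxResults(n).
    // The limit now counts distinct root ids rather than joined rows.
    static List<Integer> firstIds(List<int[]> joinedRows, int maxResults) {
        return joinedRows.stream()
                .map(row -> row[0])
                .distinct()
                .limit(maxResults)
                .collect(Collectors.toList());
    }

    // Query 2: mirrors Restrictions.in("id", ids) with DISTINCT_ROOT_ENTITY and
    // no row limit; every joined row of a selected root is kept and grouped.
    static Map<Integer, List<Integer>> loadByIds(List<int[]> joinedRows, List<Integer> ids) {
        Map<Integer, List<Integer>> sales = new LinkedHashMap<>();
        for (int[] row : joinedRows) {
            if (ids.contains(row[0])) {
                sales.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
            }
        }
        return sales;
    }

    public static void main(String[] args) {
        List<int[]> joinedRows = List.of(
                new int[] {1, 10}, new int[] {1, 11}, new int[] {1, 12},
                new int[] {2, 20}, new int[] {3, 30});
        List<Integer> ids = firstIds(joinedRows, 2);
        System.out.println(ids);                        // prints [1, 2]
        System.out.println(loadByIds(joinedRows, ids)); // prints {1=[10, 11, 12], 2=[20]}
    }
}
```

In real code it is probably worth skipping the second query when the first returns no ids, since an empty IN list is rejected by some databases.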