The implementing-result-paging-in-hibernate-getting-total-number-of-rows question triggered another question for me, about an implementation concern:

Now that you know you have to reuse part of the HQL query to do the count, how do you reuse it efficiently?

The differences between the two HQL queries are:

  1. the selection is count(?), instead of the pojo or property (or list of)
  2. the fetches should not happen, so some tables should not be joined
  3. the order by should disappear

Are there other differences?

Do you have coding best-practices to achieve this reuse efficiently (concerns: effort, clarity, performance)?

Example for a simple HQL query:

    select       a     from A a join fetch a.b b where a.id=66 order by a.name
    select count(a.id) from A a                  where a.id=66


UPDATED

I received answers on:

  • using Criteria (but we use HQL mostly)
  • manipulating the String query (but everybody agrees it seems complicated and not very safe)
  • wrapping the query, relying on database optimization (but there is a feeling that this is not safe)

I was hoping someone would give options along another path, more related to String concatenation.
Could we build both HQL queries using common parts?

+2  A: 

Well, I'm not sure this is a best practice, but it is my practice :)

If I have a query like:

    select A.f1,A.f2,A.f3 from A, B where A.f2=B.f2 order by A.f1, B.f3

And I just want to know how many results I will get, I execute:

    select count(*) from ( select A.f1, ... order by A.f1, B.f3 )

And then get the result as an Integer, without mapping results in a POJO.

Parsing your query to remove some parts, such as 'order by', is very complicated. A good RDBMS will optimize the query for you.

Good question.

Sinuhe
Thanks for your support :-) My typical query is HQL, either returning pojos or a list of properties.
KLE
Like you, I don't feel like parsing the query to remove some parts; but I don't really trust the RDBMS to optimize the query in all cases. I think some cases get messed up, and it's hard to predict which. **Is there a list of facts about these optimizations?**
KLE
I don't know where to find such a list or whether this information is public. I can't be completely sure that an RDBMS optimizes this way. But as a student, I saw some "old and basic" optimizations that seem more difficult than this one. In this case, for example, the RDBMS would think: "They are asking me for a number of rows, so the ordering can be skipped".
Sinuhe
+2  A: 

Have you tried making your intentions clear to Hibernate by setting a projection on your (SQL?)Criteria? I've mostly been using Criteria, so I'm not sure how applicable this is to your case, but I've been using

    getSession().createCriteria(persistentClass)
        .setProjection(Projections.rowCount())
        .uniqueResult()

and letting Hibernate figure out the caching / reusing / smart stuff by itself. I'm not really sure how much smart stuff it actually does, though. Anyone care to comment on this?

Tim
Hibernate doesn't cache queries by itself; you have to do so explicitly. The problem with the above approach (and with using Criteria in general) is that the layer assembling the criteria has to create another copy of it just for counting. In other words, I can't just create a Criteria in the business layer, pass it to a service (or DAO) and get back one page of results plus the total count. Not a huge deal for small apps, but it leads to a LOT of unnecessary code in bigger ones.
ChssPly76
A: 

In a freehand HQL situation I would use something like this, but it is not reusable because it is quite specific to the given entities:

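    // count(*) comes back as a Long in Hibernate 3.2+, so go through Number instead of casting to Integer
    int count = ((Number) session.createQuery("select count(*) from ....").uniqueResult()).intValue();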

Do this once and adjust the starting offset accordingly as you page through.

For Criteria, though, I use something like this:

    final Criteria criteria = session.createCriteria(clazz);
    // Apply the filter restrictions assembled from the command object.
    List<Criterion> restrictions = factory.assemble(command.getFilter());
    for (Criterion restriction : restrictions) {
        criteria.add(restriction);
    }
    criteria.add(Restrictions.conjunction());
    if (this.projections != null) {
        criteria.setProjection(factory.loadProjections(this.projections));
    }
    criteria.addOrder(command.getDir().equals("ASC")
            ? Order.asc(command.getSort())
            : Order.desc(command.getSort()));
    // Jump to the last row; its row number gives the total count.
    ScrollableResults scrollable = criteria.scroll(ScrollMode.SCROLL_INSENSITIVE);
    if (scrollable.last()) { // true if there is at least one row
        genericDTO.setTotalCount(scrollable.getRowNumber() + 1); // row numbers are zero-based
        // Re-run the same criteria, limited to the requested page.
        criteria.setFirstResult(command.getStart())
                .setMaxResults(command.getLimit());
        genericDTO.setLineItems(Collections.unmodifiableList(criteria.list()));
    }
    scrollable.close();
    return genericDTO;

But this does the count on every call, because it always invokes ScrollableResults.last().

non sequitor
+1  A: 

Nice question. Here's what I've done in the past (many things you've mentioned already):

  1. Check whether SELECT clause is present.
    1. If it's not, add select count(*)
    2. Otherwise check whether it has DISTINCT or aggregate functions in it. If you're using ANTLR to parse your query, it's possible to work around those but it's quite involved. You're likely better off just wrapping the whole thing with select count(*) from ().
  2. Remove fetch all properties
  3. Remove fetch from joins if you're parsing HQL as a string. If you're truly parsing the query with ANTLR you can remove the left join entirely, although checking all possible references to it is rather messy.
  4. Remove order by
  5. Depending on what you've done in 1.2 you'll need to remove / adjust group by / having.
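For the simple cases, the string-based variant of these steps could be sketched as follows. This is only an illustration: `CountHqlSketch` and `toCountHql` are hypothetical names, the regular expressions know nothing about subqueries or string literals that happen to contain `order by`, and anything with DISTINCT, aggregates, group by or having is rejected rather than rewritten.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: derives a count query from a simple HQL string. */
final class CountHqlSketch {
    // Splits "select <list> from <rest>" into the select list and everything after "from".
    private static final Pattern SELECT_LIST =
            Pattern.compile("(?i)^select\\s+(.*?)\\s+from\\s+(.*)$");

    private CountHqlSketch() {}

    static String toCountHql(String hql) {
        // Collapse whitespace so the patterns below can assume single spaces.
        String q = hql.trim().replaceAll("\\s+", " ");
        q = q.replaceAll("(?i) fetch all properties\\b", ""); // step 2
        q = q.replaceAll("(?i)\\bjoin fetch\\b", "join");     // step 3
        q = q.replaceAll("(?i) order by .*$", "");            // step 4

        String lower = q.toLowerCase(Locale.ROOT);
        if (lower.contains(" group by ") || lower.contains(" having ")) {
            // Step 5 needs real parsing; this sketch gives up instead.
            throw new UnsupportedOperationException("group by / having not handled");
        }

        Matcher m = SELECT_LIST.matcher(q);
        if (!m.matches()) {
            return "select count(*) " + q; // step 1.1: no select clause
        }
        String selectList = m.group(1);
        if (selectList.toLowerCase(Locale.ROOT).startsWith("distinct ")
                || selectList.contains("(")) {
            // Step 1.2: DISTINCT or a function call in the select list.
            throw new UnsupportedOperationException("distinct / aggregates not handled");
        }
        return "select count(*) from " + m.group(2);
    }
}
```

Applied to the query from the question, `toCountHql("select a from A a join fetch a.b b where a.id=66 order by a.name")` gives `select count(*) from A a join a.b b where a.id=66`. The join itself is kept, since dropping it safely is the part that needs a real parser.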

The above applies to HQL, naturally. For Criteria queries you're quite limited in what you can do, because Criteria doesn't lend itself to manipulation easily. If you're using some sort of wrapper layer on top of Criteria, you will end up with the equivalent of a (limited) subset of ANTLR parsing results and could apply most of the above in that case.

Since you'd normally hold on to the offset of your current page and the total count, I usually run the actual query with the given limit / offset first and only run the count(*) query if the number of results returned is greater than or equal to the limit AND the offset is zero (in all other cases I've either run the count(*) before or I've got all the results back anyway). This is an optimistic approach with regard to concurrent modifications, of course.
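The decision described above might look roughly like this. `TotalCountPolicy`, `resolveTotal` and its parameters are made-up names for illustration; `countQuery` stands in for whatever actually executes the count(*) query, and `knownTotal` is the total the caller kept from an earlier page.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper: decides whether the count(*) query has to run at all. */
final class TotalCountPolicy {
    private TotalCountPolicy() {}

    /**
     * @param offset     offset of the page that was just fetched
     * @param limit      page size that was requested
     * @param fetched    number of rows the page query actually returned
     * @param knownTotal total remembered from an earlier page (used when offset > 0)
     * @param countQuery runs the count(*) query; invoked only when needed
     */
    static long resolveTotal(int offset, int limit, int fetched,
                             long knownTotal, LongSupplier countQuery) {
        if (offset > 0) {
            // Not the first page: the count was already settled on an earlier page.
            return knownTotal;
        }
        if (fetched < limit) {
            // First page came back short, so it already holds every row.
            return fetched;
        }
        // First page is full: there may be more rows, so the count query is needed.
        return countQuery.getAsLong();
    }
}
```

The count query is passed in as a supplier so that it is never executed in the branches that don't need it.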

Update (on hand-assembling HQL)

I don't particularly like that approach. When mapped as named query, HQL has the advantage of build-time error checking (well, run-time technically, because SessionFactory has to be built although that's usually done during integration testing anyway). When generated at runtime it fails at runtime :-) Doing performance optimizations isn't exactly easy either.

Same reasoning applies to Criteria, of course, but it's a bit harder to screw up due to well-defined API as opposed to string concatenation. Building two HQL queries in parallel (paged one and "global count" one) also leads to code duplication (and potentially more bugs) or forces you to write some kind of wrapper layer on top to do it for you. Both ways are far from ideal. And if you need to do this from client code (as in over API), the problem gets even worse.
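To make the "wrapper layer" option concrete, here is a minimal sketch of holding the common parts once and rendering both queries from them. `PagedHql` and its fields are hypothetical, not an existing API, and the sketch assumes that the where clause never refers to an alias introduced only by a fetch join (joins needed for filtering would have to go into the shared from part).

```java
/** Hypothetical wrapper: keeps the shared HQL parts and renders the paged and count variants. */
final class PagedHql {
    private final String select;     // page query only, e.g. "a"
    private final String countExpr;  // count query only, e.g. "a.id"
    private final String from;       // shared, e.g. "A a"
    private final String fetchJoins; // page query only, e.g. "join fetch a.b b"; may be empty
    private final String where;      // shared, e.g. "a.id = :id"; may be empty
    private final String orderBy;    // page query only, e.g. "a.name"; may be empty

    PagedHql(String select, String countExpr, String from,
             String fetchJoins, String where, String orderBy) {
        this.select = select;
        this.countExpr = countExpr;
        this.from = from;
        this.fetchJoins = fetchJoins;
        this.where = where;
        this.orderBy = orderBy;
    }

    /** Full query: selection, fetch joins, shared restriction, ordering. */
    String pageQuery() {
        StringBuilder hql = new StringBuilder("select ").append(select)
                .append(" from ").append(from);
        if (!fetchJoins.isEmpty()) {
            hql.append(' ').append(fetchJoins);
        }
        appendWhere(hql);
        if (!orderBy.isEmpty()) {
            hql.append(" order by ").append(orderBy);
        }
        return hql.toString();
    }

    /** Count query: same from and where parts, no fetches, no ordering. */
    String countQuery() {
        StringBuilder hql = new StringBuilder("select count(").append(countExpr)
                .append(") from ").append(from);
        appendWhere(hql);
        return hql.toString();
    }

    private void appendWhere(StringBuilder hql) {
        if (!where.isEmpty()) {
            hql.append(" where ").append(where);
        }
    }
}
```

With `new PagedHql("a", "a.id", "A a", "join fetch a.b b", "a.id = :id", "a.name")`, `pageQuery()` yields `select a from A a join fetch a.b b where a.id = :id order by a.name` and `countQuery()` yields `select count(a.id) from A a where a.id = :id`. The drawbacks mentioned above still apply: neither string is validated until Hibernate parses it at runtime.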

I've actually pondered quite a bit on this issue. Search API from Hibernate-Generic-DAO seems like a reasonable compromise; there are more details in my answer to the above linked question.

ChssPly76
+1 Thanks for these details on manipulating the query. Thanks also for the excellent point that **a count query should be run only after a first query**.
KLE
I updated my question; would you add another answer related to the new part? I liked many of your other posts, and you are a Java expert :-) ...
KLE
Thanks :-) I've updated my answer above.
ChssPly76