I'm seeing performance problems with retrieving multiple instances of objects that have many relationships with other objects. I'm using Spring and Hibernate's JPA implementation with MySQL. The issue is that when executing a JPA query, Hibernate does not automatically join to other tables. This results in n*r + 1 SQL queries, where n is the number of objects being retrieved and r is the number of relationships.
Example, a Person lives at an Address, has many Hobbies, and has visited many Countries:
@Entity
public class Person {
@Id public Integer personId;
public String name;
@ManyToOne public Address address;
@ManyToMany public Set<Hobby> hobbies;
@ManyToMany public Set<Country> countriesVisited;
}
When I perform a JPA query to get all Persons named Bob, and there are 100 Bobs in the database:
SELECT p FROM Person p WHERE p.name='Bob'
Hibernate translates this to 301 SQL queries:
SELECT ... FROM Person WHERE name='Bob'
SELECT ... FROM Address WHERE personId=1
SELECT ... FROM Address WHERE personId=2
...
SELECT ... FROM Hobby WHERE personId=1
SELECT ... FROM Hobby WHERE personId=2
...
SELECT ... FROM Country WHERE personId=1
SELECT ... FROM Country WHERE personId=2
...
According to the Hibernate FAQ (here and here), the solution is to specify LEFT JOIN or LEFT OUTER JOIN (for many-to-many) in the query. So now my query looks like:
SELECT p, a, h, c FROM Person p
LEFT JOIN p.address a LEFT OUTER JOIN p.hobbies h LEFT OUTER JOIN p.countriesVisited c
WHERE p.name = 'Bob'
This works, but there appears to be a bug if there's more than one LEFT OUTER JOIN in which case Hibernate is incorrectly looking for a non-existent column:
could not read column value from result set: personId69_2_; Column 'personId69_2_' not found.
The bug behavior appears to be possibly addressed by Hibernate Core bug HHH-3636. Unfortunately the fix is not part of any released Hibernate JAR. I've ran my application against the snapshot build but the bug behavior is still present. I've also built my own Hibernate Core JAR from the latest code in the repository and the bug behavior is still present. So maybe HHH-3636 doesn't address this.
This Hibernate performance limitation is very frustrating. If I query for 1000 objects then 1000*r + 1 SQL queries are made to the database. In my case I have 8 relationships so I get 8001 SQL queries, which results in horrible performance. The official Hibernate solution to this is to left join all relationships. But this isn't possible with more than one many-to-many relationships due to the bug behavior. So I'm stuck with left joins for many-to-one relationships and n*r+1 queries due to the many-to-many relationships. I plan to submit the LEFT OUTER JOIN problem as a Hibernate bug, but in the meantime my customer needs an app that has reasonable performance. I currently use a combination of batch fetch (BatchSize), ehcache and custom in-memory caching but the performance is still pretty poor (it improved retrieving 5000 objects from 30 to 8 seconds). The bottom line is that too many SQL queries are hitting the database.
So, my questions, is it possible to use Hibernate in performance sensitive applications where tables have multiple relationships with each other? I would love to hear how successful Hibernate uses address performance. Should I be hand writing SQL (which somewhat defeats the purpose of using Hibernate)? Should I de-normalize my database schema to reduce the number of joined tables? Should I not be using Hibernate if I need fast query performance? Is there something faster?