views:

931

answers:

2

Hi all. I'm using Hibernate Entity Manager 3.4.0.GA with Spring 2.5.6 and MySql 5.1. I have a use case where an entity called Artifact has a reflexive many-to-many relation with itself, and the join table is quite large (1 million lines). As a result, the HQL query performed by one of the methods in my DAO takes a long time. Any advice on how to optimize this and still use HQL ? Or do I have no choice but to switch to a native SQL query that would perform a join between the table ARTIFACT and the join table ARTIFACT_DEPENDENCIES ?

Here is the problematic query performed in the DAO :

@SuppressWarnings("unchecked")
   public List<Artifact> findDependentArtifacts(Artifact artifact) {
      Query query = em.createQuery("select a from Artifact a where :artifact in elements(a.dependencies)");
      query.setParameter("artifact", artifact);
      List<Artifact> list = query.getResultList();
      return list;
   }

And the code for the Artifact entity :

package com.acme.dependencytool.persistence.model;

import java.util.ArrayList;
import java.util.List;

import javax.persistence.CascadeType;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.JoinTable;
import javax.persistence.ManyToMany;
import javax.persistence.Table;
import javax.persistence.UniqueConstraint;

@Entity
@Table(name = "ARTIFACT", uniqueConstraints={@UniqueConstraint(columnNames={"GROUP_ID", "ARTIFACT_ID", "VERSION"})})
public class Artifact {

   @Id
   @GeneratedValue
   @Column(name = "ID")
   private Long id = null;

   @Column(name = "GROUP_ID", length = 255, nullable = false)
   private String groupId;

   @Column(name = "ARTIFACT_ID", length = 255, nullable = false)
   private String artifactId;

   @Column(name = "VERSION", length = 255, nullable = false)
   private String version;

   @ManyToMany(cascade=CascadeType.ALL, fetch=FetchType.EAGER)
   @JoinTable(
         name="ARTIFACT_DEPENDENCIES",
         joinColumns = @JoinColumn(name="ARTIFACT_ID", referencedColumnName="ID"),
         inverseJoinColumns = @JoinColumn(name="DEPENDENCY_ID", referencedColumnName="ID")
   )
   private List<Artifact> dependencies = new ArrayList<Artifact>();

   public Long getId() {
      return id;
   }

   public void setId(Long id) {
      this.id = id;
   }

   public String getGroupId() {
      return groupId;
   }

   public void setGroupId(String groupId) {
      this.groupId = groupId;
   }

   public String getArtifactId() {
      return artifactId;
   }

   public void setArtifactId(String artifactId) {
      this.artifactId = artifactId;
   }

   public String getVersion() {
      return version;
   }

   public void setVersion(String version) {
      this.version = version;
   }

   public List<Artifact> getDependencies() {
      return dependencies;
   }

   public void setDependencies(List<Artifact> dependencies) {
      this.dependencies = dependencies;
   }
}

Thanks in advance.

EDIT 1 : The DDLs are generated automatically by Hibernate EntityMananger based on the JPA annotations in the Artifact entity. I have no explicit control on the automaticaly-generated join table, and the JPA annotations don't let me explicitly set an index on a column of a table that does not correspond to an actual Entity (in the JPA sense). So I guess the indexing of table ARTIFACT_DEPENDENCIES is left to the DB, MySQL in my case, which apparently uses a composite index based on both clumns but doesn't index the column that is most relevant in my query (DEPENDENCY_ID).

mysql> describe ARTIFACT_DEPENDENCIES;
+---------------+------------+------+-----+---------+-------+
| Field         | Type       | Null | Key | Default | Extra |
+---------------+------------+------+-----+---------+-------+
| ARTIFACT_ID   | bigint(20) | NO   | MUL | NULL    |       |
| DEPENDENCY_ID | bigint(20) | NO   | MUL | NULL    |       |
+---------------+------------+------+-----+---------+-------+

EDIT 2 : When turning on showSql in the Hibernate session, I see many occurences of the same type of SQL query, as below :

select dependenci0_.ARTIFACT_ID as ARTIFACT1_1_, dependenci0_.DEPENDENCY_ID as DEPENDENCY2_1_, artifact1_.ID as ID1_0_, artifact1_.ARTIFACT_ID as ARTIFACT2_1_0_, artifact1_.GROUP_ID as GROUP3_1_0_, artifact1_.VERSION as VERSION1_0_ from ARTIFACT_DEPENDENCIES dependenci0_ left outer join ARTIFACT artifact1_ on dependenci0_.DEPENDENCY_ID=artifact1_.ID where dependenci0_.ARTIFACT_ID=?

Here's what EXPLAIN in MySql says about this type of query :

mysql> explain select dependenci0_.ARTIFACT_ID as ARTIFACT1_1_, dependenci0_.DEPENDENCY_ID as DEPENDENCY2_1_, artifact1_.ID as ID1_0_, artifact1_.ARTIFACT_ID as ARTIFACT2_1_0_, artifact1_.GROUP_ID as GROUP3_1_0_, artifact1_.VERSION as VERSION1_0_ from ARTIFACT_DEPENDENCIES dependenci0_ left outer join ARTIFACT artifact1_ on dependenci0_.DEPENDENCY_ID=artifact1_.ID where dependenci0_.ARTIFACT_ID=1;
+----+-------------+--------------+--------+-------------------+-------------------+---------+---------------------------------------------+------+-------+
| id | select_type | table        | type   | possible_keys     | key               | key_len | ref                                         | rows | Extra |
+----+-------------+--------------+--------+-------------------+-------------------+---------+---------------------------------------------+------+-------+
|  1 | SIMPLE      | dependenci0_ | ref    | FKEA2DE763364D466 | FKEA2DE763364D466 | 8       | const                                       |  159 |       |
|  1 | SIMPLE      | artifact1_   | eq_ref | PRIMARY           | PRIMARY           | 8       | dependencytooldb.dependenci0_.DEPENDENCY_ID |    1 |       |
+----+-------------+--------------+--------+-------------------+-------------------+---------+---------------------------------------------+------+-------+

EDIT 3 : I tried setting the FetchType to LAZY in the JoinTable annotation, but I then get the following exception :

Hibernate: select artifact0_.ID as ID1_, artifact0_.ARTIFACT_ID as ARTIFACT2_1_, artifact0_.GROUP_ID as GROUP3_1_, artifact0_.VERSION as VERSION1_ from ARTIFACT artifact0_ where artifact0_.GROUP_ID=? and artifact0_.ARTIFACT_ID=?
51545 [btpool0-2] ERROR org.hibernate.LazyInitializationException - failed to lazily initialize a collection of role: com.acme.dependencytool.persistence.model.Artifact.dependencies, no session or session was closed
org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: com.acme.dependencytool.persistence.model.Artifact.dependencies, no session or session was closed
    at org.hibernate.collection.AbstractPersistentCollection.throwLazyInitializationException(AbstractPersistentCollection.java:380)
    at org.hibernate.collection.AbstractPersistentCollection.throwLazyInitializationExceptionIfNotConnected(AbstractPersistentCollection.java:372)
    at org.hibernate.collection.AbstractPersistentCollection.readSize(AbstractPersistentCollection.java:119)
    at org.hibernate.collection.PersistentBag.size(PersistentBag.java:248)
    at com.acme.dependencytool.server.DependencyToolServiceImpl.createArtifactViewBean(DependencyToolServiceImpl.java:93)
    at com.acme.dependencytool.server.DependencyToolServiceImpl.createArtifactViewBean(DependencyToolServiceImpl.java:109)
    at com.acme.dependencytool.server.DependencyToolServiceImpl.search(DependencyToolServiceImpl.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.google.gwt.user.server.rpc.RPC.invokeAndEncodeResponse(RPC.java:527)
    at com.google.gwt.user.server.rpc.RemoteServiceServlet.processCall(RemoteServiceServlet.java:166)
    at com.google.gwt.user.server.rpc.RemoteServiceServlet.doPost(RemoteServiceServlet.java:86)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:362)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:729)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.handler.RequestLogHandler.handle(RequestLogHandler.java:49)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:324)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:505)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:843)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:647)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:205)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:380)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:395)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:488)
A: 

Check that indices on your join table are defined.

Dan
The DDLs are automaticaly generated by Hibernate, see my edit above at the end of my question (for soem reason I couldn't manage to post it here in a comment).
Fabien
Exactly my point. You say "So I guess the indexing of table ARTIFACT_DEPENDENCIES is left to the DB...". But it doesn't happen magically, see CREATE INDEX statement: http://dev.mysql.com/doc/refman/5.1/en/create-index.html and issue appropriate commands to the DB.
Dan
A: 

As Dan correctly points out, a missing index or foreign key to the join table is the most likely problem.

If proper foreign keys don't help, take a look at the SQL generated by Hibernate (with the showSql configuration switch; also see Spring config sample below), and run it through EXPLAIN to see if your indexes are used properly.

<bean id="entityManagerFactory"
    class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="jpaVendorAdapter">
        <bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter">
            <property name="showSql" value="true" />
        </bean>
</bean>

Edit: This is the query you're seeing often, reformatted:

select a.* from ARTIFACT_DEPENDENCIES d 
    left outer join ARTIFACT a on d.DEPENDENCY_ID=a.ID 
        where d.ARTIFACT_ID=?

This looks exactly like the query that I would expect from a single execution of your findDependentArtifacts method.

Maybe fetch=FetchType.EAGER is the problem, as you're always immediately loading all the dependencies of all the artifacts, and then their dependencies, and so on, which could mean that you're loading the full dependency graph without intention. What happens if you switch to fetch=FetchType.LAZY?

Henning
I get a great number of occurences of the same type of query. I can't ost everything here so I've edited my question, see Edit 1 and 2 above in the question.
Fabien
I tried setting the fetch type to LAZY, but I get a LazyInitializationException (see EDIT 3 in my question)
Fabien