Java Persistence with Hibernate shows lots of examples of how to eagerly fetch associated entities, such as:

  • Adding @org.hibernate.annotations.BatchSize to the associated class
  • Adding @org.hibernate.annotations.Fetch to the field that references the associated class
  • Using the "fetch" keyword in the HQL query, etc...

However, in my case I am dealing with a slow-running process that is responsible for building the associations to the entity class of interest. That means that, at the time of execution, I can't query one entity and ask it to eagerly fetch all the associated instances of the other entity, since no such association exists yet.

In other words, the process looks something like this:

public class InitConfig {

    private final SessionFactory sessionFactory;
    private final NodeManager nodeManager;

    public void run() {

        final Configuration config = new Configuration("NewConfiguration", new HashSet<Node> ());
        for(String name : lotsOfNames) {

            //Lots of these queries run slowly
            final Node node = this.nodeManager.getNode(name);
            config.addNode(node);
        }
        this.sessionFactory.getCurrentSession().save(config);
    }
}

The related DAO (NodeManager) and slow-querying Entity (Node) look like this:

public class NodeManager {

    private final SessionFactory sessionFactory;

    public Node getNode(String name) {

        final Session db = this.sessionFactory.getCurrentSession();
        final Query query = db.createQuery("from NODE where NAME = :name");
        query.setString("name", name);
        return (Node)query.uniqueResult();
    }
}

@Entity(name="NODE")
public class Node {

    @GeneratedValue(strategy=GenerationType.TABLE)
    @Id @Column(name="ID")
    private Long id;

    private @Column(name="NAME", nullable=false, unique=true) String name;

    //Other properties and associations...
}

And finally, the entity being created by the slow-running process:

@Entity(name="CONFIGURATION")
public class Configuration {

    @Id @GeneratedValue @Column(name="ID")
    private Long id;
    private @Column(name="NAME", nullable=false, unique=true) String name;

    @ManyToMany
    @JoinTable(
        name="CONFIGURATION_NODE",
        joinColumns=@JoinColumn(name="CONFIGURATION_ID", nullable=false),
        inverseJoinColumns=@JoinColumn(name="NODE_ID", nullable=false))
    private Set<Node> nodes = new HashSet<Node> ();

    public void addNode(Node node) {

        this.nodes.add(node);
    }
}

My question is this: How do I modify the Hibernate configuration and/or code to eagerly fetch many instances of Node at a time?

Follow on questions would be:

  • Is Hibernate's 2nd level cache appropriate for this, and if so - how do I configure it?
  • If not, is there some other Hibernate feature that can be used here?

There are roughly 100,000 Nodes at the moment, so I'm reluctant to take the brute-force approach of querying every single Node and caching it somewhere in the application. That would not scale to higher numbers of Nodes, and it seems like it would duplicate data between the application and Hibernate's internals (Session, 2nd level cache, etc.).

+1  A: 

I'm not 100% clear on what you're trying to do here... Does Configuration.addNode() (or some other method you've not shown) involve some business logic?

If it does not, you'll likely be much better off running several batch update queries to associate your nodes with the configuration.

If it does (meaning you have to load and inspect each node), you can:

  1. Split your entire lotsOfNames list into batches of, say, 1000 names (you can play with this number to see what gives optimal performance).
  2. Load the entire batch of nodes at once via a single query with a name IN (?) condition.
  3. Add them to your configuration.
  4. Flush the session (and, optionally, commit and start a new transaction).

This should be a lot faster, since you'll be running 1 query instead of 1000 and batching your inserts together.
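
To make the batching step concrete, here is a minimal sketch of splitting the name list into fixed-size batches. The class name NameBatches and its partition method are hypothetical helpers, not part of Hibernate or of the code in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Hibernate class): splits a list of node names
// into consecutive batches so each batch can be loaded with one IN query.
final class NameBatches {

    private NameBatches() {}

    /** Returns consecutive sublists of at most batchSize names each. */
    static List<List<String>> partition(List<String> names, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        final List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < names.size(); start += batchSize) {
            final int end = Math.min(start + batchSize, names.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(names.subList(start, end)));
        }
        return batches;
    }
}
```

Each batch could then be bound to a named list parameter, for example with query.setParameterList("names", batch) on a query such as "from NODE where NAME in (:names)" (assuming the Hibernate 3 style Query API used in the question). Every returned Node would be passed to config.addNode(...), and the session flushed once per batch.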

ChssPly76
Configuration.addNode() just calls Configuration.nodes.add(Node). There's not really much business logic, except that there may be some names in `lotsOfNames` that don't have a matching Node. The batch update query approach sounds like a good idea. Will that still work if I pass it node names that might not exist?
Kyle Krull
Yes, very much so: `insert into CONFIGURATION_NODE(CONFIGURATION_ID, NODE_ID) select :configurationId, ID from NODE where NAME in (:name)`. Names without a matching Node simply select no rows, so nothing is inserted for them. You'll need to run this as native SQL, and you'll still need to split your big list into batches of 1000 or so (unless you're obtaining it from some table to begin with, in which case you can just join to it in this query).
ChssPly76
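
A rough sketch of driving that native insert in batches might look like the following. The class BatchedAssociationInsert and the executeBatch callback are hypothetical names introduced for illustration. The callback stands in for the actual Hibernate call so the batching loop can be shown on its own.

```java
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical driver (not a Hibernate class): feeds the name list to a
// caller-supplied callback in batches and sums the reported row counts.
final class BatchedAssociationInsert {

    // SQL from the comment above, with the list parameter named :names here.
    static final String INSERT_SQL =
        "insert into CONFIGURATION_NODE (CONFIGURATION_ID, NODE_ID) "
        + "select :configurationId, ID from NODE where NAME in (:names)";

    private BatchedAssociationInsert() {}

    /** Calls executeBatch once per batch and returns the total rows reported. */
    static int insertAll(List<String> names, int batchSize,
                         ToIntFunction<List<String>> executeBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int total = 0;
        for (int start = 0; start < names.size(); start += batchSize) {
            final int end = Math.min(start + batchSize, names.size());
            // subList is a view of the source list; no copy is made here.
            total += executeBatch.applyAsInt(names.subList(start, end));
        }
        return total;
    }
}
```

With Hibernate, the callback would presumably be something like batch -> session.createSQLQuery(INSERT_SQL).setParameter("configurationId", configId).setParameterList("names", batch).executeUpdate(), where configId is the ID of an already saved Configuration. Names that match no Node add nothing to the returned total.
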
Thanks! Using batch updates in native SQL is much faster than the approach I was using before. I'd be curious to see if there's an HQL version of this that's more portable, but this works for now.
Kyle Krull
You can still define your native SQL as a named query and execute it using Hibernate facilities (perhaps you do that already). As long as you keep to standard SQL, it's as portable as it gets :-) You can't do this in HQL because your association table is not a first-class citizen in Hibernate's eyes, so inserts into it will not be allowed.
ChssPly76