I have a scenario where my application has access to a session for limited time windows, during which it must fetch data from the database into memory, and then only use the in-memory data to serve requests.
The data model is a simple one-to-many association such as:
<class name="com.foo.Road" table="road">
<id name="oid" column="oid"/>
<map name="carCountMap" fetch="subselect">
<key column="road_oid" foreign-key="oid"/>
<index column="time_oid" type="long" />
<one-to-many class="com.foo.CarCount" />
</map>
<map name="truckCountMap" fetch="subselect">
<key column="road_oid" foreign-key="oid"/>
<index column="time_oid" type="long" />
<one-to-many class="com.foo.TruckCount" />
</map>
</class>
Now suppose that the car and truck count data exists for several years, which is much more than can fit in memory. Furthermore, I'm only really interested in loading the car counts in the last 3 months.
My question is, what is the best way to load this data using hibernate such that:
- road.getCarCountMap() returns the set of only the car counts in the last 3 months (may be empty)
- I don't end up with some crazy cartesian product that takes ages to process
- No
LazyInitializationException
s are thrown after I close the session
Some things I've tried are:
1. Make the carCountMap
collection eager and specify a where
attribute on the mapping such as:
<map name="carCountMap" fetch="subselect" lazy="false" where="time_oid > 1000">
(similar for truckCountMap
)
This fits in best with the collection semantics that I want, but unfortunately it forces me to hardcode a value, so I can't really refer to the last 3 months. time_oid
increases by 1 every day.
2. Define maps as lazy and use an hql query to manually join the 3 tables:
from Road r
left outer join fetch r.carCountMap ccm
left outer join fetch r.truckCoutnMap tcm
where (ccm.time.oid > :startDate)
or (tcm.time.oid > :startDate)
The problem with this is that the resulting query returns several millions of rows, whereas it should be 10k roads * 4 measurements per month (weekly) * 3 months = ~120k. This query completes in about an hour, which is ridiculous as approach #1 (which loads the exact same data as far as I'm concerned) completes in 3 minutes.
3. Define maps as lazy and load the roads first with a criteria, then run additional queries to fill in the collection
List roadList = session.createCriteria(Road.class).list();
session.getNamedQuery("fetchCcm").setLong("startDate", startDate).list();
session.getNamedQuery("fetchTcm").setLong("startDate", startDate).list();
return roadList;
This fires the right queries, but the retrieved car and truck counts don't get attached to the Road
objects in roadList
. So when I try to access the counts on any Road object, I get a LazyInitializationException
.
4. Define maps as lazy, use criteria.list()
to load all roads, iterate through all measurement dates in the last 3 months so force those values to be loaded.
I haven't tried this yet because it sounds really clunky, and I'm not convinced it'll get rid of the LazyInitializationException
- Are there any workarounds to the problems I've run into with these methods?
- Is there a better method altogether?