views:

522

answers:

2

Let's say I have entities A, B, C and each A has many B and C entities. I want to query a load of A entities based on some criterea, and I know I will be accessing all B and C entities for each A I return.

Something like select a from A as a join fetch a.b join fetch a.c would seem to make sense at first, but this creates a huge product if the numbers of B and C entities are large. Extending this to another associated entities makes the query totally unreasonable.

If I leave JPA to its own devices, I end up with n+1 selects when it wants to access the B and C entities.

What I thought I'd do was query A join fetch B, then A join fetch C, but this doesn't work as it gives me two List<A> results each with only half the information.

This is a pretty simple query in SQL terms, and I'm disappointed there isn't an obvious way to handle this. Am I missing something?

Provider is toplink essentials

+1  A: 

JPA should at least mention objects. The fact that you don't suggests to me that you're not going to be leveraging JPA to its fullest extent.

If you've got a legacy schema, and an object model doesn't make sense, perhaps you shouldn't be using JPA.

JPA isn't intended to be a substitute for SQL. It addresses that object-relational mismatch. If you don't have objects, just drop down to JDBC and SQL.

I don't know what your tables represent, but if you're thinking about objects you should be talking about 1:m and m:n relationships. Once you have those you can use caching, lazy and eager fetching to optimize populating the objects.

UPDATE: Write the query so each product has its options and prices lists as 1:m relationships and do eager fetching. That will avoid the (n+1) problem.

How can you say that relationships and eager fetching don't help here?

Try expressing the relationships in objects and have JPA show you the SQL it generates and compare it to what you'd write. If it's satisfactory, go for it. If not, drop down to JDBC and see if you can do better.

duffymo
Well the entities are the objects, aren't they. The point of ORM is that you have objects on the Java side, and a relational model on the DB side. I want to query a few products with their associated options and price lists, and I don't want it to produce pathologically inefficient queries. Caching and lazy/eager fetching don't help here - I need 1 query rather than n+1. I know up front that I will need all the data.
Draemon
@Draemon I think you mean that you need 2 queries rather than n+1. Because one would be a cartesian product, which you said you didn't want :-)
KLE
"...Well the entities are the objects, aren't they..." - not necessarily. There need not be a 1:1 mapping between objects and tables.
duffymo
@KLE: Yes. I meant that I wanted the A-B and A-C relations to be 1 query each instead of n+1 each. I want 2 queries overall.
Draemon
@duffymo: hmm. I would have said that an entity in JPA is always a single class/object, which may map to more than one table in the DB. I think it depends on whether you're talking about an entity in the JPA/abstract sense or traditional E/R sense.
Draemon
I'm talking about objects, which need not map to E/R entities.
duffymo
Sorry, should have said "...need not map 1:1 to E/R entities..." Relational databases are set-based; objects are more graph-based. This is the essence of the object-relational mismatch. They aren't synonyms.
duffymo
I understand that, but the JPA terminology defines "Entity" as the object side of the O-R divide. So 1 (JPA) Entity == 1 Object : 1 or more tables.
Draemon
Could be confusing nomenclature then. E/R diagrams are from the relational side of the house, and the "E" means "Entity".
duffymo
A: 

I wonder why you say this is pretty simple in SQL terms. Wouldn't you also have the cartesian product?


Using the Hibernate provider for JPA, an option you mention works:

query A join fetch B, then A join fetch C

You have two list of the same values, you use only one and it is fine (you just need to LEFT join).


In Hibernate, you can also ask to fetch the missing data in a second query.

Use fetch="subselect".

See https://www.hibernate.org/315.html


UPDATED after comment of the Original Poster:

In java, you could also do this by hand.

  1. Fetch the As with their collections of Bs, in a list called entityAs.
  2. Fetch the As with their collections of Cs (reusing part of the query, or using ids).
  3. Create a datastructure Map> for the second query (for performance, to avoid inner loop).
  4. Loop on list entityAs, using the Map to set the set Cs for each instance A.

This would have a good performance also.

If you run several times into this need, you could write a parameterized method to do this for you, so you only code it once.

As commented by the Original Poster, you need to detach all A entities from entityAs before modifying them, to be sure there will be no update send to the database...

KLE
"I wonder why you say this is pretty simple in SQL terms. Wouldn't you also have the cartesian product?"If I did it with one query, sure. I means I could easily do two queries and use the results from both together.I know that hibernate can do this, but I'm using toplink.
Draemon
@Draemon OK, thanks to clarify the SQL part.
KLE
Thanks for the update - this is similar to what I'm trying right now. Am I right in assuming I have to detach all the entities first otherwise step 4 will mark all As as dirty and cause chaos on commit?
Draemon
@Draemon You are right. A parameterized method would be nice. Note also that this **detaching** could be a problem in the rest of your code...
KLE