I'm thinking about table layout for a Hibernate-managed class hierarchy, and certainly the table per subclass technique strikes me as the most appropriate in a general sense. However, thinking through the logic I have some concerns about its performance especially as the number of subclasses scale.
To give a very brief (and classic) example, let's say you have the following classes:
public abstract class Animal {
int pkey;
String name;
}
public class Dog extends Animal {
long numSlippersChewed; // int is not large enough...
}
public class Cat extends Animal {
short miceCaught; // ... but here int is far bigger than required :-)
}
(I'm eliding getters and setters and Hibernate mappings etc., just assume they're the basic obvious case).
The database tables for these entities make sense, you get nice denormalisation and so on. However, what query does Hibernate do in order to pull out an individual animal? I can think of at least two cases where this might happen:
- Some other entity having a one-to-one (or one-to-many) mapping, such as a
pet
field of aHuman
class. This would store the pkey, so when Hibernate fetches a Human object it will need to fetch the correspondingAnimal
object too. When given the pkey of the animal, what query(/ies) will Hibernate use to extract and unmarshall the actual Animal data, given that it could reside in theCat
orDog
tables? - HQL such as
from Animal where name='Rex'
(let's assume names are unique). This is similar to the above in that it lets you identify a row in the superclass table but you don't know which subclass table to inspect for further details. Does HQL even let you issue a queryfrom
an abstract class? (Using subclass specific stuff works nicely though, e.g.from Cat where miceCaught > 5
).
I can think of two ways that this could be done in SQL and neither seems pretty. One is to run an exists
query on each subclass table for the given pkey and then load from the table that returned a hit. Alternatively Hibernate could perform some horrible union query joining in all the tables - essentially simulating the table-per-hierarchy scheme in that the result set would include attributes for all possible subclasses with the individual selects from the subclass tables returning null
for the irrelevant arguments. This latter case would probably even need to add a synthetic discriminator column so that Hibernate could know which subclass table actually returned the row and thus what Java class they should be parsed into.
Things get hairier too if you have subtypes of concrete types:
public class Greyhound extends Dog {
float lifetimeRacingWinnings;
}
Now for a given animal pkey, there may be valid rows in the Dog
and Greyhound
tables, meaning that my first approach of manually checking the class that corresponds to a pkey gets a lot tougher.
The reason I'm so concerned is that I will be wanting to use this approach on a class hierarchy with about 70 classes with a maximum nesting chain of 4-5 levels, so performing a union query on all of that is likely to have horrible performance. Does Hibernate have any tricks up its sleeve to keep this relatively performant? Or is loading a reference to one of these classes by pkey going to take a long time?