No worries! It looks more complex than it actually is! Just get down to the drinks!
TL;DR version: how can I efficiently query and update entities that are composed of other entities?
Here's an interesting data modeling scenario with two tables that has been puzzling me:
Entities { ID, Name, ScalarValue }
ComponentEntities { AggregateEntityID, ComponentEntityID, quantity }
AggregateEntityID and ComponentEntityID are foreign keys to the Entities table.
Give me the bloody example already
Drinks { ID, Name, Alcohol% }
DrinkIngredients { CocktailID, IngredientID, amount }
Drinks { 1, "Vodka", 40% }
Drinks { 2, "Tomato juice", 0% }
Drinks { 3, "Tabasco", 0% }
Drinks { 4, "Bloody mary", - }
DrinkIngredients { 4, 1, 0.2 } // Bloody mary has 0.2*Vodka
DrinkIngredients { 4, 2, 0.7 } // Bloody mary has 0.7*Tomato juice
DrinkIngredients { 4, 3, 0.1 } // Bloody mary has 0.1*Tabasco
If we wanted to get Bloody Mary's alcohol content, we would SELECT * FROM DrinkIngredients WHERE CocktailID = 4 and then weight each ingredient's Alcohol% by its amount.
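To make the single-level case concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column name AlcoholPct (standing in for Alcohol%), the NULL used for the "-" placeholder, and the helper one_level_alcohol are my own assumptions for illustration, not part of the schema above.

```python
import sqlite3

# In-memory database holding only the Bloody Mary part of the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Drinks (ID INTEGER PRIMARY KEY, Name TEXT, AlcoholPct REAL);
CREATE TABLE DrinkIngredients (
    CocktailID   INTEGER REFERENCES Drinks(ID),
    IngredientID INTEGER REFERENCES Drinks(ID),
    Amount       REAL
);
INSERT INTO Drinks VALUES
    (1, 'Vodka', 40), (2, 'Tomato juice', 0), (3, 'Tabasco', 0),
    (4, 'Bloody mary', NULL);  -- NULL stands in for the unknown "-" value
INSERT INTO DrinkIngredients VALUES (4, 1, 0.2), (4, 2, 0.7), (4, 3, 0.1);
""")


def one_level_alcohol(conn, cocktail_id):
    """Weighted sum of the direct ingredients' alcohol percentages.

    Hypothetical helper: only correct when every ingredient is a plain
    drink with a known AlcoholPct (no nested cocktails).
    """
    row = conn.execute(
        """
        SELECT SUM(di.Amount * d.AlcoholPct)
        FROM DrinkIngredients AS di
        JOIN Drinks AS d ON d.ID = di.IngredientID
        WHERE di.CocktailID = ?
        """,
        (cocktail_id,),
    ).fetchone()
    return row[0]


bloody_mary_pct = one_level_alcohol(conn, 4)
print(bloody_mary_pct)
```

This works only because all three ingredients are leaves. As soon as an ingredient is itself a cocktail, the join hits a NULL AlcoholPct and a single query of this form is no longer enough.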
Pretty standard; nothing weird there. Lisa likes to make it a bit sweeter by adding some Passion to it:
Drinks { 6, "Passion", 13% }
Drinks { 7, "Bloody Mary Pink", - }
DrinkIngredients { 7, 4, 0.8 } // Bloody Mary Pink has 0.8*Bloody Mary
DrinkIngredients { 7, 6, 0.2 } // Bloody Mary Pink has 0.2*Passion
Lisa's mum has been tasting these for so long that she believes she has found the ultimate blend between the two:
Drinks { 8, "Bloody Milf", - }
DrinkIngredients { 8, 4, 0.45 } // Bloody Milf has 0.45*Bloody Mary
DrinkIngredients { 8, 7, 0.55 } // Bloody Milf has 0.55*Bloody Mary Pink
Add a couple more of these "consists of" levels and we have deep relational recursion. The only restriction is that an entity cannot consist of itself.
This seems to form a directed acyclic graph.
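For what it's worth, the recursion over this graph can be sketched with a recursive common table expression. The example below uses SQLite through Python's sqlite3 module. The column name AlcoholPct and the helpers leaf_fractions and alcohol_pct are hypothetical names I made up for illustration, and other databases may need slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Drinks (ID INTEGER PRIMARY KEY, Name TEXT, AlcoholPct REAL);
CREATE TABLE DrinkIngredients (CocktailID INTEGER, IngredientID INTEGER, Amount REAL);
INSERT INTO Drinks VALUES
    (1, 'Vodka', 40), (2, 'Tomato juice', 0), (3, 'Tabasco', 0),
    (4, 'Bloody mary', NULL), (6, 'Passion', 13),
    (7, 'Bloody Mary Pink', NULL), (8, 'Bloody Milf', NULL);
INSERT INTO DrinkIngredients VALUES
    (4, 1, 0.2), (4, 2, 0.7), (4, 3, 0.1),
    (7, 4, 0.8), (7, 6, 0.2),
    (8, 4, 0.45), (8, 7, 0.55);
""")

# Walk from a cocktail down through every "consists of" edge, multiplying
# the amounts along each path. UNION ALL (not UNION) keeps one row per path,
# so a leaf reached via several paths contributes once for each of them.
LEAF_QUERY = """
WITH RECURSIVE expanded(DrinkID, Fraction) AS (
    SELECT IngredientID, Amount
    FROM DrinkIngredients
    WHERE CocktailID = ?
    UNION ALL
    SELECT di.IngredientID, e.Fraction * di.Amount
    FROM expanded AS e
    JOIN DrinkIngredients AS di ON di.CocktailID = e.DrinkID
)
SELECT d.Name, SUM(e.Fraction), d.AlcoholPct
FROM expanded AS e
JOIN Drinks AS d ON d.ID = e.DrinkID
-- keep only leaves: drinks that have no ingredients of their own
WHERE NOT EXISTS (
    SELECT 1 FROM DrinkIngredients AS c WHERE c.CocktailID = e.DrinkID
)
GROUP BY d.ID, d.Name, d.AlcoholPct
"""


def leaf_fractions(conn, drink_id):
    """Map each leaf drink's name to (total fraction, alcohol percentage)."""
    rows = conn.execute(LEAF_QUERY, (drink_id,)).fetchall()
    return {name: (fraction, pct) for name, fraction, pct in rows}


def alcohol_pct(conn, drink_id):
    """Alcohol percentage of a cocktail, computed from its leaves."""
    return sum(f * p for f, p in leaf_fractions(conn, drink_id).values())


milf_leaves = leaf_fractions(conn, 8)
milf_pct = alcohol_pct(conn, 8)
print(milf_leaves, milf_pct)
```

Note that this sketch only terminates because the graph is acyclic, and that it expands every path, so a deep graph with heavily shared ingredients could produce a lot of intermediate rows. A drink that is itself a leaf yields no rows here and would need to be handled separately.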
RDBMS: One way to "cache" the data would be to calculate the relevant values and store them in the Entity itself (or perhaps in another table). In the example above, the alcohol content for Bloody Mary would be calculated once, when the drink is created, and stored in its Alcohol% field. In this case, updates become expensive, because changing one drink means updating every drink that contains it, directly or indirectly, all the way up the dependency hierarchy.
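To illustrate why cached values make updates expensive, here is a rough in-memory sketch of the propagation step. It assumes the graph fits in plain Python dictionaries and uses graphlib.TopologicalSorter (Python 3.9+) so that ingredients are refreshed before the drinks built from them. The names ingredients, alcohol, dependents_of, refresh and update_alcohol are hypothetical.

```python
from graphlib import TopologicalSorter

# cocktail ID -> {ingredient ID: amount}, mirroring DrinkIngredients
ingredients = {
    4: {1: 0.2, 2: 0.7, 3: 0.1},   # Bloody mary
    7: {4: 0.8, 6: 0.2},           # Bloody Mary Pink
    8: {4: 0.45, 7: 0.55},         # Bloody Milf
}
# Cached Alcohol% per drink; only the leaves are known up front.
alcohol = {1: 40.0, 2: 0.0, 3: 0.0, 6: 13.0}


def dependents_of(drink_id, ingredients):
    """All cocktails that contain drink_id directly or indirectly."""
    affected, stack = set(), [drink_id]
    while stack:
        current = stack.pop()
        for cocktail, parts in ingredients.items():
            if current in parts and cocktail not in affected:
                affected.add(cocktail)
                stack.append(cocktail)
    return affected


def refresh(targets, alcohol, ingredients):
    """Recompute the cached value of each target, ingredients first."""
    graph = {cocktail: set(parts) for cocktail, parts in ingredients.items()}
    for drink in TopologicalSorter(graph).static_order():
        if drink in targets:
            alcohol[drink] = sum(
                amount * alcohol[part]
                for part, amount in ingredients[drink].items()
            )


def update_alcohol(drink_id, new_pct, alcohol, ingredients):
    """Change one leaf value and refresh every cached value that depends on it."""
    alcohol[drink_id] = new_pct
    affected = dependents_of(drink_id, ingredients)
    refresh(affected, alcohol, ingredients)
    return affected


# Fill the cache once, then change the vodka and watch the update fan out.
refresh(set(ingredients), alcohol, ingredients)
touched = update_alcohol(1, 50.0, alcohol, ingredients)
print(sorted(touched), alcohol)
```

In this tiny example, changing one leaf touches every cocktail in the table, which is exactly the cost described above. Reads, on the other hand, become a single lookup.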
Questions
RDBMS: Is there a better way to get to the leaf values (drinks that don't consist of other ones) than getting the "parent" drink until a leaf drink is reached?
Both RDBMS and NoSQL seem to have a problem with this, one way or another.
Bottom line: is this even practical and feasible?
What I need is a counter-inception