ansaurus

Question

Preferred way of retrieving row with multiple relating rows

Answer 1

+2 A:

The best performance/mess ratio is 42.

On a more serious note, go with the simplest solution: retrieve everything with a single query. Don't optimize before you encounter a performance issue. "Premature optimization is the root of all evil" :)

Andomar 2010-02-07 18:16:58

42 sounds good enough to me. This explains the slow site of the illuminati. ;) Actually, I try to optimize a few important queries that eventually become bottlenecks. Also, I want to do it right the next time I write this from scratch.

Wikser 2010-02-07 19:09:25

Answer 2

A:

One stored proc that returns 2 datasets: "recipe header" and "recipe details"?

This is what I'd do if I needed the data all at once in one go. If I don't need it in one go, I'd still get 2 datasets but with less data.

We've found it slightly easier to work with this in the client rather than one big query as Andomar suggested, but his/her answer is still very valid.
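A minimal sketch of the "header plus details" idea, not taken from the original answer. It assumes hypothetical `recipe` and `ingredient` tables and uses Python's built-in `sqlite3`. SQLite has no stored procedures, so two plain queries on one connection stand in for a procedure that returns two result sets.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with a hypothetical recipe schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE recipe (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE ingredient (
            id INTEGER PRIMARY KEY,
            recipe_id INTEGER NOT NULL REFERENCES recipe(id),
            name TEXT NOT NULL
        );
        INSERT INTO recipe (id, name) VALUES (1, 'Pancakes'), (2, 'Toast');
        INSERT INTO ingredient (recipe_id, name) VALUES
            (1, 'flour'), (1, 'milk'), (2, 'bread');
        """
    )
    return conn


def load_recipe(conn, recipe_id):
    """Return one recipe as a header row plus its detail rows, or None."""
    # First result set: the "recipe header" (at most one row).
    header = conn.execute(
        "SELECT id, name FROM recipe WHERE id = ?", (recipe_id,)
    ).fetchone()
    if header is None:
        return None
    # Second result set: the "recipe details" (zero or more rows).
    details = conn.execute(
        "SELECT name FROM ingredient WHERE recipe_id = ? ORDER BY id",
        (recipe_id,),
    ).fetchall()
    return {
        "id": header[0],
        "name": header[1],
        "ingredients": [name for (name,) in details],
    }
```

The client never has to strip repeated header columns out of the detail rows, which is the convenience described above.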

gbn 2010-02-07 18:31:30

Answer 3

+1 A:

If you only need to join two tables and an "ingredient" isn't a huge amount of data, the best balance of performance and maintainability is likely to be a single joined query. Yes, you are repeating some data in the results, but unless you have 100,000 rows and it's overloading the database server/network, it's too soon to be optimizing.
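As a rough illustration (not part of the original answer), the single joined query could look like the sketch below. The `recipe` and `ingredient` tables and their columns are assumptions, and Python's built-in `sqlite3` stands in for the real database. The recipe name is repeated on every row, and the client groups the rows back into one entry per recipe.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with a hypothetical recipe schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE recipe (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE ingredient (
            id INTEGER PRIMARY KEY,
            recipe_id INTEGER NOT NULL REFERENCES recipe(id),
            name TEXT NOT NULL
        );
        INSERT INTO recipe (id, name) VALUES
            (1, 'Pancakes'), (2, 'Toast'), (3, 'Water');
        INSERT INTO ingredient (recipe_id, name) VALUES
            (1, 'flour'), (1, 'milk'), (1, 'egg'), (2, 'bread');
        """
    )
    return conn


def load_recipes_joined(conn):
    """Fetch all recipes with their ingredients in one round trip."""
    rows = conn.execute(
        """
        SELECT r.id, r.name, i.name
        FROM recipe AS r
        LEFT JOIN ingredient AS i ON i.recipe_id = r.id
        ORDER BY r.id, i.id
        """
    ).fetchall()
    recipes = {}
    for recipe_id, recipe_name, ingredient_name in rows:
        # The recipe columns repeat on every row; keep only the first copy.
        recipe = recipes.setdefault(
            recipe_id, {"name": recipe_name, "ingredients": []}
        )
        # LEFT JOIN yields a NULL ingredient for recipes that have none.
        if ingredient_name is not None:
            recipe["ingredients"].append(ingredient_name)
    return recipes
```

The `LEFT JOIN` keeps recipes that have no ingredients; an inner join would silently drop them.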

The story is a little bit different if you have many layers of joins each with decreasing cardinality. For example, in one of my apps I have something like the following:

Event -> EventType -> EventCategory
                   -> EventPriority
                   -> EventSource   -> EventSourceType -> Vendor

A query like this results in a significant amount of duplication which is unacceptable when there are 100k events to retrieve, 1000 event types, maybe 10 categories/priorities, 50 sources, and 5 vendors. So in that case, I have a stored procedure that returns multiple result sets:

All 100k Events with just EventTypeID
The 1000 EventTypes with CategoryID, PriorityID, etc. that apply to these Events
The 10 EventCategories and EventPriorities that apply to the above EventTypes
The 50 EventSources that generated the 100k events
And so on, you get the idea.

Because the cardinality goes down so drastically, it is much quicker to download only what is needed here and use a few dictionaries on the client side to piece it together (if that is even necessary). In some cases the low-cardinality data may even be cached in memory and never retrieved from the database at all (except on app start or when the data is changed).
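The client-side "piece it together with dictionaries" step might look roughly like this. It is a sketch under assumptions rather than the actual implementation: the class names follow the diagram above, but the fields and the `stitch` helper are hypothetical, and each result set is modelled as a plain list of tuples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventCategory:
    id: int
    name: str


@dataclass(frozen=True)
class EventType:
    id: int
    name: str
    category: EventCategory


@dataclass(frozen=True)
class Event:
    id: int
    message: str
    event_type: EventType


def stitch(event_rows, type_rows, category_rows):
    """Rebuild the object graph from separate, low-duplication result sets.

    event_rows:    (event_id, message, event_type_id)
    type_rows:     (event_type_id, name, category_id)
    category_rows: (category_id, name)
    """
    # Build lookup dictionaries from the smallest sets upward.
    categories = {cid: EventCategory(cid, name) for cid, name in category_rows}
    types = {
        tid: EventType(tid, name, categories[cid])
        for tid, name, cid in type_rows
    }
    # Each event only carries an ID; resolve it to the shared instance.
    return [Event(eid, msg, types[tid]) for eid, msg, tid in event_rows]
```

Every event of the same type ends up pointing at one shared `EventType` object, so the low-cardinality data is transferred and held in memory once instead of once per event.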

The determining factors in using an approach such as this are a very high number of results and a steep decrease in cardinality for the joins, in other words fanning in. This is actually the reverse of most usages and probably the reverse of what you are doing here. If you are selecting "recipes" and joining to "ingredients", you are probably fanning out, which can make this approach wasteful, especially if there are only two tables to join.

So I'm just putting it out there that this is a possible alternative if performance becomes an issue down the road; at this point in your design, before you have real-world performance data, I would simply go the route of using a single joined result set.

Aaronaught 2010-02-07 18:38:23

I should probably have pointed out that the only reason I sometimes use this approach is because I'm mapping it to a domain model where many `Event` instances share the exact same `EventType` reference. This is only worth doing when you (a) have a separate domain model and (b) don't have to rebuild all of the redundant data on the front-end.

Aaronaught 2010-02-07 19:01:46

Thanks for your detailed answer. I always felt a bit weird writing code that discards a considerable amount of the returned data. In all cases, I assume it's most important to reduce the number of queries. The round-trip and initialization time should be a large factor in such data retrievals.

Wikser 2010-02-07 19:27:22

Answer 4

A:

I would look at the bigger picture - do you really need to retrieve ingredients for 200 recipes? What happens when you have 2,000?

For example, if this is in a web page I would list the 200 recipes (or fewer, because of paging), and only when the user clicks on one to see its ingredients would I fetch those ingredients from the database.
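A small sketch of that on-demand pattern, offered as an illustration only. The `recipe` and `ingredient` tables, the function names, and the page size are all assumptions, with Python's built-in `sqlite3` standing in for the real database.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with a hypothetical recipe schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE recipe (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE ingredient (
            id INTEGER PRIMARY KEY,
            recipe_id INTEGER NOT NULL REFERENCES recipe(id),
            name TEXT NOT NULL
        );
        INSERT INTO recipe (id, name) VALUES
            (1, 'Pancakes'), (2, 'Toast'), (3, 'Water');
        INSERT INTO ingredient (recipe_id, name) VALUES
            (1, 'flour'), (1, 'milk'), (2, 'bread');
        """
    )
    return conn


def list_recipes(conn, page, page_size=20):
    """Return one page of recipes (1-based page number), without ingredients."""
    offset = (page - 1) * page_size
    return conn.execute(
        "SELECT id, name FROM recipe ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()


def get_ingredients(conn, recipe_id):
    """Fetch ingredients only when a single recipe is actually opened."""
    rows = conn.execute(
        "SELECT name FROM ingredient WHERE recipe_id = ? ORDER BY id",
        (recipe_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

The list page costs one small query, and each detail view costs one more, so the work grows with what the user opens rather than with the total number of recipes.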

If this isn't doable, I would have 1 stored proc that returns one DataSet containing 2 tables. One with the recipes and the second with the list of ingredients.

JBrooks 2010-02-07 18:49:04

Of course, in many cases retrieving the additional data on demand would perform better. I'm already doing this where it's reasonable. But I'm reluctant to write extra DAL functionality for every processing function.

Wikser 2010-02-07 19:13:38

Answer 5

A:

"I'm currently hand-writing a DAL in C#..." As a side note, you might want to check out the post: Generate Data Access Layer Methods From Stored Procs. It can save you a lot of time.

JBrooks 2010-02-07 18:53:05
