views:

107

answers:

3

I find that I am using a lot of join queries, especially to get statistics about user operations from my database. Queries like this are not uncommon:

from io in db._Owners where io.tenantId == tenantId
    join i in db._Instances on io.instanceId equals i.instanceId
    join m in db._Machines on i.machineId equals m.machineId
    select ...

My app is still not active, so I have no way of judging if these queries will be computationally prohibitive in real-life. My query:

  1. Is there a limit to when doing too many 'joins' is too much, and can that be described without getting real-life operating stats?
  2. What are my alternatives? For example, is it better to just create additional tables to hold statistics that are I update as I go, rather than pulling together different table sources each time I want statistics?
+12  A: 

If you do not have performance information then do not optimize.

Premature optimization is the root of all evil.

1) I don't think you'll ever reach the "limit". 2) This is called denomalization, premature denormalization is just wasted effort if you don't know if a problem exists.

I'd say your query looks pretty normal.

jfar
A: 

1) Is there a limit to when doing too many 'joins' is too much

No, the number of joins isn't an issue so much as the structure of the data within each table, presence and use of indexes and what needs to be done to get data out.

Normalized data is commonly a primary goal in relational DB design. You typically consider denormalization as a means of optimizing queries only as necessary because of the added effort required to maintain data consistency.

If you're really concerned, post your data model ERD (database tables & how they relate) and the database you are using for the project (because not all databases are the same).

OMG Ponies
@Ponies: Out of curiosity, why did you mark your answer as wiki?
Ken Redler
@Ken: So that I could edit it, of course!
hemp
Ponies sets it up; hemp drives it home.
Ken Redler
A: 

Unless you have very high traffic and indexes are properly set, etc., you shouldn't have problems.

For reporting/analysis, some places will create a data warehouse which in its most basic form is a [partially] denormalized copy of your main database. They are easier to report on since one table usually contain most, if not all, the data needed in a report. They can also be much faster to read from since you don't have to join so much. However, they'll require more disk space (duplicated data). If writes are allowed, they'll be slower (have to update all the duplicated data) and you'll have the problem of keeping that duplicated data consistent.

In other words, unless you're only doing reporting (or read-only access), keep the joins.

Nelson