With 10 Tables, I would have no joins. With 100 Tables, I would have one join per query. Which would show better performance?

+6  A: 

I think this depends a lot on your DB schema, but 10k rows is not a lot for a table. If you can put an index on the data, do that. I also think fewer tables should make your application much simpler.

Also, to state the obvious, joins are more expensive than queries without them, because to compute a join you need to take the cross-product (Cartesian product) of two tables and then filter rows from that. But again, I don't know what your data looks like.
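
A minimal sketch of the indexing suggestion, assuming SQLite via Python's built-in sqlite3 and a made-up orders table (nothing here comes from the question itself):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
    conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                     [(i % 500, float(i)) for i in range(10000)])

    # Without an index the lookup scans all 10k rows
    print(conn.execute("EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchall())

    # With an index SQLite can seek straight to the matching rows
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    print(conn.execute("EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchall())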

danben
+1 for "not a lot". 10K rows? That's a pittance.
duffymo
That was just an example. Next time I'll write "k" after it ;)
openfrog
+6  A: 

I wouldn't make a design decision this way without some measured performance data.

The proper way to model a problem is to create normalized tables with indexes that faithfully model the problem domain.

Once you have that, get some performance data for queries that you'll need to run.

If you find that performance isn't acceptable, denormalize as needed.

Your question is too generic and general to make a black and white decision.
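
One rough way to get that performance data (a sketch, assuming SQLite through Python's sqlite3 and hypothetical customers/orders tables) is simply to time the candidate query before deciding anything:

    import sqlite3
    import time

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER REFERENCES customers(id),
                             total REAL);
        CREATE INDEX idx_orders_customer ON orders (customer_id);
    """)
    conn.executemany("INSERT INTO customers VALUES (?, ?)", [(i, "c%d" % i) for i in range(500)])
    conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                     [(i % 500, float(i)) for i in range(10000)])

    query = """SELECT c.name, SUM(o.total)
               FROM customers c JOIN orders o ON o.customer_id = c.id
               GROUP BY c.id"""

    start = time.perf_counter()
    for _ in range(100):  # repeat so a single timing isn't pure noise
        conn.execute(query).fetchall()
    print("avg per query: %.2f ms" % ((time.perf_counter() - start) / 100 * 1000))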

duffymo
+1 for avoiding premature optimization
danben
+2  A: 

Joins have performance implications. But also, having redundant data is a bad practice. Updating and inserting data would be very taxing in those cases.
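
To illustrate the update cost of redundant data (a sketch with hypothetical tables, using Python's sqlite3): if the customer's name is copied onto every order row, a rename has to touch every copy, while the normalized version touches one row.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Denormalized: the customer name is duplicated on every order row
        CREATE TABLE orders_denorm (id INTEGER PRIMARY KEY, customer_name TEXT, total REAL);

        -- Normalized: the name lives in exactly one place
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER REFERENCES customers(id),
                             total REAL);
    """)

    # Renaming a customer: the denormalized version rewrites one row per order
    # (and only finds the rows whose copies were spelled consistently to begin with).
    conn.execute("UPDATE orders_denorm SET customer_name = 'Acme Ltd' WHERE customer_name = 'Acme'")

    # The normalized version updates a single row.
    conn.execute("UPDATE customers SET name = 'Acme Ltd' WHERE name = 'Acme'")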

Daniel A. White
A: 

Fewer joins mean faster select queries. But if you're doing any inserts or updates, you'll most likely pay for it through data anomalies or much more expensive inserts/updates.

If it's just static data you're going to query, then denormalization could pay off, but otherwise you'll probably shoot yourself in the foot.
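
One common shape of that trade-off (a sketch, hypothetical tables again, Python's sqlite3): if the data really is static, the join can be materialized once into a read-only reporting table, and every later query skips the join.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER REFERENCES customers(id),
                             total REAL);
    """)

    # Pay the join cost once, up front; reads against order_report need no join.
    conn.execute("""
        CREATE TABLE order_report AS
        SELECT o.id AS order_id, c.name AS customer_name, o.total
        FROM orders o JOIN customers c ON c.id = o.customer_id
    """)
    print(conn.execute("SELECT customer_name, total FROM order_report").fetchall())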

Shlomo
A: 

To start with the right schema: one table with 100,000 rows, if all you have is one logical entity...

Otherwise, analyze your domain and design your schema first and foremost to mirror the logical domain entities it must represent. Then denormalize only to address the performance issues that actually present themselves in load testing (or that, from past experience, you know will present themselves).

This approach, starting from the right normalized schema, will make the tuning process itself easier, will help guarantee that what you end up with contains an optimum blend of normalization and optimizations, and will ensure that you understand which compromises to normalization have been made for performance. That last point matters because it lets you more intelligently add the application validations needed for those cases where normalization has been compromised, and where, therefore, your database is vulnerable to data duplication or inconsistency.
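
As a sketch of that last point (hypothetical customers/orders tables, Python's sqlite3): once an order row carries a denormalized copy of the customer's name, the schema no longer guarantees consistency, so the application can at least check for drift.

    import sqlite3

    def find_stale_name_copies(conn):
        """Return order rows whose denormalized customer_name no longer
        matches the canonical value in customers.name."""
        return conn.execute("""
            SELECT o.id, o.customer_name, c.name
            FROM orders o JOIN customers c ON c.id = o.customer_id
            WHERE o.customer_name <> c.name
        """).fetchall()

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER REFERENCES customers(id),
                             customer_name TEXT,  -- denormalized copy kept for read speed
                             total REAL);
    """)
    conn.execute("INSERT INTO customers VALUES (1, 'Acme Ltd')")
    conn.execute("INSERT INTO orders VALUES (1, 1, 'Acme', 9.99)")  # stale copy
    print(find_stale_name_copies(conn))  # -> [(1, 'Acme', 'Acme Ltd')]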

If all you care about is read performance, though, then again, your best choice is just one table with 100,000 rows. And by the way, don't bother using a relational database; there's no point, just store the data in memory.

Charles Bretana