views:

303

answers:

4

Performance consideration: spreading rows across multiple tables vs. concentrating all rows in one table.

Hi.

I need to log information about every step that goes on in the application to an SQL database. There are certain tables I want the log to be related to: Product (log when a product has been created, changed, etc.), Order (same as above), Shipping (same), and so on.

The data will need to be retrieved often.

I have a few ideas on how to do it:

  1. Have a log table that will contain columns for all these tables; then when I want to present data in the UI for a certain Product, I will do select * from Log where LogId = Product.ProductId. I know it might look funny to have that many columns, but I have a feeling that performance will be better. On the other hand, there will be a huge number of rows in this table. (A rough sketch of this option follows below the list.)
  2. Have a separate log table for each log type (ProductLogs, OrderLogs, etc.). I really don't like this idea since it's not consistent, and having many tables with the same structure doesn't make sense, but (?) it might be quicker when searching a table that has fewer rows (am I wrong?).
  3. Building on option 1, I could add a second many-to-one table with LogId, TableNameId and RowId columns that relates a log row to many table rows in the DB, and then have a UDF to retrieve the data (e.g. log 234 belongs to table Customer at CustomerId 345 and to the Product table where ProductId = RowId). I think this is the nicest way to do it, but again, there might be a huge number of rows; will it slow down the search, or is this how it should be done? What do you say?
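
For comparison, a rough sketch of what the option 1 table might look like (the per-table columns are just illustrative):

CREATE TABLE [dbo].[Log](
    [LogId] [int] IDENTITY(1,1) NOT NULL,
    [UserId] [int] NULL,
    [ProductId] [int] NULL,   -- filled only for product-related entries
    [OrderId] [int] NULL,     -- filled only for order-related entries
    [ShippingId] [int] NULL,  -- filled only for shipping-related entries
    [Description] [varchar](1024) NOT NULL,
 CONSTRAINT [PK_Log] PRIMARY KEY CLUSTERED ([LogId] ASC)
)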

Example of No. 3 in the above list:

CREATE TABLE [dbo].[Log](
    [LogId] [int] IDENTITY(1,1) NOT NULL,
    [UserId] [int] NULL,
    [Description] [varchar](1024) NOT NULL,
 CONSTRAINT [PK_Log] PRIMARY KEY CLUSTERED 
(
    [LogId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
ALTER TABLE [dbo].[Log]  WITH CHECK ADD  CONSTRAINT [FK_Log_Table] FOREIGN KEY([UserId])
REFERENCES [dbo].[Table] ([TableId])
GO
ALTER TABLE [dbo].[Log] CHECK CONSTRAINT [FK_Log_Table]
---------------------------------------------------------------------
CREATE TABLE [dbo].[LogReference](
    [LogId] [int] NOT NULL,
    [TableName] [varchar](32) NOT NULL,
    [RowId] [int] NOT NULL,
 CONSTRAINT [PK_LogReference] PRIMARY KEY CLUSTERED 
(
    [LogId] ASC,
    [TableName] ASC,
    [RowId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[LogReference]  WITH CHECK ADD  CONSTRAINT [FK_LogReference_Log] FOREIGN KEY([LogId])
REFERENCES [dbo].[Log] ([LogId])
GO
ALTER TABLE [dbo].[LogReference] CHECK CONSTRAINT [FK_LogReference_Log]
---------------------------------------------------------------------
CREATE FUNCTION [dbo].[GetLog]
(   
    @TableName varchar(32),
    @RowId int
)
RETURNS 
@Log TABLE
(    
    LogId int not null,
    UserId int null, -- nullable to match Log.UserId
    Description varchar(1024) not null
)
AS
BEGIN

INSERT INTO @Log
SELECT     [Log].LogId, [Log].UserId, [Log].Description
FROM         [Log] INNER JOIN
                      LogReference ON [Log].LogId = LogReference.LogId
WHERE     (LogReference.TableName = @TableName) AND (LogReference.RowId = @RowId)
    RETURN 
END
GO
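
The function would then be called along these lines (the ids are made up):

-- all log entries attached to product 345
SELECT LogId, UserId, Description
FROM dbo.GetLog('Product', 345)
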
+3  A: 

Be careful about pre-optimizing databases. Most databases are reasonably fast and fairly complicated internally. You want to run a test for efficiency first.

Second, putting everything in one table makes it more likely that the results you want are in the cache, which will speed up performance immensely. Unfortunately, it also makes it much more likely that you will have to search a gigantic table to find what you are looking for. This can be partly solved with an index, but indexes don't come for free (they make writes more expensive, for one).
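
For example, with the LogReference table from the question, a nonclustered index along these lines (the name is just illustrative) would let the (TableName, RowId) lookup avoid scanning the whole table; verify against the actual execution plan:

-- LogId is carried along automatically because it is part of the clustered key
CREATE NONCLUSTERED INDEX IX_LogReference_TableName_RowId
    ON [dbo].[LogReference] ([TableName], [RowId])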

My advice would be to run a test to see if the performance really matters, and then test the different scenarios to see which is the fastest.
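
One rough way to compare the candidate designs, once each one is loaded with representative data, is to run the query you will actually use and look at the I/O and timing statistics (the ids here are made up):

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT l.LogId, l.UserId, l.Description
FROM dbo.[Log] l
INNER JOIN dbo.LogReference r ON r.LogId = l.LogId
WHERE r.TableName = 'Product' AND r.RowId = 345;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;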

tomjen
I am kind of tight on development time; what would be the most recommended way, in your opinion?
Shimmy
The cleanest and most efficient way is No. 3 in the list; the question is whether it will be too slow.
Shimmy
A: 

Try to implement your data access layer in such a way that you can change from one database model to another if needed; that way you can just pick one and worry about performance implications later.
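
For instance, if the application only ever reads the log through a view (or through the UDF from the question) rather than hitting the base tables directly, the underlying storage can be swapped later without touching the callers; a minimal sketch, assuming the Log/LogReference pair from the question:

-- Callers query the view; it can later be redefined over one wide table,
-- per-type tables, or any other layout without changing application code.
CREATE VIEW dbo.vw_ProductLog
AS
SELECT l.LogId, l.UserId, l.Description, r.RowId AS ProductId
FROM dbo.[Log] l
INNER JOIN dbo.LogReference r ON r.LogId = l.LogId
WHERE r.TableName = 'Product'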

Without doing some performance testing and having an accurate idea of the sort of load you're going to get, it's going to be difficult to optimise, as the performance depends on a number of factors, such as the number of reads, the number of writes, and whether or not the reads and writes are likely to conflict and cause locking.

My preference would be for option 1, by the way: it's the simplest to do, and there are a number of tweaks you can make to help fix the various sorts of problems you might run into.

Kragen
+1  A: 

If you're talking about large volumes of data (millions of rows and up), then you will get a benefit from using different tables to store them in.

e.g. a basic example: 50 million log entries, assuming 5 different "types" of log. It's better to have 5 x 10-million-row tables than 1 x 50-million-row table (a sketch of one such per-type table follows the list below).

  • INSERT performance will be better with individual tables - indexes on each table will be smaller and so quicker/easier to update and maintain as part of the insert operation

  • READ performance will be better with individual tables - less data to query, smaller indexes to traverse. Also, with a single table it sounds like you'd need to store an extra column to identify what type of log entry a record is (Product, Shipping, ...)

  • MAINTENANCE on smaller tables is less painful (statistics, index defragging/rebuilding etc)
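
A sketch of one such per-type table (the names are illustrative); it stays narrow, and its indexes only cover that type's rows:

CREATE TABLE [dbo].[ProductLog](
    [ProductLogId] [int] IDENTITY(1,1) NOT NULL,
    [ProductId] [int] NOT NULL,
    [UserId] [int] NULL,
    [Description] [varchar](1024) NOT NULL,
 CONSTRAINT [PK_ProductLog] PRIMARY KEY CLUSTERED ([ProductLogId] ASC)
)
GO
-- lookups for a single product only have to traverse this (smaller) index
CREATE NONCLUSTERED INDEX IX_ProductLog_ProductId
    ON [dbo].[ProductLog] ([ProductId])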

Essentially, this is about partitioning data. From SQL Server 2005 onwards there is built-in support for partitioning, but you need Enterprise Edition for that. It basically allows you to partition the data in one table to improve performance (e.g. you'd have your one Log table, and then define how the data within it is partitioned); see the sketch below.
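
A rough sketch of what that could look like, assuming you add a LogTypeId column to the single log table and map each log type to its own partition (the names and boundary values are made up):

-- e.g. 1 = Product, 2 = Order, 3 = Shipping, ...
CREATE PARTITION FUNCTION pfLogType (int)
AS RANGE LEFT FOR VALUES (1, 2, 3, 4)
GO
CREATE PARTITION SCHEME psLogType
AS PARTITION pfLogType ALL TO ([PRIMARY])
GO
CREATE TABLE [dbo].[PartitionedLog](
    [LogId] [int] IDENTITY(1,1) NOT NULL,
    [LogTypeId] [int] NOT NULL,
    [UserId] [int] NULL,
    [Description] [varchar](1024) NOT NULL,
 CONSTRAINT [PK_PartitionedLog] PRIMARY KEY CLUSTERED ([LogId] ASC, [LogTypeId] ASC)
) ON psLogType ([LogTypeId])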

I listened to an interview with one of the eBay architects recently, who stressed the importance of partitioning when you need performance and scalability, and I strongly agree based on my experience.

AdaTheDev
+1  A: 

I would definitely go for option 3, for several reasons:

Data should be in the fields of a table, not in a table name (option 2) or a field name (option 1). That way the database becomes easier to work with and easier to maintain.

Narrower tables generally perform better. The number of rows has less impact on performance than the number of fields.

If you have a field for each table (option 1), you are likely to get a lot of empty fields when only a few of the tables are affected by an operation.
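
On the insert side, option 3 keeps the write path a small, fixed pattern no matter how many tables an operation touches; a sketch using the tables from the question (the ids are made up):

DECLARE @LogId int

-- one log entry...
INSERT INTO dbo.[Log] (UserId, Description)
VALUES (12, 'Order 345 created for customer 234')
SET @LogId = SCOPE_IDENTITY()

-- ...referenced from every affected row
INSERT INTO dbo.LogReference (LogId, TableName, RowId) VALUES (@LogId, 'Order', 345)
INSERT INTO dbo.LogReference (LogId, TableName, RowId) VALUES (@LogId, 'Customer', 234)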

Guffa
You see, I agree with you. The question is especially about the insert AND the search; I don't mind if retrieval is slow.
Shimmy