views: 665
answers: 10

Is there some hard and fast rule about how big is too big for a SQL table?

We are storing SCORM tracking data in a name/value pair format, and there could be anywhere from 4-12 rows per user per course. Down the road, is this going to be a bad thing, since there are hundreds of courses and thousands of users?

+2  A: 

Not really. It all depends on your business needs, and you'll have to choose a database product that supports your estimated row count.

Otávio Décio
+6  A: 

I personally have had tables in production with 50 million rows, and that is small compared with what I have heard of. You might need to optimize your structure with partitioning, but until you test your system in your environment you shouldn't waste time doing that. What you described is pretty small, IMHO.

I should add that I was using SQL Server 2000 & 2005; each DBMS has its own sizing limitations.

JoshBerke
+5  A: 
Too Many == Just Enough + 1
Paul Tomblin
(nothing constructive) hahaha that's an awesome answer :D
Anders
+4  A: 

The magic number is billions. Until you get to billions of rows of data, you're not talking about very much data at all.

Do the math.

4-12 rows per user per course,... hundreds of courses and thousands of users?

400,000 to 1,200,000 rows. Let's assume 1000 bytes per row.

That's 400MB to 1.2GB of data. You can buy 100GB drives for $299 at the Apple store. You can easily spend more than $299 of billable time sweating over details that don't much matter anymore.

Until you get to 1TB of data (1,000 GB), you're not talking about much data at all.

S.Lott
Or an 80GB from Newegg for $33.99
Tmdean
100gb drive for $299? Maybe 5 years ago! Today you can get 1Tb+ for $100!
rmeador
Yeah, but he said "at the Apple store". You can hardly get a mouse for under $100 there.
P Daddy
The point is that gloriously overpriced storage is cheap. Cheap storage is really cheap. Hand-wringing over storage is a waste of money.
S.Lott
+5  A: 

100 (courses) * 1000 (users) * 10 (records) is only a million. That's the low end, but a decent database ought to handle it okay.

What sounds iffy is the name/value pair format. That will limit your ability to index things correctly, which will be critical to good performance.
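
A rough sketch of what that layout and the one index that really helps might look like; the table and column names here are assumptions, not anything from the question:

    -- Hypothetical name/value (EAV) tracking table.
    CREATE TABLE ScormTracking (
        UserId   INT           NOT NULL,
        CourseId INT           NOT NULL,
        Name     VARCHAR(100)  NOT NULL,
        Value    VARCHAR(4000) NULL
    );

    -- About the only broadly useful index: lead with the columns every lookup
    -- filters on. Anything that filters or sorts on Value still scans, because
    -- Value means something different for every Name.
    CREATE CLUSTERED INDEX IX_ScormTracking_UserCourseName
        ON ScormTracking (UserId, CourseId, Name);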

Joel Coehoorn
+2  A: 

No, there isn't really any hard rule about how many rows you can have in a table; it depends a lot on how much data there is in the rows and how well the data can be indexed.

A quick estimate on the figures that you stated gives something like tens of millions of rows. That's certainly not too much, but it's enough that it could be a problem if you aren't a bit careful.

Perhaps the table could be normalized? Do the same names occur a lot? If so, you could put the names in a separate table and store just the id in the big table.
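
For example, a minimal sketch of that idea (table and column names are assumptions, re-using the hypothetical name/value shape described in the question):

    -- Lookup table with one row per distinct attribute name.
    CREATE TABLE TrackingName (
        NameId INT IDENTITY(1, 1) PRIMARY KEY,
        Name   VARCHAR(100) NOT NULL UNIQUE
    );

    -- The big tracking table then stores a small integer instead of repeating
    -- the same name string millions of times.
    CREATE TABLE ScormTracking (
        UserId   INT           NOT NULL,
        CourseId INT           NOT NULL,
        NameId   INT           NOT NULL REFERENCES TrackingName (NameId),
        Value    VARCHAR(4000) NULL
    );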

Guffa
+1  A: 

I don't think there is really a limit here other than drive space. But please add good indexes while the table is small, because when the table is huge, indexes take a lot longer to add. Also, if you have bad indexes, queries will slow down as the table grows, and people will complain when there is really nothing wrong except a poor (or missing) index.

Jojo
+2  A: 

I once worked on a web form system with over 300 million rows in its name/value pair table. Many of the forms had over 300 rows per form submission. Performance wasn't too bad actually, but it was a total PITA to query from! My SQL-writing ability definitely improved over the life of that gig.

But IMHO, if you have any say, get rid of it in favor of a standard normalized table.
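
To illustrate the querying pain (not taken from that system; the table is the hypothetical one from the question and the SCORM 1.2 element names are just examples), pulling three attributes back out of a name/value table means pivoting:

    -- With a normalized table this would be a plain
    -- "SELECT Score, LessonStatus, SessionTime FROM ..." query.
    SELECT
        t.UserId,
        t.CourseId,
        MAX(CASE WHEN t.Name = 'cmi.core.score.raw'     THEN t.Value END) AS Score,
        MAX(CASE WHEN t.Name = 'cmi.core.lesson_status' THEN t.Value END) AS LessonStatus,
        MAX(CASE WHEN t.Name = 'cmi.core.session_time'  THEN t.Value END) AS SessionTime
    FROM ScormTracking AS t
    GROUP BY t.UserId, t.CourseId;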

John MacIntyre
A: 

I've worked on databases where we tried to create tables with 2B rows of data - that didn't work; we got to 500M and re-designed. One of the biggest gotchas of working with such a large table was the time taken to do deletions - I often see the approach where old records are archived and then deleted from the main table. If the table is big enough, that deletion will run for many hours as the indexes are rebuilt.

Not sure where the cut-off is, but gut feel indicates that a table with more than 10M rows is probably too big. Our approach was to partition data by date, so we ended up with a table for a week of data, another summary table for months, and another summary for years - very common in data warehousing. BTW, this was on SQL 7.0; interested to know if the DBs are better at this type of stuff yet?
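
For reference, SQL Server 2005 (released well after the SQL 7.0 setup described above) added native table partitioning, so the same partition-by-date idea can live inside one logical table. A rough sketch, where the object names, dates, and column layout are made up:

    -- Partition the tracking data by month on a date column.
    CREATE PARTITION FUNCTION pfTrackingByMonth (DATETIME)
        AS RANGE RIGHT FOR VALUES ('2009-01-01', '2009-02-01', '2009-03-01');

    CREATE PARTITION SCHEME psTrackingByMonth
        AS PARTITION pfTrackingByMonth ALL TO ([PRIMARY]);

    CREATE TABLE TrackingData (
        UserId     INT           NOT NULL,
        CourseId   INT           NOT NULL,
        RecordedAt DATETIME      NOT NULL,
        Name       VARCHAR(100)  NOT NULL,
        Value      VARCHAR(4000) NULL
    ) ON psTrackingByMonth (RecordedAt);

    -- Ageing out a month becomes a metadata operation instead of a multi-hour DELETE:
    -- ALTER TABLE TrackingData SWITCH PARTITION 1 TO TrackingDataArchive;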

MrTelly
On Oracle you use partitioning. Data with different dates go to different partitions. Old partitions can be archived on tapes and dropped with something like "ALTER TABLE DROP PARTITION" in seconds.
jva
+3  A: 

No hard and fast rule, but there is a hard and fast way to get a number.

Write a program to populate your table with dummy data that roughly approximates the expected form of the actual data (e.g. similar regularity, characters, patterns, etc.). Run performance tests against it using your actual queries with the dummy data, gradually increasing the number of rows in the table, perhaps in steps of 1,000 or 10,000 rows.

At the point where query performance (e.g. queries completed per second) becomes unacceptable, you'll have your "too big" number of rows.
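
A very rough T-SQL sketch of that loop, assuming a hypothetical ScormTracking name/value table like the one described in the question and a made-up "representative" query; real dummy data should mimic your actual value distributions much more closely:

    SET NOCOUNT ON;
    DECLARE @batch INT, @start DATETIME;
    SET @batch = 0;

    WHILE @batch < 100                      -- 100 batches of 10,000 rows = 1M rows
    BEGIN
        -- Flood the table with another batch of fake rows.
        INSERT INTO ScormTracking (UserId, CourseId, Name, Value)
        SELECT TOP 10000
               ABS(CHECKSUM(NEWID())) % 5000,          -- fake user ids
               ABS(CHECKSUM(NEWID())) % 500,           -- fake course ids
               'cmi.core.lesson_status',
               'completed'
        FROM master.dbo.spt_values a CROSS JOIN master.dbo.spt_values b;

        -- Time a representative query at the current table size.
        SET @start = GETDATE();
        SELECT COUNT(*) FROM ScormTracking WHERE UserId = 42 AND CourseId = 7;
        PRINT 'Rows: ' + CAST((@batch + 1) * 10000 AS VARCHAR(20))
            + ', ms: ' + CAST(DATEDIFF(ms, @start, GETDATE()) AS VARCHAR(20));

        SET @batch = @batch + 1;
    END;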

Triynko
You can be creative generating the dummy data. If a table column consists of English text, flood it with random words from a dictionary. If it contains names, download a list of names, cross them to produce fake full names, then flood the table with them at the expected frequency.
Triynko
+1 Nice practical tip there.
Wayne Koorts