I'm building a system using Django, Sphinx and MySQL that's very quickly becoming quite large. The database currently has about 2,000 rows, and I've written a program that's going to populate it with another 40,000 rows in a couple of days. Since the database is live right now, and since I've never had a database with this much information in it, I'm worried about a few things:

  1. Is adding all these rows going to seriously degrade the performance of my Django app? Will I need to go back through it and optimize all my database calls so they're doing things more cleverly? Or will this make the database slow across the board, to the extent that I can't do anything about it at all?

  2. If you scoff at my 40k rows, then, my next question is, at what point SHOULD I be concerned? I will likely be adding another couple hundred thousand soon, so I worry, and I fret.

  3. How is Sphinx going to feel about all this? Is it going to freak out when it realizes it has to index all this data? Or will it be fine? Is this normal for it? If it is, at what point should I be concerned that it's too much data for Sphinx?

Thanks for any thoughts.

+1  A: 

For ordinary queries 2,000 rows is nothing: even without an index it will be very fast, because the entire table can be cached in memory. 100,000 rows should also work fine in most situations, although if you do not have appropriate indexes, or your queries aren't using the available indexes, you will start to notice it by then: queries that should take seconds could take minutes if they don't use indexes correctly. It shouldn't take long to fix the problem, though. Run EXPLAIN on the slow query to see why it is slow, and work out which indexes you need.
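To make the EXPLAIN step concrete, here is a minimal sketch of the before-and-after check. So that it runs without a database server, it uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` as a stand-in for MySQL's `EXPLAIN`; the output format differs between the two, but the idea (full scan versus index lookup) is the same. The `document` table, its columns, and the index name are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real MySQL one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document (id INTEGER PRIMARY KEY, author TEXT, title TEXT)"
)
# Roughly the size discussed in the question: 2,000 existing + 40,000 new rows.
conn.executemany(
    "INSERT INTO document (author, title) VALUES (?, ?)",
    [(f"author-{i % 50}", f"title-{i}") for i in range(42_000)],
)


def plan(sql):
    """Return the human-readable 'detail' column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


query = "SELECT title FROM document WHERE author = 'author-7'"

# Without an index on `author`, the plan reports a scan of the whole table.
plan_before = plan(query)

# Hypothetical index on the column used in the WHERE clause.
conn.execute("CREATE INDEX document_author_idx ON document (author)")

# With the index in place, the plan reports a search using that index.
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

On MySQL the equivalent check would be `EXPLAIN SELECT ...` on the slow query, followed by a `CREATE INDEX` on whichever column the plan shows being scanned.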

By the time you get to millions or tens of millions of rows, you will have to think more carefully about your database design and your indexing strategy. It's possible to have hundreds of millions of rows in a table if you do things right.

Mark Byers
So, as this relates to Django: the database tables were created for me. Does Django create the indexes described above correctly?
mlissner
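Not for arbitrary columns. Django only indexes primary keys, foreign keys and unique fields automatically; for anything else you filter or sort on, you have to create the indexes yourself.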
Mark Byers
Cool. I did this, and my lookup times have improved dramatically.
mlissner