views:

171

answers:

1

Hi, I am working on a project, involving large amount of data from the delicious website.The data available is at files are "Date,UserId,Url,Tags" (for each bookmark). I normalized my database to a 3NF, and because of the nature of the queries that we wanted to use In combination I came down to 6 tables....The design looks fine, however, now a large amount of data is in the database, most of the queries needs to "join" at least 2 tables together to get the answer, sometimes 3 or 4. At first, we didn't have any performance issues, because for testing matters we haven't had added too much data in the database. No that we have a lot of data, simply joining extremely large tables does take a lot of time and for our project which has to be real-time is a disaster.I was wondering how big companies solve these issues.Looks like normalizing tables just adds complexity, but how does the big company handle large amounts of data in their databases, don't they do the normalization? thanks

+4  A: 

Since you asked about how big companies (generally) approaches this:

They usually have a dba(database administrator) who lives and breathes the database the company uses.

This means they have people that know everything from how to design the tables optimally, profile and tune the queries/indexes/OS/server to knowing what firmware revision of the RAID controller that can cause problems for the database.

You don't talk much about what kind of tuning you've done, e.g.

  • Are you using MyISAM or InnoDB tables ? Their performance(and not the least their features) is radically different for different workloads.
  • Are the tables properly indexed according to the queries you run ?
  • run EXPLAIN on all your queries - which will help you identify keys that could be added/removed, wether the proper keys are selected, compare queries(SQL leaves you with lots of way to accomplish the same things)
  • Have you tuned the query-cache ? For some workloads the query cache(default on) can cause considerable slowdown.
  • How much memory do your box have , and is mysql tuned to take advantage of this ?
  • Do you use a file system and raid setup geared towards the database ?
  • Sometimes a little de-normalization is needed.
  • Different database products will have different charasteristics, MySQL might be blazingly fast for some worlkoads, and slow for others.
leeeroy
Thanks for you reply. Honestly, I was not thinking of the things that you mentioned here. Now I realized that I have ignored a lot when I was designing the database. The database uses MyISAM. I will read about the things you mentioned. thanks alot
Hossein