tags:

views:

219

answers:

7

I have a pretty large social network type site I have working on for about 2 years (high traffic and 100's of files) I have been experimenting for the last couple years with tweaking things for max performance for the traffic and I have learned a lot. Now I have a huge task, I am planning to completely re-code my social network so I am re-designing mysql DB's and everything.

Below is a photo I made up of a couple mysql tables that I have a question about. I currently have the login table which is used in the login process, once a user is logged into the site they very rarely need to hit the table again unless editing a email or password. I then have a user table which is basicly the users settings and profile data for the site. This is where I have questions, should it be better performance to split the user table into smaller tables? For example if you view the user table you will see several fields that I have marked as "setting_" should I just create a seperate setting table? I also have fields marked with "count" which could be total count of comments, photo's, friends, mail messages, etc. So should I create another table to store just the total count of things?

The reason I have them all on 1 table now is because I was thinking maybe it would be better if I could cut down on mysql queries, instead of hitting 3 tables to get information on every page load I could hit 1.

Sorry if this is confusing, and thanks for any tips.

alt text

+1  A: 

You'll need to compare performance testing results between the following:

  1. Leaving it alone
  2. Breaking it up into two tables
  3. Using different queries to retrieve the login data and profile data (if you're not doing this already) with all the data in the same table

Also, you could implement some kind of caching strategy on the profile data if the usage data suggests this would be advantageous.

Jon Seigel
All good points and I have done probably hundreds of hours of testing over the past 2 years on this same site, I got it pretty fast but now I am re-coding everything and this is the perfect chance to re-arrange any DB tables. The hard-er part is testing is and breaking it up and testing it because honestly the difference can not be that great, however when you have a lot of traffic and millions of mysql records it could change. thanks for the tips though
jasondavis
Absolutely. It might be helpful if you posted your testing results, as those are what really matter here.
Jon Seigel
A: 

I wouldn't consider your user table terrible large in number of columns, just my opinion. I also wouldn't break that table into multiple tables unless you can find a case for removal of redundancy. Perhaps you have a lot of users who have the same settings, that would be a case for breaking the table out.

Jay
+2  A: 

As long as you don't SELECT * FROM your tables, having 2 or 100 fields won't affect performance. Just SELECT only the fields you're going to use and you'll be fine with your current structure.

Patonza
I understand this, sorry I wasn't more clear in this case by large I mean by how manycolumns are in the user table
jasondavis
It's ok, what I meant is that number of columns in the table won't affect performance. On most DB Engines anyway... are you using InnoDB?
Patonza
actually I believe the current user table is not InnoDB but I will probably make this new table be InnoDB for the row locking vs table locking
jasondavis
+1  A: 

should I just create a seperate setting table?

So should I create another table to store just the total count of things?

There is not a single correct answer for this, it depends on how your application is doing.

What you can do is to measure and extrapolate the results in a dev environment.

In one hand, using a separate table will save you some space and the code will be easier to modify.

In the other hand you may lose some performance ( and you already think ) by having to join information from different tables.

About the count I think it's fine to have it there, although it is always said that is better to calculate this kind of stuff, I don't think for this situation it hurt you at all.

But again, the only way to know what's better your you and your specific app, is to measuring, profiling and find out what's the benefit of doing so. Probably you would only gain 2% of improvement.

OscarRyz
+2  A: 

You should consider putting the counter-columns and frequently updated timestamps in its own table --- every time you bump them the entire row is written.

Alex Brasetvik
A: 

Should take into account the average size of a single row, in order to find out if the retrieval is expensive. Also, should try to use indexes as while looking for data... The most important thing is to design properly, not just to split because "it looks large". Maybe the IP or IPs could go somewhere else... depends on the data saved there.

Also, as the socialnetworksite using this data also handles auth and autorization processes (guess so), the separation between login and user tables should offer a good performance, 'cause the data on login is "short enough", while the access to the profile could be done only once, inmediately after the successful login. Just do the right tricks to improve DB performance and it's done.

(Remember to visualize tables as entities, name them as an entity, not as a collection of them)

Alfabravo
thanks for the tips, could you ellaborate on what you mean by (Remember to visualize tables as entities, name them as an entity, not as a collection of them)? Thanks
jasondavis
Right. What I meant was (as an old habit, somewhat useful) to name the tables in singular, heh. No big deal, it just means you have a collection of rows, each one related to a... login,... user,... location,... and so on. Nothing special, though.
Alfabravo
A: 

Two things you will want to consider when deciding whether or not you want to break up a single table into multiple tables is:

  1. MySQL likes small, consistent datasets. If you can structure your tables so that they have fixed row lengths that will help performance at the potential cost of disk space. One thing that from what I can tell is common is taking fixed length data and putting it in its own table while the variable length data will go somewhere else.

  2. Joins are in most cases less performant than not joining. If the data currently in your table will normally be accessed all at the same time then it may not be worth splitting it up as you will be slowing down both inserts and quite potentially reads. However, if there is some data in that table that does not get accessed as often then that would be a good candidate for moving out of the table for performance reasons.

I can't find a resource online to substantiate this next statement but I do recall in a MySQL Performance talk given by Jay Pipes that he said the MySQL optimizer has issues once you get more than 8 joins in a single query (MySQL 5.0.*). I am not sure how accurate that magic number is but regardless joins will usually take longer than queries out of a single table.

Mike Lively