views:

60

answers:

4

Designing a user content website (kind of similar to Yelp but for a different market and with photo sharing) and had a few database questions:

  1. Does each user get their own set of tables, or are we storing all users' data in common tables? Since this is also something of a social network, databases are usually partitioned off as the user base grows, with different sets of users split out separately, so what is the best approach? I guess some data like user accounts can live in common tables, but for wall posts, photos, etc. each user will get their own table? If so, then with 10 million users that means 10 million times whatever number of tables per user? This is currently being designed in MySQL.

  2. How do the user tables know what to create each time a user joins the site? I am assuming there is a system table template from which the fields are pulled?

  3. Following on from the above, if tomorrow we modify tables or add/remove features, how do the changes roll down to all the live user accounts/tables? I know from a page point of view we have the master template, but for the database, how will the user tables be updated? Is that something we do manually, or will each table check with the system tables, say every 24 hrs, for updates to its structure?

  4. If the above is all true, that means we are maintaining one master set of tables with system default values, and each user gets the same values copied to their tables? Take a field like maximum failed login attempts before the system locks the account: say we have a system default of 5 attempts within 30 minutes, but I want to allow users to specify their own number to customize their own security. Does that mean they can override the system default in their own table?

    Thanks.

+1  A: 
  1. Users should not get their own set of tables. It will most likely not perform as well as one table (properly indexed), and schema changes will have to be deployed to all user tables.
  2. You could have default values specified on the table for things that are optional.
  3. With difficulty. With one set of tables it will be a lot easier, and probably faster.
  4. That sort of data should be stored in a User Preferences table that stores all preferences for all users. Again, don't duplicate the schema for all users.
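A sketch of what point 4 might look like in MySQL (table and column names here are illustrative, not from the question; the `users` table is assumed to exist with an `id` primary key):

```sql
-- Hypothetical system-wide defaults, one row per setting.
CREATE TABLE system_defaults (
    setting_name  VARCHAR(64)  PRIMARY KEY,
    setting_value VARCHAR(255) NOT NULL
);

-- Per-user overrides; a row exists only when a user customizes a setting.
CREATE TABLE user_preferences (
    user_id       INT          NOT NULL,
    setting_name  VARCHAR(64)  NOT NULL,
    setting_value VARCHAR(255) NOT NULL,
    PRIMARY KEY (user_id, setting_name),
    FOREIGN KEY (user_id) REFERENCES users (id)
);

-- Effective value: the user's override if present, otherwise the default.
SELECT COALESCE(up.setting_value, sd.setting_value) AS max_failed_logins
FROM system_defaults sd
LEFT JOIN user_preferences up
       ON up.setting_name = sd.setting_name AND up.user_id = 42
WHERE sd.setting_name = 'max_failed_login_attempts';
```

No per-user tables, no copying defaults around: the default lives in one place, and the `LEFT JOIN`/`COALESCE` picks the override when one exists.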
FrustratedWithFormsDesigner
+1  A: 

Generally, creating separate tables for each entity (in this case, users) is not a good idea. If each table is separate, querying may be cumbersome.

If your table is large you should optimize the table with indexes. If it gets very large, you also may want to look into partitioning tables.

This allows you to treat the table as one object even though it is logically split up - the DBMS handles most of the work and presents you with one object. You SELECT, INSERT, UPDATE, ALTER, etc. as normal, and the DB figures out which partition the SQL refers to and performs the command.
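For example, in MySQL 5.1+ a large shared table can be hash-partitioned on the user ID, so rows are spread across physical partitions while you still query one logical table (names below are illustrative):

```sql
-- One photos table for ALL users, split into 16 physical partitions
-- by a hash of user_id; queries still target the single logical table.
CREATE TABLE photos (
    photo_id BIGINT       NOT NULL AUTO_INCREMENT,
    user_id  INT          NOT NULL,
    caption  VARCHAR(255),
    uploaded DATETIME     NOT NULL,
    -- MySQL requires the partitioning column in every unique key:
    PRIMARY KEY (photo_id, user_id)
)
PARTITION BY HASH (user_id)
PARTITIONS 16;

-- Normal SQL; MySQL prunes to the relevant partition automatically.
SELECT photo_id, caption FROM photos WHERE user_id = 42;
```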

Not splitting up the tables by user, and instead using indexes and partitions, deals with scalability while maintaining performance. If you don't split up the tables manually, it also makes points 2, 3, and 4 moot.

Here's a link to partitioning tables (SQL Server-specific): http://databases.about.com/od/sqlserver/a/partitioning.htm

programatique
Querying would be done by concatenating some key (probably the user's name) to the table name. So, if the user is `Bob` the table for photos will be `Bob_Photos` (though the user's unique ID would be better than their name). It's doable, but only if nothing else will do (and I haven't seen that sort of situation yet).
FrustratedWithFormsDesigner
I definitely think querying can be done, just think it may be cumbersome. I cleared up my answer a bit. Thanks.
programatique
A: 

It doesn't make any kind of sense to me to create a set of tables for each user. If you have a common set of tables for all users then I think that avoids all the issues you are asking about.

dportas
I am worried about table size. There is heavy user tracking, recording every user movement on the site. If you think Google is bad you will hate us! Hence the database will become a monster to manage, so I am not sure, technically, how large tables can grow before we need to split them up or something.
Jarred
A: 

It sounds like you need to locate a primer on relational database design basics. Regardless of the type of application you are designing, you should start there. Learn how joins, indexes, and primary and foreign keys work, and learn about basic database normalization.
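As a minimal sketch of those basics (illustrative names, not from the question): all users share the same tables, and rows are related by keys rather than by table name:

```sql
CREATE TABLE users (
    id       INT         NOT NULL AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(64) NOT NULL UNIQUE
);

-- Every user's wall posts live in ONE table, linked by a foreign key.
CREATE TABLE wall_posts (
    id      INT      NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_id INT      NOT NULL,
    body    TEXT     NOT NULL,
    posted  DATETIME NOT NULL,
    FOREIGN KEY (user_id) REFERENCES users (id),
    INDEX idx_wall_posts_user (user_id)  -- keeps per-user lookups fast
);

-- One query serves any user; no per-user tables needed.
SELECT p.id, p.body
FROM wall_posts p
JOIN users u ON u.id = p.user_id
WHERE u.username = 'bob';
```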

It's not customary to create new tables on-the-fly in an application; it's usually unnecessary in a properly designed schema. Usually schema changes are done at deployment time. The only time "users" get their own tables is as an artifact of a provisioning decision, where each "user" is effectively a tenant in a walled-off garden; this only makes sense if each "user" (more likely, a company or organization) never needs access to anything that other users in the system have stored.

There are mechanisms for dealing with loosely structured information in databases (the most common method is called Entity-Attribute-Value), but if you find yourself reaching for this often, your problem is either not quite correctly modeled, or you may not actually need a relational database, in which case you might be better off with a document-oriented database like CouchDB/MongoDB.
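For reference, a bare-bones EAV layout looks something like this (illustrative only):

```sql
-- Entity-Attribute-Value: one row per (entity, attribute) pair.
-- Flexible, but queries and type safety suffer; use sparingly.
CREATE TABLE entity_attributes (
    entity_id INT          NOT NULL,
    attribute VARCHAR(64)  NOT NULL,
    value     VARCHAR(255),
    PRIMARY KEY (entity_id, attribute)
);

-- "What is entity 42's 'favorite_color'?"
SELECT value
FROM entity_attributes
WHERE entity_id = 42 AND attribute = 'favorite_color';
```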

Adding, based on your updated comments/notes:

Your concerns about the number of records in a particular table are most likely premature. Get something working first. Most modern DBMSes, including newer versions of MySQL, support mechanisms beyond indexes and clustered indexes that can help deal with large numbers of records: MS SQL Server lets you create a partition function on fields of a table, and MySQL 5.1+ has similar partitioning options based on hash functions, ranges, or other mechanisms.

Follow well-established conventions for database design, modeling your domain as sensibly as possible, then adjust when you run into problems. First adjust using the tools available within your choice of database, and consider more drastic measures only when you can prove they are needed. There are other kinds of denormalization that are more likely to make sense before you would even want to consider something as unidiomatic to database systems as a "table per user" model; even if I were going down that route, I'd probably consider something like materialized views first.
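For the high-volume tracking table mentioned in the comments, range partitioning by date (MySQL 5.1+ syntax; names and dates here are illustrative) is a common first step, since old partitions can be dropped cheaply:

```sql
CREATE TABLE user_events (
    event_id   BIGINT      NOT NULL AUTO_INCREMENT,
    user_id    INT         NOT NULL,
    event_type VARCHAR(32) NOT NULL,
    created    DATETIME    NOT NULL,
    -- the partitioning column must appear in every unique key:
    PRIMARY KEY (event_id, created)
)
PARTITION BY RANGE (TO_DAYS(created)) (
    PARTITION p2010_q1 VALUES LESS THAN (TO_DAYS('2010-04-01')),
    PARTITION p2010_q2 VALUES LESS THAN (TO_DAYS('2010-07-01')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
);

-- Retiring old tracking data is a metadata operation, not a huge DELETE:
ALTER TABLE user_events DROP PARTITION p2010_q1;
```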

JasonTrue