views:

52

answers:

4

What I have:

  • A MySQL database running on Ubuntu that maintains a large table of articles (similar to wordpress).
  • Need to relate a given article to another set of data. This set of data will be fairly large.
  • There maybe various sets of data that will be related.

The query:

  • Is it better to contain these various large sets of data within the same database of articles, which will have a large set of traffic on it?

or

  • Is it better to create different databases (on the same server) that relate by a primary key to the main database with the articles?
+2  A: 

Put them all in the same DB initially, until you find that there is a performance issue. Much easier than prematurely optimising.

Modern RDBMS are very good at optimising data access.

Visage
Totally agree. Just make your code flexible enough to work with both databases, and when (if) the time comes, split them up.
Pedro
+1 for stating the good rule of: Do not optimize until you have a problem. After that read all the mysql manual about optimizing and analyze what is taking time before randomly optimizing.
Thirler
A: 

If you need to connect frequently and read both of the records, you should put in a the same database. The server then won't have to run permission checks twice for each of your databases.

If you have serious traffic, you should consider using persistent connection for that query.

If you don't need to read them together frequently, consider to put on different machine. As the high traffic for the bigger database won't cause slow downs on the other.

Pentium10
Hi all thanks for your replies. You have given me more to think about. Currently we are running on 1 dedicated server.I like the view about running it in the same database for now (separate table obviously) and not preempting problems, just allow for future upgrades.I think the future upgrade path sounds like putting the related data on a separate machine, as Pentium 10 suggests.I mean the main database may have a million records and the related data may have the same if not more records. So I think this is the course of action.Does that sound right to people?ps. thanks again.
Jayp
A: 

Different databases on the same server gives you all the problems of a distributed architecture without any of the benefits of scaling out. One database per server is the way to go.

APC
A: 

When you say 'same database' and 'different databases related' don't you mean 'same table' vs 'different tables'?

if that's the question, i'd say:

  • one table for articles
  • if these 'other sets of data' are all of the same structure, put them all in the same table. if not, one table per kind of data.
  • everything on the same database

if you grow big enough to make database size a performance issue (after many million records and lots of queries a second), consider table partitioning or maybe replacing the biggest table with a key/value store (couchDB, mongoDB, redis, tokyo cabinet, [etc][6]), which can be a little faster than MySQL but a lot easier to distribute for performance.

[6]:key-value store

Javier