tags:

views:

327

answers:

4

Let's say you're creating a database to store messages for a chat room application. There's an infinite number of chat rooms (they're created at run-time on-demand), and all messages need to be stored in the database.

Would it be a mistaken to create one giant table to store messages for all chat rooms, knowing that there could eventually be billions of records in that one table?

Would it be more prudent to dynamically create a table for each room created, and store that room's messages only in that table?

+3  A: 

It would be proper to have a single table. When you have n tables which grows by application usage, you're describing using the database itself as a table of tables, which is not how an RDBMS is designed to work. Billions of records in a single table is trivial on a modern database. At that level, your only performance concerns are good indexes and how you do joins.

Rex M
To add to this answer... You can also federate a table based on some dimension. So mechanically, it looks like and performs like separate tables, but is seamlessly bound together using indexes. Again, more information from the OP is needed.
Mark Canlas
+2  A: 

Whilst a table per chat room could be performed, each database has limits over the number of tables that may be created, so given an infinite number of chat rooms, you are required to create an infinite number of tables, which is not going to work.

You can on the other hand store billions of rows of data, storage is not normally the issue given the space - retrieval of the information within a sensible time frame is however and requires careful planning.

You could partition the messages by a date range, and if planned out, you can use LUN migration to move older data onto slower storage, whilst leaving more recent data on the faster storage.

Andrew
+3  A: 

Billions of records?

Assuming you have constantly 1000 active users with 1 message per minute, this results in 1.5mio messages per day, and approx 500mio messages per year.

If you still need to store chat messages several years old (what for?), you could archive them into year-based tables.

I would definitely argue against dynamic creation of room-based tables.

devio
Agreed. Why would you need a database to log? Just use a .txt file to archive the chats. I'm assuming you'd need the logs for -legal- reasons which is why you probably thought of a database. A file system usage would be far better if you don't need to do read operations.
Zack
Sorry guys, but don't think so literally. This isn't the actual scenario but the premise is the same. Assume billions of records, regardless of the scenario you could imagine.
Chris
A: 

Strictly speaking, your design is right, a single table. fields with low entropy {e.g 'userid' - you want to link from ID tables, i.e following normal database normalization patterns}

you might want to think about range based partitioning. e.g 'copies' of your table with a year prefix. Or maybe even a just a 'current' and archive table

Both of these approaches mean that your query semantic is more complex {consider if someone did a multi-year search}, you would have to query multiple tables.

however, the upside is that your 'current' table will remain at a roughly constant size, and archiving is more straightforward. - {you can just drop table 2005_Chat when you want to archive 2005 data}

-Ace

phatmanace