views:

210

answers:

7

I was asked to think about a database for our application. There are about 7 different data need to be stored. One is identification data which may contains an unique serial number, time, location. All other 6 data sets (4 binary raw data, 2 text data) must be identified by the identification data. 3 of them are about 2 MB a record, others just few KB.

The maximum number of records to store is 1500. About 6 MB per row, so the maximum total data will be about 9 GB.

I am thinking to have just one table. But it looks to me is ugly with one table about 9 GB data.

Do you have a database like that? We may use the MySQL RDBMS.

A: 

One way you could break this is is by having one table that just has the serial number, time, and location. Then, in other tables, store the data sets with a foreign key referencing that table.

It is hard to provide more advice without more information about how your database will be used.

McPherrinM
+7  A: 

I'm having a hard time visualizing your data structure, but I'll give this as a general rule of thumb:

As long as the table can't be normalized any more, there is nothing inherently wrong with a single table of this size, other than performance. However, if the data is normalized and you've considered all other factors, I don't have a lot of suggestions.

However, I would look at whether it is really necessary to store the binary data in the database. If these are picture files, or documents, or something that you could store on the file system, I would recommend storing the files on the file systems and paths in the database. (This is a topic that comes up repeatedly on this site, and I agree with the majority that storing BLOBS when unnecessary is a bad idea.)

David Stratton
how about back up and security if save binary data in file and put path into database?
5YrsLaterDBA
Backups should be a part of normal operations, whether you're a single PC owner or a large corporation. Security is something you need to understand if you're a developer/DBA, what have you. If that's not in your realm of expertise, I would recommend hiring someone with that expertise or getting someone experienced to assist, as security is not something you want to learn by trial and error.
David Stratton
+1. Unstructured BLOBs don't lend themselves to being processed by SQL. Look in to an External Blob Store (EBS). In MySQL, under Settings you can add EBS as a storage engine.
fatcat1111
A: 

Think about the data you need to retrieve and what's needed in each query, particularly the common queries, from there work out your optimal database structure. Also, do you need to have the binary data retrieved with all the other elements? Could that be file-based with just file paths stored in the DB. The slimmer the data profile, the faster your queries and overall performance should be.

Lazarus
+2  A: 

I think MySQL will die with such workload :) And relational databases are not good for such tasks.

So your description looks very like Google Big Table. You can find interesting presentation on YouTube about it.

Take a look on open source implementation of such database Hadoop

Elalfer
+1. Interesting answer. I'll need to check out Google Big Table. Thank you.
David Stratton
are you saying the 9GB is too big for MySQL or one table plus 9GB will cause MySQL hard time?
5YrsLaterDBA
Maybe 9GB database is OK for MySQL, but big BLOB data could be a problem.
Elalfer
+1  A: 

I would change Daniel's solution slightly: create one table for the simple data fields, and then create separate tables for each type of binary objects. Like this:

Records
=======
SerialNumber (PK)
Time
Location
Text1
Text2

Images
=======
SerialNumber (FK)
Image1
Image2

Files
=======
SerialNumber (FK)
File1
File2
Josh Yeager
A: 

It's hard to reply without more information about what the data actually is.

But when you say that the identifier "may contains an unique serial number, time, location", that sounds likes you're putting three different things into one field, which is almost always a bad idea. Do you mean that sometimes it is a serial number, sometimes it is a time, and sometimes it is a location? If you have three different identifiers for an object, then it is very likely that this is three different kinds of object, and should be three different tables. When someone tells me that the primary key of their table is a string that sometimes holds customer account number, sometimes store zip code, and sometimes product part number, my immediate reaction is that this should be three tables: one for customers, one for stores, and one for products.

Likewise, what is this "other data" that you are storing? If it's always the same kind of Blob, than a single table makes sense. If they're different, then forcing them into the same table is likely a mistake.

Well, maybe that's not what you meant, I can't be sure.

Jay
A: 

The BLOB and TEXT columns in a table are not stored 'inline' - i.e. they do not contribute to the table size. BLOB/Binary data and Text are stored/managed by MySQL in a separate area of memory.

If you exclude those 2 kinds of fields, your rowsize will be 100 bytes (let's say). For 1500 rows, that will come to 150,000 bytes - 150 kilobytes. The BLOB and Text are part of the table, but in a sense they don't contribute to your 9 GB table size.

Hence don't worry or feel bad about the table size.

blispr