I have a large amount of scientific data that I need to store (150 TB+ to start with), and I want to know the best way to store it (NoSQL, RDBMS, etc.).
Any tips?
James
This really depends on what you need to do with the data later. If the data is a collection of a few very large files, then a normal file system would be OK. If you need to be able to search and analyse the data, then a database might be the best solution.
I also work with large datasets in a scientific environment. Most of this data is tabular, and when we started we stored every data point as a row in a database table. In the end we found it much easier to zip each table and store it as a binary blob in the database. In a separate table we stored the metadata about these tables.
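To illustrate the blob-plus-metadata layout described above, here is a minimal sketch in Python. It is not our actual setup: it assumes SQLite as the database, CSV as the table format, and zlib for compression, and the table names (`table_blob`, `table_meta`) and function names are made up for the example.

```python
import csv
import io
import sqlite3
import zlib


def open_store(path=":memory:"):
    """Open the database and create the two (hypothetical) tables."""
    conn = sqlite3.connect(path)
    # One row per compressed table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS table_blob ("
        "id INTEGER PRIMARY KEY, payload BLOB NOT NULL)"
    )
    # Searchable metadata, kept separate from the large payloads.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS table_meta ("
        "blob_id INTEGER PRIMARY KEY REFERENCES table_blob(id), "
        "name TEXT NOT NULL UNIQUE, n_rows INTEGER NOT NULL, "
        "columns TEXT NOT NULL)"
    )
    return conn


def save_table(conn, name, header, rows):
    """Serialise a table to CSV, compress it, and store blob + metadata."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    payload = zlib.compress(buf.getvalue().encode("utf-8"))
    cur = conn.execute(
        "INSERT INTO table_blob (payload) VALUES (?)", (payload,)
    )
    blob_id = cur.lastrowid
    conn.execute(
        "INSERT INTO table_meta (blob_id, name, n_rows, columns) "
        "VALUES (?, ?, ?, ?)",
        (blob_id, name, len(rows), ",".join(header)),
    )
    conn.commit()
    return blob_id


def load_table(conn, name):
    """Look a table up by name and return (header, rows) as strings."""
    row = conn.execute(
        "SELECT b.payload FROM table_meta m "
        "JOIN table_blob b ON b.id = m.blob_id WHERE m.name = ?",
        (name,),
    ).fetchone()
    if row is None:
        raise KeyError(name)
    text = zlib.decompress(row[0]).decode("utf-8")
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return header, list(reader)
```

The point of the split is that searches only touch the small metadata table, and a blob is decompressed only when a specific table is actually needed. At 150 TB you would probably use a different database and keep the payloads outside a single file, but the layout stays the same.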
To choose between NoSQL and an RDBMS, answer this question: "Is my data structured in relationships?"
There are specialised databases for scientific data: http://www.dbms2.com/2009/09/12/xldb-scid/
Does it have to be one database type? Part of the idea behind NoSQL is that one size does not fit all, so why not use two or more NoSQL stores? How about one column store and one graph database?