Hello,

I have around 1 GB of structured text data (currently stored in a MySQL database, more than 1 million records) that is used by my software. I have to ship this data along with the software.

I also need to protect this data. It shouldn't be accessible to users for other purposes. The software is just an interface to the data, like any database client.

What's the best method to do so? Please also tell me the pros and cons of the available methods so that I can decide for myself next time.

EDIT::

I think my question was not clear. My software operates on this data. It provides a query interface: it shows the user the data he needs by querying the DBMS that holds it. My major concern here is not the size of the software (along with this data). I don't want the user to redistribute this data. So storing encrypted data in a zipped file is not a solution.

  1. So I must use an embedded DBMS like SQLite. But how do I secure it so that the user cannot redistribute the data?
+2  A: 

Given that you already have the data in a database, you could send the user an encrypted SQLite database. There are various (paid) third party tools to encrypt data in SQLite transparently, or else you can just write encrypted data into SQLite in blobs, depending on the actual usage.

Also, depending on the technologies involved, you might be able to leverage the Microsoft Crypto API in the SQLite ADO.NET driver, for instance.
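As a rough illustration of the "write encrypted data into SQLite in blobs" option, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `records` table, the hard-coded secret, and above all the cipher, which is a toy SHA-256 keystream used only so the sketch runs with the standard library. A real product would swap in a vetted cipher (for example AES from a maintained crypto library).

```python
import hashlib
import hmac
import os
import sqlite3


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustration only, not a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_field(key: bytes, plaintext: str) -> bytes:
    """Return nonce + HMAC tag + ciphertext, ready to store in a BLOB column."""
    nonce = os.urandom(16)
    data = plaintext.encode("utf-8")
    cipher = bytes(a ^ b for a, b in zip(data, _keystream(key, nonce, len(data))))
    tag = hmac.new(key, nonce + cipher, hashlib.sha256).digest()
    return nonce + tag + cipher


def decrypt_field(key: bytes, blob: bytes) -> str:
    """Verify the tag, then reverse encrypt_field."""
    nonce, tag, cipher = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + cipher, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("blob was modified or the key is wrong")
    data = bytes(a ^ b for a, b in zip(cipher, _keystream(key, nonce, len(cipher))))
    return data.decode("utf-8")


def demo() -> str:
    # Hypothetical secret; in a shipped product the key lives inside the
    # application, which is exactly why a determined user can recover it.
    key = hashlib.sha256(b"hypothetical-app-secret").digest()
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, body BLOB)")
    conn.execute(
        "INSERT INTO records (id, body) VALUES (?, ?)",
        (1, encrypt_field(key, "some structured text")),
    )
    (blob,) = conn.execute("SELECT body FROM records WHERE id = 1").fetchone()
    conn.close()
    return decrypt_field(key, blob)
```

Note the trade-off: encrypted blobs cannot be searched or indexed by the database itself, so this only fits usage where rows are looked up by an unencrypted key and decrypted afterwards.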

Vinko Vrsalovic
A: 

You can compress it; I suggest GZip, since textual data compresses very well. You can also protect the data with some kind of encryption. I hope this helps.

Lukas Šalkauskas
I've edited my question. Please go through once.
claws
You can encrypt the field data in the database, or something like that. If you want to be even more secure, avoid simple reflection of the .NET libraries that do the encryption and decryption: write the library in pure C++, as a wrapper for the encryption, which can't be reflected. Maybe there are also tools that prevent a library from being reflected; I don't know. But no one will protect you from disassembly :)
Lukas Šalkauskas
A: 

Use whatever file format is the easiest to import and export. For MySQL, for example, that may simply be the output of mysqldump -- a long SQL file.

In the interest of space, you compress the result using something like gzip.

For security, you then take the result of that and perform whatever security transformations you want on it. For example, you can easily use GnuPG to encrypt the result (using either symmetric or public-key technologies).

Then, to restore it, you just run the operations in reverse: decrypt, decompress, and execute the SQL statements.

Any one of these technologies can be replaced with something else if appropriate. Each component should be chosen based on what will fit your situation the best. You may want, for example, to compress with LZMA, Zip, or Bzip compression instead. Or you may want to output/input CSV files instead of SQL.
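The dump, compress, protect pipeline and its reverse can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite's `iterdump()` stands in for `mysqldump`, the `records` table is made up, and the encryption step is left as a pair of hypothetical callables (`protect` / `unprotect`) that default to doing nothing, since GnuPG is an external tool you would call at that point.

```python
import gzip
import sqlite3
from typing import Callable

Transform = Callable[[bytes], bytes]


def export_db(conn: sqlite3.Connection, protect: Transform = lambda b: b) -> bytes:
    """Dump to SQL text, compress it, then apply the security transformation."""
    sql = "\n".join(conn.iterdump())
    return protect(gzip.compress(sql.encode("utf-8")))


def import_db(package: bytes, unprotect: Transform = lambda b: b) -> sqlite3.Connection:
    """Run the steps in reverse: undo protection, decompress, execute the SQL."""
    sql = gzip.decompress(unprotect(package)).decode("utf-8")
    conn = sqlite3.connect(":memory:")
    conn.executescript(sql)
    return conn


def round_trip_demo() -> list:
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT)")
    source.executemany("INSERT INTO records (body) VALUES (?)", [("alpha",), ("beta",)])
    source.commit()
    package = export_db(source)   # bytes you would ship
    restored = import_db(package)  # what the installer would rebuild
    return restored.execute("SELECT body FROM records ORDER BY id").fetchall()
```

Because each stage only takes and returns bytes, swapping gzip for LZMA or the SQL dump for CSV changes one function without touching the others.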

tylerl
I think that his concern is that after the data is decrypted and in the SQL database (or wherever else), it will be accessible by the user...
Vinko Vrsalovic
I've edited my question. Please go through once.
claws
+3  A: 
  1. Ship it on a DVD accompanying your product.
  2. Make it available for download from a website.
  3. If bandwidth allows, provide a web service and have all your users connect to your central database via the web service instead of having standalone instances of your data.

Once you distribute data, it's more or less out of your control. You can do things like encode or encrypt it, but it's not a foolproof solution. The only real, sure-fire way to protect your data is to not give it out.

That having been said, there are a few possibilities:

  • Encrypt the database in some way. If the database isn't for reporting and doesn't need to have a lot of general queries run against it, you can encrypt at the row or field level. This will REALLY slow down the system, though.
  • If using an embedded database, encrypt the entire file. This will significantly slow down startup and shutdown time for your app, though.

Unfortunately, once you distribute your data, it's a losing battle to try to secure it. A better bet, if you can, is to provide a query web service so that your data are never archived in toto by a user - you just send them query results over the wire. You can adopt a user authentication strategy to verify that only legitimate users use the system, and maintain tighter control over your data that way.
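To make the query-service idea a bit more concrete, here is a minimal Python sketch of the server-side core. The HTTP layer is left out, and the names (`QueryService`, `MAX_ROWS`, the `records` table, the token scheme) are all hypothetical, chosen for the example rather than taken from any particular framework.

```python
import hmac
import sqlite3

MAX_ROWS = 50  # hypothetical cap on rows returned by a single query


class QueryService:
    """Answers queries on the server; clients only ever see result rows."""

    def __init__(self, conn: sqlite3.Connection, api_tokens):
        self._conn = conn
        self._tokens = [t.encode("utf-8") for t in api_tokens]

    def _is_authorized(self, token: str) -> bool:
        candidate = token.encode("utf-8")
        # Constant-time comparison so timing does not leak token contents.
        return any(hmac.compare_digest(candidate, known) for known in self._tokens)

    def search(self, token: str, term: str, limit: int = MAX_ROWS) -> list:
        if not self._is_authorized(token):
            raise PermissionError("unknown API token")
        limit = max(1, min(limit, MAX_ROWS))  # never return more than the cap
        cursor = self._conn.execute(
            "SELECT id, body FROM records WHERE body LIKE ? ORDER BY id LIMIT ?",
            (f"%{term}%", limit),
        )
        return cursor.fetchall()
```

The client application would send the user's token and search term over the wire and render whatever rows come back; the full table never leaves the server.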

EDIT: see my final paragraph. If you're distributing your data, you no longer have explicit control over who can come into possession of it, so any "control" of your data must be due to encryption or a similar protection. Unfortunately, you have to distribute a key at some point to the user (or the software the user's operating) in order to access the data. A sufficiently sophisticated user will then be capable of capturing the key and breaking your data protection. But, given that complete security is an impossibility once your data are in the hands of the user, you can give it your best effort by applying encryption. Your only two real options at that point are to encrypt data per row (or field?) or to encrypt the entire embedded database file.

Dathan
I've edited my question. Please go through once.
claws
+1  A: 

Just write "Please Do NOT Crack" on the sticker.

S.Mark
lol.. good one man!!
claws
:-) Actually, it's one of the proper ways. Whether it works depends on the person.
S.Mark
+1. Yes it's a joke answer but come on, those anti-reverse-engineering laws are also jokes.
overslacked
+1 :-) Exactly. If your program and data are installed on a user-controlled machine, there is no way you can stop the user from decrypting the data. The best you can do is make it hard enough that they won't bother trying because it's not worth it. OTOH, some form of EULA will make it possible for you to take legal action if a user redistributes your data.
Tomek Szpakowicz
+3  A: 

It sounds like you're looking for a DRM mechanism: you're planning to put locked data, plus the key to that lock, in the hands of users, with something to try to prevent them from using the key in a way you don't like. So any software-based method you come up with will have the same limitations as existing DRM systems.

The only two things I can think of that might actually work would be:

  1. keeping the data on your own servers, and then trying to limit calls to it (because if too many users query too much, they could build their own copy of the database), or

  2. shipping it to users but with a physical protection mechanism, which is kind of like #1 except you're putting it in their hands, and then sealing it such that it self-destructs if you try to open it (tech support is going to be fun!)

Any software-only encryption is a temporary solution at best: for it to be at all useful, the data needs to exist at some point in time in unencrypted form for your software to use it, and that means they can read it themselves.
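For option 1, "limiting calls" could look something like the following Python sketch: a per-user quota on rows returned within a sliding time window. The class name, the limits, and the idea of counting rows rather than requests are all assumptions made for the example.

```python
import time
from collections import defaultdict, deque


class RowQuota:
    """Caps how many rows each user may fetch within a sliding time window."""

    def __init__(self, max_rows: int, window_seconds: float, clock=time.monotonic):
        self._max_rows = max_rows
        self._window = window_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._history = defaultdict(deque)  # user -> deque of (timestamp, rows)

    def allow(self, user: str, rows_requested: int) -> bool:
        now = self._clock()
        history = self._history[user]
        # Forget requests that have aged out of the window.
        while history and now - history[0][0] >= self._window:
            history.popleft()
        used = sum(rows for _, rows in history)
        if used + rows_requested > self._max_rows:
            return False
        history.append((now, rows_requested))
        return True
```

A quota like this only slows down bulk copying; a patient user, or several cooperating accounts, can still accumulate the data over time, so it raises the cost rather than removing the risk.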

Ken