I want to store confidential data in a digitally signed file, so that I know when its contents have been tampered with.

My initial thought is that the data will be stored in NVPs (name value pairs), with some kind of CRC or other checksum to verify the contents.

I am thinking of implementing the creation (i.e. writing) and verification (i.e. reading) of such a file in ANSI C++.

Assuming this is the data I want to store:

    //Unencrypted, raw data to be stored in file
    struct PrivateInfo {
         double age, weight;
         FitnessScale fitness;
         Location  loc;
         OtherStuff stuff;
    };

    //128-bit Encrypted Data (Payload to be stored in file)
    struct EncryptedData {
     // unknown fields/format ??

    };

[After I have read a few responses to this question]

Judging by the comments I have received so far, I fear people are getting sidetracked by the word "licensing", which seems to be a red flag to most people. I suspected that might be the case, but in today's atmosphere of heightened security and general nervousness, I thought I'd better detail what I needed to be "hiding", lest someone thought I was thinking of passing on the "Nuke password" to some terrorists or something. I will now remove the word "license" from my question.

View it more as a technical question. Imagine I am a student (which I am), and that I am trying to find out about recommended (or best practices) for encoding information that needs to be secure.

Mindful of the above, I will reformat my questions thus:

  1. Given a struct with fields of different data types, what is the "recommended" algorithm for giving it "reasonably secure" encryption? (I still prefer to use 128-bit, but that's just me.)
  2. What is a recommended way of providing a ROBUST check on the encrypted data, so that I can use the check value to tell whether the contents of the file (the payload of encrypted data) differ from the original?
+1  A: 

Err, why not use a well-known encryption system like GPG?

Alex
+7  A: 

First, note that "signing" data (to notice when it has been tampered with) is a completely separate and independent operation from "encrypting" data (to prevent other people from reading it).

That said, the OpenPGP standard does both. GnuPG is a popular implementation: http://www.gnupg.org/gph/en/manual.html

Basically you need to:

  • Generate a keypair, but don't bother publishing the public part.
  • Sign and encrypt your data (this is a single operation in gpg)
  • ... storage ...
  • Decrypt and check the signature (this is also a single operation).

But beware that this is only of any use if you can store your private key more securely than you store the rest of the data. If you can't guarantee the security of the key, then GPG can't help you against a malicious attempt to read or tamper with your data, and neither can any other encryption or signing scheme.

Forgetting encryption, you might think that you can sign the data on some secure server using the private key, then validate it on some user's machine using the public key. This is fine as far as it goes, but if the user is malicious and clever, then they can invent new data, sign it using their own private key, and modify your code to replace your public key with theirs. Their data will then validate. So you still need the storage of the public key to be tamper-proof, according to your threat-model.

You can implement an equivalent yourself, something along the lines of:

  • Choose a longish string of random characters. This is your key.
  • Concatenate your data with the key. Hash this with a secure hash function (SHA-256). Then concatenate the resulting hash with your data, and encrypt it using the key and a secure symmetric cipher (AES).
  • ... storage ...
  • Decrypt the data, chop off the hash value, put back the key, hash it, and compare the result to the hash value to verify that it has not been modified.
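
As a rough illustration of the hash-and-check part of those steps, here is a minimal C++ sketch. It is not a vetted implementation: the names `Bytes`, `Digest`, `seal` and `unseal` are made up for this example, the AES encryption step is left out entirely, and the SHA-256 routine is written inline only so that the sketch has no dependencies. In real code you would take both the hash and the cipher from an established crypto library.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;      // hypothetical alias
using Digest = std::array<std::uint8_t, 32>;  // hypothetical alias

static std::uint32_t rotr(std::uint32_t x, int n) { return (x >> n) | (x << (32 - n)); }

// Plain SHA-256, inlined for illustration only.
Digest sha256(const Bytes& msg) {
    static const std::uint32_t K[64] = {
        0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5,0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5,
        0xd807aa98,0x12835b01,0x243185be,0x550c7dc3,0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174,
        0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc,0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da,
        0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7,0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967,
        0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13,0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85,
        0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3,0xd192e819,0xd6990624,0xf40e3585,0x106aa070,
        0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5,0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3,
        0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208,0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2};
    std::uint32_t h[8] = {0x6a09e667,0xbb67ae85,0x3c6ef372,0xa54ff53a,
                          0x510e527f,0x9b05688c,0x1f83d9ab,0x5be0cd19};
    // Pad: 0x80, zeros up to 56 mod 64, then the bit length as 8 big-endian bytes.
    Bytes m(msg);
    m.push_back(0x80);
    while (m.size() % 64 != 56) m.push_back(0);
    const std::uint64_t bits = static_cast<std::uint64_t>(msg.size()) * 8;
    for (int i = 7; i >= 0; --i) m.push_back(static_cast<std::uint8_t>(bits >> (8 * i)));

    for (std::size_t off = 0; off < m.size(); off += 64) {
        std::uint32_t w[64];
        for (int i = 0; i < 16; ++i)
            w[i] = (std::uint32_t(m[off + 4 * i]) << 24) | (std::uint32_t(m[off + 4 * i + 1]) << 16) |
                   (std::uint32_t(m[off + 4 * i + 2]) << 8) | std::uint32_t(m[off + 4 * i + 3]);
        for (int i = 16; i < 64; ++i) {
            const std::uint32_t s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3);
            const std::uint32_t s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10);
            w[i] = w[i - 16] + s0 + w[i - 7] + s1;
        }
        std::uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4], f = h[5], g = h[6], hh = h[7];
        for (int i = 0; i < 64; ++i) {
            const std::uint32_t t1 = hh + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)) +
                                     ((e & f) ^ (~e & g)) + K[i] + w[i];
            const std::uint32_t t2 = (rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)) +
                                     ((a & b) ^ (a & c) ^ (b & c));
            hh = g; g = f; f = e; e = d + t1; d = c; c = b; b = a; a = t1 + t2;
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e; h[5] += f; h[6] += g; h[7] += hh;
    }
    Digest out;
    for (int i = 0; i < 8; ++i)
        for (int j = 0; j < 4; ++j) out[4 * i + j] = static_cast<std::uint8_t>(h[i] >> (24 - 8 * j));
    return out;
}

// Hash (data || key), as in the second bullet.
static Digest keyedHash(const Bytes& data, const Bytes& key) {
    Bytes keyed(data);
    keyed.insert(keyed.end(), key.begin(), key.end());
    return sha256(keyed);
}

// Returns data || hash. This is what would then be encrypted (not shown).
Bytes seal(const Bytes& data, const Bytes& key) {
    const Digest tag = keyedHash(data, key);
    Bytes record(data);
    record.insert(record.end(), tag.begin(), tag.end());
    return record;
}

// Takes the already-decrypted record, chops off the hash, recomputes it
// with the key, and compares. Fills dataOut only if the check passes.
bool unseal(const Bytes& record, const Bytes& key, Bytes& dataOut) {
    if (record.size() < 32) return false;
    const Bytes data(record.begin(), record.end() - 32);
    const Digest expected = keyedHash(data, key);
    if (!std::equal(expected.begin(), expected.end(), record.end() - 32)) return false;
    dataOut = data;
    return true;
}
```

A caller would pass the serialized bytes of the data to `seal`, encrypt the result before writing it out, and after decrypting pass it to `unseal`, treating a `false` return as "file has been modified".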

This will likely be faster and use less code in total than gpg: for starters, PGP is public key cryptography, and that's more than you require here. But rolling your own means you have to do some work, and write some of the code, and check that the protocol I've just described doesn't have some stupid error in it. For example, it has potential weaknesses if the data is not of fixed length, which HMAC solves.

Good security avoids doing work that some other, smarter person has done for you. This is the virtuous kind of laziness.

Steve Jessop
Thanks for your informative answer, Steve. It's people like you that make StackOverflow the great site that it is.
Stick it to THE MAN
"Good security avoids doing work that some other, smarter person has done for you. This is the virtuous kind of laziness." Heh heh, I like that kind of laziness. :)
Stick it to THE MAN
A: 

The answers to the edited question depend on the specific scenario.

For q1 (encryption): if you encrypt and decrypt at your servers you can use a symmetric key algorithm. Otherwise you may want to use public key cryptography.

For q2, if you simply want to check whether a file has changed, you can use any cryptographic hash such as SHA-1, assuming that you can make sure that the hash itself wasn't changed.

If the data generator and the verifier are both secure, you can use a MAC algorithm such as HMAC to verify that the data and the MAC match. But this works only if the secret key remains secret. Otherwise, you may be able to use digital signatures.
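
One detail of the "data and MAC match" check: the stored MAC is usually compared with the recomputed one in a way that does not stop at the first differing byte, so that timing does not hint at how much of a forged MAC was right. A minimal C++ sketch of such a comparison follows. The name `tagsMatch` is made up for this example, and it assumes the two MACs have already been computed elsewhere. A compiler is free to optimize this loop, so the comparison helper from a crypto library is the safer choice in production.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true only if both tags have the same length and the same bytes.
// The loop always visits every byte instead of returning at the first mismatch.
bool tagsMatch(const std::vector<std::uint8_t>& a, const std::vector<std::uint8_t>& b) {
    if (a.size() != b.size()) return false;  // the tag length is not a secret
    std::uint8_t diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        diff = static_cast<std::uint8_t>(diff | (a[i] ^ b[i]));  // accumulate any difference
    return diff == 0;
}
```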

Amnon
SHA-1 is no longer regarded as cryptographically secure. I'd suggest using SHA-256 to generate a signing hash.
Andy Johnson
A: 

I'm going to change the phrasing of the question and see if it makes people happier (or I get downvoted). There are really two types of questions being asked:

  1. You are making some computer game and you want to know if someone has been messing with your save files. (data signing)

  2. You are writing a messaging program and want to keep people's message logs private. (data encryption)

I will deal with the second one (data encryption). It's a massively difficult topic, and you should be looking for pre-built programs (such as PGP/GPG); even then, it's going to take you a lot of time to understand and use them properly. Think about encryption like this: it will be broken; your job is to make it not worth the effort. In other words, make the effort required to break it greater than the value of the information.

As for the first one, again, it can be broken. But a checksum is a good idea. See Amnon's answer for some links on that.

Hope this points you in the right direction. I'm not an expert on either topic, but I hope this gives you a starting point. (You might want to re-phrase the question and see if you get some better answers.)

James Brooks