I have a job-processing architecture based on AWS that requires EC2 instances to query S3 and SQS. In order for running instances to have access to these APIs, the credentials are sent as user data (-f) in the form of a base64-encoded shell script. For example:

$ cat ec2.sh
...
export AWS_ACCOUNT_NUMBER='1111-1111-1111'
export AWS_ACCESS_KEY_ID='0x0x0x0x0x0x0x0x0x0'
...
$ zip -P 'secret-password' ec2.zip ec2.sh
$ openssl enc -base64 -in ec2.zip -out ec2.b64

Many instances are launched...

$ ec2run ami-a83fabc0 -n 20 -f ec2.b64

Each instance decodes ec2.b64 and decrypts the resulting ec2.zip using the 'secret-password', which is hard-coded into an init script (a rough sketch of that step follows the list below). Although it does work, I have two issues with my approach.

  1. 'zip -P' is not very secure
  2. The password is hard-coded in the instance (it's always 'secret-password')
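
For reference, the instance-side step mentioned above looks roughly like this (the temporary paths are illustrative; the real logic lives in the init script):

$ curl -s http://169.254.169.254/1.0/user-data > /tmp/ec2.b64
$ openssl enc -d -base64 -in /tmp/ec2.b64 -out /tmp/ec2.zip
$ unzip -P 'secret-password' -d /tmp /tmp/ec2.zip
$ . /tmp/ec2.sh    # exports the AWS_* variables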

The method is very similar to the one described here

Is there a more elegant or accepted approach? Using gpg to encrypt the credentials and storing the private key on the instance to decrypt them is an approach I'm considering now, but I'm unaware of any caveats. Can I use the AWS keypairs directly? Am I missing some super obvious part of the API?
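
The gpg variant I'm considering would look roughly like this (the recipient name is made up; the instance would carry the matching private key):

$ gpg --encrypt --recipient 'ec2-worker' --output ec2.sh.gpg ec2.sh

...and on the instance:

$ gpg --decrypt --output ec2.sh ec2.sh.gpg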

+1  A: 

You can store the credentials on the machine (or transfer, use, then remove them.)

You can transfer the credentials over a secure channel (e.g. using scp with non-interactive, key-based authentication), so you would not need to perform any custom encryption. Just make sure that permissions are properly set to 0400 on the key file at all times, e.g. set the permissions on the master files and use scp -p.
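
A minimal sketch of such a push, assuming key-based SSH is already set up (the key file, host, and destination are illustrative):

$ chmod 0400 aws-credentials
$ scp -p -i deploy-key.pem aws-credentials ec2-user@ec2-host:.aws-credentials

scp -p preserves the 0400 mode set on the master file, so the copy lands with the right permissions.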

If the above does not answer your question, please provide more specific details re. what your setup is and what you are trying to achieve. Are EC2 actions to be initiated on multiple nodes from a central location? Is SSH available between the multiple nodes and the central location? Etc.


EDIT

Have you considered parameterizing your AMI, requiring those who instantiate it to first populate the user data (ec2-run-instances -f user-data-file) with their AWS keys? Your AMI can then dynamically retrieve these per-instance parameters from http://169.254.169.254/1.0/user-data.
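
A first-boot script could retrieve them along these lines (assuming the user data is a sourceable shell fragment, as in the question):

$ curl -s http://169.254.169.254/1.0/user-data > /tmp/params.sh
$ . /tmp/params.sh    # sets AWS_ACCESS_KEY_ID etc., as provided by whoever launched the instance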


UPDATE

OK, here goes a security-minded comparison of the various approaches discussed so far:

  1. Security of data when stored in the AMI user-data unencrypted
    • low
    • clear-text data is accessible to any user who manages to log onto the AMI and has access to telnet, curl, wget, etc. (can access clear-text http://169.254.169.254/1.0/user-data)
    • you are vulnerable to proxy request attacks (e.g. attacker asks the Apache that may or may not be running on the AMI to get and forward the clear-text http://169.254.169.254/1.0/user-data)
  2. Security of data when stored in the AMI user-data and encrypted (or decryptable) with easily obtainable key
    • low
    • easily-obtainable key (password) may include:
      • key hard-coded in a script inside an AMI (where the AMI can be obtained by an attacker)
      • key hard-coded in a script on the AMI itself, where the script is readable by any user who manages to log onto the AMI
      • any other easily obtainable information such as public keys, etc.
      • any private key (its public key may be readily obtainable)
    • given an easily-obtainable key (password), the same problems identified in point 1 apply, namely:
      • the decrypted data is accessible to any user who manages to log onto the AMI and has access to telnet, curl, wget, etc. (can access clear-text http://169.254.169.254/1.0/user-data)
      • you are vulnerable to proxy request attacks (e.g. attacker asks the Apache that may or may not be running on the AMI to get and forward the encrypted http://169.254.169.254/1.0/user-data, subsequently decrypted with the easily-obtainable key)
  3. Security of data when stored in the AMI user-data and encrypted with not easily obtainable key
    • average
    • the encrypted data is accessible to any user who manages to log onto the AMI and has access to telnet, curl, wget, etc. (can access encrypted http://169.254.169.254/1.0/user-data)
      • an attempt to decrypt the encrypted data can then be made using brute-force attacks
  4. Security of data when stored on the AMI, in a secured location (encrypting it adds no further value)
    • higher
    • the data is only accessible to one user, the user who requires the data in order to operate
      • e.g. file owned by user:user with mask 0600 or 0400
    • attacker must be able to impersonate the particular user in order to gain access to the data
      • additional security layers, such as denying the user direct log-on (forcing interactive impersonation to pass through root), improve security

So any method involving the AMI user-data is not the most secure, because gaining access to any user on the machine (weakest point) compromises the data.

This could be mitigated if the S3 credentials were only required for a limited period of time (i.e. during the deployment process only), and if AWS allowed you to overwrite or remove the contents of user-data when done with it (but this does not appear to be the case.) An alternative would be the creation of temporary S3 credentials for the duration of the deployment process, if possible: compromising these credentials from user-data after the deployment process is completed, once the credentials have been invalidated with AWS, would no longer pose a security threat.
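
If temporary credentials are an option, AWS's STS can mint them; a minimal sketch, assuming the modern aws CLI (which postdates this discussion):

$ aws sts get-session-token --duration-seconds 3600

This returns a temporary AccessKeyId, SecretAccessKey, and SessionToken that expire automatically, so a later compromise of the user-data is harmless.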

If the above is not applicable (e.g. S3 credentials needed by deployed nodes indefinitely) or not possible (e.g. cannot issue temporary S3 credentials for deployment only) then the best method remains to bite the bullet and scp the credentials to the various nodes, possibly in parallel, with the correct ownership and permissions.
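
A sketch of that parallel push (the node list, key, and destination are illustrative; master file permissions as above):

$ for h in $(cat nodes.txt); do scp -p -i deploy-key.pem aws-credentials "user@$h:.aws-credentials" & done; wait

Each scp runs in the background, and wait blocks until all transfers finish.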

Cheers, V.

vladr
I don't want to bundle the credentials in the instance since I want users of my AMI to use their own AWS credentials. Once an instance boots, it will connect to S3 buckets and SQS queues specified by the user who starts the instances. SCP works OK for a few machines but not 10 or 100. For loops? meh
AdamK
OK, so you are providing an AMI to an arbitrary number of users, where each user will have to provide their own AWS credentials in order to connect to (your? their?) S3 area?
vladr
Vlad, the 'parameterizing your AMI' article is almost how I am doing things now. Actually, http://blogs.sun.com/ec2/entry/using_parameterized_launches_to_customize is even closer. The problem is that the user-data in my case is sensitive (AWS creds) and the password must be contained in the AMI.
AdamK
Your architecture still makes very little sense to me. Please edit the original question to clarify who 'I' is in "I send the credentials", what the password 'secrets' is doing there and why it is so indispensable, etc.
vladr
My guess is that you wrote a script to run on a client machine to deploy N instances of your AMI by calling `ec2-run-instances -n N`, correct?
vladr
If this is the case, then you either loop over each instance afterwards to set it up, or you provide all the necessary parameters to the AMI via -f and let the instances set themselves up autonomously based on those parameters. It's up to you which one you choose; there's no third option, really.
vladr
Another question is, these AWS credentials, are they only needed by the instance during AMI setup? If not, where and how will they continue to be stored on the instance?
vladr
Another option worth investigating is setting up a ram disk, sticking the keys onto it, and having the instances grab them from there.
vladr
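
(A minimal sketch of the RAM-disk idea, assuming root on the instance; the mount point is illustrative:)

$ mount -t tmpfs -o size=1m,mode=0700 tmpfs /mnt/keys
$ cp aws-credentials /mnt/keys/    # keys live only in RAM and vanish on unmount or reboot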
Vlad, thank you for all the helpful comments. I updated my question. The gist of it is, "is my method secure?" and "is there a better approach?". I hadn't thought of using a RAM disk. The AWS credentials need to be stored only as long as instances are running and querying for jobs. Thanks again
AdamK
+1  A: 

I wrote an article examining various methods of passing secrets to an EC2 instance securely and the pros & cons of each.

http://www.shlomoswidler.com/2009/08/how-to-keep-your-aws-credentials-on-ec2.html

Shlomo Swidler