ansaurus

Question

Hiding sensitive/confidential information in log files

Answer 1

+3 A:

I would personally regard the log files themselves as sensitive information and make sure to restrict access to them.

Fredrik Mörk 2009-09-23 15:56:51

True! I'm thinking of cases where you're a software vendor and you ask your clients to send you the log files from their system in order to diagnose a crash, etc. Would the onus be on the client to first scrub sensitive information from their log files? Wouldn't it be nice if your system gave clients that for free?

Ates Goral 2009-09-23 16:01:29

"Restricting access" isn't specific enough to provide sufficient protection for credit card information. The logs need to be encrypted, and access to the decryption keys needs to be spelled out in the security policy.

erickson 2009-09-23 16:04:43

Answer 2

+1 A:

In your example, you should be encrypting the credit card number or, better yet, not even storing it in the first place.

If, say, you were logging something else, like a login, you might want to explicitly replace a password with *.

However, this manages to neatly avoid answering the question you've posed in the first place. In general, when dealing with sensitive information, it should be encrypted on its way to any form of permanent storage, be it a database file or a log file. Assume that a Bad Guy is going to be able to get their hands on either, and protect the information accordingly.

Bob Kaufman 2009-09-23 15:56:54

I think encryption can be the answer: as soon as sensitive info enters your system, it gets encrypted and lives as encrypted. So, if you're doing low level logging (semantics-agnostic) or even getting a memory dump, the information will be reasonably secure. I think I like the idea of encrypting the info instead of the entire log file as suggested in other answers.

Ates Goral 2009-09-25 05:52:33

Answer 3

+1 A:

If you know what you're trying to filter, you can run your log output through a regex cleaning expression before you log it.
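As a rough illustration of regex-based cleaning, here is a minimal Python sketch that scrubs messages before a handler writes them. The two patterns, the `scrub` function, and the `ScrubbingFilter` class are hypothetical names chosen for this example; the patterns are deliberately simple and would need tuning for real data.

```python
import logging
import re

# Hypothetical patterns: 13 to 16 digits with optional space/hyphen separators,
# and anything following "password=" up to the next whitespace.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")
PASSWORD_PATTERN = re.compile(r"(password=)\S+", re.IGNORECASE)


def scrub(message: str) -> str:
    """Replace anything that looks sensitive with a placeholder."""
    message = CARD_PATTERN.sub("[REDACTED CARD]", message)
    message = PASSWORD_PATTERN.sub(r"\1***", message)
    return message


class ScrubbingFilter(logging.Filter):
    """Logging filter that scrubs the fully formatted message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so that values passed as arguments are scrubbed too.
        record.msg = scrub(record.getMessage())
        record.args = ()
        return True  # never drop the record, only rewrite it
```

Attaching it with `handler.addFilter(ScrubbingFilter())` would apply the scrubbing to every record that handler emits. Anything the patterns fail to match still reaches the log, so this only helps for formats you can anticipate.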

Esteban Araya 2009-09-23 15:57:24

Yes, I thought of that. In fact, this may be a viable solution, since there will always be a discrete number of different types of "sensitive" strings that you can identify with regexes.

Ates Goral 2009-09-23 15:59:12

Answer 4

+1 A:

Logging a credit card number could be a PCI violation. And if you aren't PCI compliant, you will be charged higher card-processing fees. Either don't log sensitive information, or encrypt your entire log files.

Your idea of "tagging" sensitive information is intriguing. You could have a special data type for Sensitive information, that wrapped the real, underlying data type. Whenever this object is rendered as a character string, it just returns "***" or whatever.

However, this could require widespread coding changes, and it demands a level of conscious vigilance similar to that needed to avoid logging sensitive information in the first place.
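A wrapper type along these lines might look like the following Python sketch. The class name `Sensitive` and its `reveal` method are assumptions made for illustration, not an existing library API.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Sensitive(Generic[T]):
    """Hypothetical wrapper that hides its value whenever it is rendered as text."""

    def __init__(self, value: T) -> None:
        self._value = value

    def reveal(self) -> T:
        # The only way to get the real value back; easy to search for in code review.
        return self._value

    def __str__(self) -> str:
        return "***"

    def __repr__(self) -> str:
        return "Sensitive(***)"


card = Sensitive("4111111111111111")
line = f"charging card {card}"  # the real number never reaches the string
```

Code that genuinely needs the number has to call `card.reveal()` explicitly, which is where the vigilance mentioned above still applies.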

erickson 2009-09-23 15:59:23

Answer 5

+1 A:

Regarding SQL statements specifically, if your language supports it, you should be using parameters instead of putting values in the statement itself. In other words:

select * from customers where credit_card = ?

Then set the parameter to the credit card number.

Of course, if you plan to log SQL statements with parameters filled in, you'd need some other way to filter out sensitive data.
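For example, with Python's built-in `sqlite3` module the statement text and the value travel separately, so logging the statement alone never exposes the number. This is only a sketch: the `customers` table layout, the `find_customers` function, and the logger name are assumptions made for the example.

```python
import logging
import sqlite3

logger = logging.getLogger("sql")

QUERY = "select * from customers where credit_card = ?"


def find_customers(conn: sqlite3.Connection, card_number: str) -> list:
    # Only the parameterized text is logged; the value is passed separately.
    logger.debug("executing: %s", QUERY)
    return conn.execute(QUERY, (card_number,)).fetchall()


# Throwaway in-memory database, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("create table customers (name text, credit_card text)")
conn.execute("insert into customers values (?, ?)", ("Alice", "4111111111111111"))
rows = find_customers(conn, "4111111111111111")
```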

Adam Crume 2009-09-23 16:03:28

True. But this only covers SQL statements.

Ates Goral 2009-09-25 05:54:23

That's why I prefaced it with "Regarding SQL statements specifically". It wasn't intended to be 100% general.

Adam Crume 2009-09-25 18:18:37

Noted. I was actually looking for more of a silver bullet solution instead of a case-by-case analysis, but thanks for this answer!

Ates Goral 2009-10-14 21:54:25

Answer 6

+2 A:

My current practice for the case in question is to log a hash of such sensitive information. This enables us to identify log records that belong to a specific claim (for example a specific credit-card number) but does not give anybody the power to just grab the logs and use the sensitive information for their evil purposes.
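A minimal Python sketch of the idea follows. Note one substitution: it uses a keyed HMAC rather than a bare hash, because card numbers come from a small enough space that an unkeyed hash can be brute-forced. The key value and the `log_token` name are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would come from a secret store, not source code.
LOG_HASH_KEY = b"replace-with-a-secret-key"


def log_token(value: str, key: bytes = LOG_HASH_KEY) -> str:
    """Return a short, stable token that stands in for a sensitive value in logs."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated for readability; still enough to correlate records


token = log_token("4111111111111111")
message = f"payment declined for card #{token}"
```

The same input always yields the same token, so records for one card can still be found by computing its token and searching the logs for it.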

Of course, doing this consistently involves good coding practices. I usually choose to log all objects using their toString overrides (in Java or .NET), which serialize a hash of the value for any field marked with a Sensitive attribute.
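The answer describes Java or .NET; a rough Python analogue could use dataclass field metadata in place of a Sensitive attribute and `__str__` in place of toString. Everything named here (`sensitive`, `LogSafe`, `Payment`, the key) is hypothetical, and the keyed HMAC is my choice of hash, not necessarily what the answer's author uses.

```python
import hashlib
import hmac
from dataclasses import dataclass, field, fields

_KEY = b"replace-with-a-secret-key"  # hypothetical; load from a secret store


def _fingerprint(value) -> str:
    data = str(value).encode("utf-8")
    return hmac.new(_KEY, data, hashlib.sha256).hexdigest()[:12]


def sensitive(**kwargs):
    """Declare a dataclass field whose value must never be rendered in clear text."""
    return field(metadata={"sensitive": True}, **kwargs)


class LogSafe:
    """Mixin whose string form shows a fingerprint for fields marked sensitive."""

    def __str__(self) -> str:
        parts = []
        for f in fields(self):
            value = getattr(self, f.name)
            if f.metadata.get("sensitive"):
                value = "#" + _fingerprint(value)
            parts.append(f"{f.name}={value}")
        return f"{type(self).__name__}({', '.join(parts)})"

    __repr__ = __str__  # so %r and debuggers do not leak the value either


@dataclass(repr=False)  # keep the mixin's __repr__ instead of the generated one
class Payment(LogSafe):
    customer: str
    amount: int
    card_number: str = sensitive()


payment = Payment("Alice", 100, "4111111111111111")
rendered = str(payment)
```

Logging code can then pass `payment` straight to the logger without knowing which of its fields are sensitive.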

Of course, SQL strings are more problematic, but we rely on our ORM for data persistence and log the state of the system at various stages rather than logging SQL queries, so it becomes a non-issue.

paracycle 2010-01-02 14:50:32

I like the idea of overriding toString; that way, the logging code doesn't have to care about what's being logged. Although it doesn't address the issue with memory dumps, I'll accept this as the best answer since it directly answers the original question. Thanks!

Ates Goral 2010-01-05 19:07:56
