Our web app needs to be made PCI compliant, i.e. it must not store any credit card numbers. The app is a frontend to a mainframe system which handles the CC numbers internally and - as we have just found out - occasionally still spits out a full CC number on one of its response screens. By default, the whole content of these responses are logged at debug level, and also the content parsed from these can be logged in lots of different places. So I can't hunt down the source of such data leaks. I must make sure that CC numbers are masked in our log files.
The regex part is not an issue, I will reuse the regex we already use in several other places. However I just can't find any good source on how to alter a part of a log message with Log4J. Filters seem to be much more limited, only able to decide whether to log a particular event or not, but can't alter the content of the message. I also found the ESAPI security wrapper API for Log4J which at first sight promises to do what I want. However, apparently I would need to replace all the loggers in the code with the ESAPI logger class - a pain in the butt. I would prefer a more transparent solution.
Any idea how to mask out credit card numbers from Log4J output?
Update: Based on @pgras's original idea, here is a working solution:
public class CardNumberFilteringLayout extends PatternLayout {
private static final String MASK = "$1++++++++++++";
private static final Pattern PATTERN = Pattern.compile("([0-9]{4})([0-9]{9,15})");
@Override
public String format(LoggingEvent event) {
if (event.getMessage() instanceof String) {
String message = event.getRenderedMessage();
Matcher matcher = PATTERN.matcher(message);
if (matcher.find()) {
String maskedMessage = matcher.replaceAll(MASK);
@SuppressWarnings({ "ThrowableResultOfMethodCallIgnored" })
Throwable throwable = event.getThrowableInformation() != null ?
event.getThrowableInformation().getThrowable() : null;
LoggingEvent maskedEvent = new LoggingEvent(event.fqnOfCategoryClass,
Logger.getLogger(event.getLoggerName()), event.timeStamp,
event.getLevel(), maskedMessage, throwable);
return super.format(maskedEvent);
}
}
return super.format(event);
}
}
Notes:
- I mask with
+
rather than*
, because I want to tell apart cases when the CID was masked by this logger, from cases when it was done by the backend server, or whoever else - I use a simplistic regex because I am not worried about false positives
The code is unit tested so I am fairly convinced it works properly. Of course, if you spot any possibility to improve it, please let me know :-)