I'm developing a Java app that exports data to CSV files intended to be opened in Excel by end users. We just noticed that the export function uses Java's platform default encoding. This causes umlaut characters to be lost and a unit test to fail on the build server (which is configured with US-ASCII as its platform default encoding precisely to catch such problems).
The question is: which would be the best encoding to use? How does Excel determine what encoding to use? Does it use something platform-specific that presumably matches Java's platform default?
I'm currently leaning towards hardcoding Cp1252: that should cover the target machines (the deployment environment is actually specified) and would fix the test problem. From googling around, Excel does not seem to handle UTF-8 well, so that's out, and sticking to the platform default encoding would require some sort of workaround for the tests.
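
For reference, this is roughly what hardcoding the encoding would look like. It is only a minimal sketch, not our actual export code: the class name `CsvExportSketch` and the methods `writeCsv` and `exportToBytes` are made up for illustration, the comma delimiter is an assumption, and there is no quoting or escaping of field values. The one point it shows is that the charset is passed to `OutputStreamWriter` explicitly instead of being taken from the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

public class CsvExportSketch {

    // "windows-1252" is the canonical name of the charset also known as Cp1252.
    // Assumption: the JRE in use ships this charset (it is not one of the six
    // charsets every Java platform is required to support).
    static final Charset EXPORT_CHARSET = Charset.forName("windows-1252");

    // Hypothetical helper: writes rows as comma-separated lines with CRLF line
    // endings. No quoting or escaping is done, so this is not a full CSV writer.
    static void writeCsv(OutputStream out, String[][] rows) throws IOException {
        // The charset is given explicitly, so the result does not depend on
        // the platform default encoding of the machine running the export.
        Writer writer = new OutputStreamWriter(out, EXPORT_CHARSET);
        for (String[] row : rows) {
            writer.write(String.join(",", row));
            writer.write("\r\n");
        }
        writer.flush();
    }

    // Hypothetical convenience wrapper, handy for unit tests.
    static byte[] exportToBytes(String[][] rows) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeCsv(buffer, rows);
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Unicode escapes keep this source file itself ASCII-only.
        byte[] bytes = exportToBytes(new String[][] {{"Gr\u00f6\u00dfe", "1"}});
        System.out.println(bytes.length + " bytes written");
    }
}
```

With this, a unit test can compare the exported bytes against fixed expected values, and the result should be the same on the US-ASCII build server as on the target machines.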