They're called "Res_SChinese.java" and "Res_TChinese.java"
I assume that these must be Java source files, though I am surprised that they are in different encodings.
Having source files in multiple encodings is highly undesirable. If you don't know what character set a source file has, you can use the ICU project libraries to help you guess:
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import com.ibm.icu.text.CharsetDetector;
import com.ibm.icu.text.CharsetMatch;

public static void main(String[] args) throws IOException {
    InputStream file = new FileInputStream(args[0]);
    try {
        // CharsetDetector requires a stream that supports mark/reset
        file = new BufferedInputStream(file);
        CharsetDetector detector = new CharsetDetector();
        detector.setText(file);
        String tableTemplate = "%10s %10s %8s%n";
        System.out.format(tableTemplate, "CONFIDENCE",
            "CHARSET", "LANGUAGE");
        // Print every candidate encoding, best match first
        for (CharsetMatch match : detector.detectAll()) {
            System.out.format(tableTemplate, match.getConfidence(),
                match.getName(), match.getLanguage());
        }
    } finally {
        file.close();
    }
}
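Assuming you wrap that main method in a class called DetectEncoding (a name I've made up here) and have the ICU4J jar on the classpath, you would run it like this:

java -cp icu4j.jar:. DetectEncoding Res_SChinese.java

(use ; as the classpath separator on Windows). It prints one row per candidate encoding, with the highest-confidence guess first.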
Note that the number of Chinese character encodings it can detect is limited (ISO-2022-CN, GB18030 and Big5), but at least it might help you find out if everything is just encoded in a Unicode transformation format or something.
Eclipse (JBuilder is Eclipse-based now, isn't it?) can set encodings for individual files: right-click the file, select Properties, and you'll find the encoding under the Resource properties page. But this is difficult to manage by hand and won't apply to any external tools you use (like an Ant build script).
It is also possible to compile files in a different encoding by passing the -encoding flag to javac. For example:
javac -encoding GB18030 Foo.java
But javac accepts only one -encoding value per invocation, so if these classes have interdependencies, that is going to get painful fast: each encoding needs its own compilation pass, ordered so that dependencies are already compiled.
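For instance, suppose (hypothetically) Res_TChinese.java is Big5, Res_SChinese.java is GB18030, and the latter references the former. You would need two passes:

javac -encoding Big5 Res_TChinese.java
javac -encoding GB18030 -cp . Res_SChinese.java

The second pass resolves Res_TChinese against the compiled .class file rather than its (differently encoded) source.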
Faced with multiple encodings, I would translate all the files to a single encoding. There are a couple of options here.
Use a Latin-1 subset
Java supports Unicode escape sequences in source files. So, the Unicode character U+6874 桴 can be written as the literal \u6874. The JDK tool native2ascii can be used to transform your Java files to ASCII, replacing every non-ASCII character with its escape sequence:
native2ascii -encoding GB2312 FooIn.java FooOut.java
The resultant files will probably compile anywhere without problem, but might be a nightmare for anyone reading/editing the files.
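As a quick sanity check that the two spellings really are equivalent, the compiler processes Unicode escapes before anything else, so both literals below denote the same string:

public static void main(String[] args) {
    String escaped = "\u6874"; // Unicode escape
    String raw = "桴";          // same character, typed directly
    System.out.println(escaped.equals(raw)); // prints: true
}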
Use GB18030
GB18030 is a huge character set, so if this is your native encoding, it might be an idea to use that (otherwise, if I were going this route, I'd use UTF-8).
You can use code like this to perform the transformation:
import java.io.*;
import java.nio.charset.Charset;

public static void main(String[] args) throws IOException {
    changeEncoding("in_cn.txt", Charset.forName("GBK"),
        "out_cn.txt", Charset.forName("GB18030"));
}

private static void changeEncoding(String inFile,
        Charset inCharset, String outFile, Charset outCharset)
        throws IOException {
    // try-with-resources closes both streams, even if the copy fails
    try (Reader reader = new InputStreamReader(
             new FileInputStream(inFile), inCharset);
         Writer writer = new OutputStreamWriter(
             new FileOutputStream(outFile), outCharset)) {
        copy(reader, writer);
    }
}

private static void copy(Reader reader, Writer writer)
        throws IOException {
    // Transcode in 1 KiB chunks until end of stream
    char[] cbuf = new char[1024];
    int r;
    while ((r = reader.read(cbuf)) >= 0) {
        writer.write(cbuf, 0, r);
    }
}
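If you have more than a handful of files, you could drive the same idea over a whole source tree. A minimal sketch, assuming Java 8+ and that every file under the tree shares one known source encoding; convertTree is a hypothetical helper, and it rewrites files in place, so run it on a copy:

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

private static void convertTree(Path root, Charset from, Charset to)
        throws IOException {
    try (Stream<Path> paths = Files.walk(root)) {
        paths.filter(p -> p.toString().endsWith(".java"))
             .forEach(p -> {
                 try {
                     // Decode with the old charset, re-encode with the new one
                     String text = new String(Files.readAllBytes(p), from);
                     Files.write(p, text.getBytes(to));
                 } catch (IOException e) {
                     throw new UncheckedIOException(e);
                 }
             });
    }
}

For example: convertTree(Paths.get("src"), Charset.forName("Big5"), Charset.forName("GB18030")).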
If I open them in Notepad, I can view them both properly, even with just the locale set to Chinese (PRC)
Notepad uses a heuristic character encoding detection mechanism. It doesn't always work.
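To see why any such detection can only ever be a guess, note that the same bytes can be perfectly valid in more than one Chinese encoding. A minimal sketch (the byte pair below is legal in both GBK and Big5, but means different things):

import java.nio.charset.Charset;

public static void main(String[] args) {
    byte[] bytes = { (byte) 0xB0, (byte) 0xA1 };
    System.out.println(new String(bytes, Charset.forName("GBK")));  // prints 啊
    System.out.println(new String(bytes, Charset.forName("Big5"))); // a different, equally valid character
}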