I decided to write a program that lists all files and directories, but I have a problem when dealing with non-English filenames.

The problem is that my program cannot guarantee that the directory and file names are in English. If a filename uses Japanese or Chinese characters, they are displayed as characters like '?'.

J2SE provides a variety of java.io.File list() methods: http://java.sun.com/j2se/1.4.2/docs/api/java/io/File.html

But they do not seem to handle non-English filenames.

Does anyone have the same problem? In what direction should I look for a solution?

I searched Google using keywords like "java list non-english filename" and "java.io.file list non-english filename", but unfortunately I couldn't find a solution.

I hope people can share some thoughts, whether search keywords for Google or programming directions.

Thanks~

A: 

Strings in Java are Unicode (UTF-16, according to the docs), so just pass the file name around as an ordinary String. Of course, the underlying OS has to support this.

Xolve
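To illustrate the answer above, here is a minimal sketch, assuming the JVM and file system hand the names over intact: File.list() returns ordinary Java Strings, so nothing special is needed to fetch non-English names. The class NameLister and its method listNames are hypothetical names used only for this example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the original answer.
class NameLister {

  // Returns the sorted names of the entries in dir, or an empty list if
  // dir is not a readable directory (File.list() returns null then).
  static List<String> listNames(File dir) {
    String[] names = dir.list();
    if (names == null) {
      return new ArrayList<String>();
    }
    Arrays.sort(names);
    return new ArrayList<String>(Arrays.asList(names));
  }

  public static void main(String[] args) {
    File dir = new File(args.length > 0 ? args[0] : ".");
    for (String name : listNames(dir)) {
      // The length is printed as well because it stays readable even
      // when the console cannot render the name itself.
      System.out.println(name.length() + " chars: " + name);
    }
  }
}
```

If the names come back with the expected lengths but print as '?', the strings are probably fine and the loss is happening at display time.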
A: 

How are you displaying the filename? Chances are the problem is in the display rather than fetching the string.

I suggest you print out the Unicode value of each character (use charAt() to get each character, then convert it to an int) and compare them to the Unicode code charts.

Jon Skeet
I used charAt(), then Character.getNumericValue(c) to display the int value. It displays -1. As the API says, "If the character does not have a numeric value, then -1 is returned".
thinksloth
By "convert it to an int", Jon means just assign it to int: char x = 'x'; int intValue = x;
McDowell
+1 to McDowell's comment.
Jon Skeet
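The suggestion in this answer and its comments can be sketched as follows. The class CodePoints and the method describe are hypothetical names. The sample string is an arbitrary one containing a Cyrillic and two CJK characters. The output is plain ASCII, so it should survive a console that cannot show the characters themselves.

```java
// Hypothetical helper for illustration; not part of the original answer.
class CodePoints {

  // Formats each UTF-16 code unit of s as U+XXXX, separated by spaces.
  static String describe(String s) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      int value = c; // plain widening assignment, not Character.getNumericValue(c)
      if (i > 0) {
        sb.append(' ');
      }
      sb.append(String.format("U+%04X", value));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // prints: U+044F U+4E10 U+4E20 U+002E U+0074 U+0078 U+0074
    System.out.println(describe("\u044F\u4E10\u4E20.txt"));
  }
}
```

Values that match the Unicode code charts mean the filename was read correctly. A value of U+003F where a CJK character was expected means the '?' is already in the string.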
+2  A: 

The problems with emitting "international" characters to the Windows command prompt from Java are threefold:

  1. The default raster font doesn't support it
  2. The default 8-bit code page dates back to the DOS days and isn't the same as the default Windows encoding on the system
  3. Java (System.out) encodes output in the default operating system encoding, which on Windows is going to be an inherently lossy process
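The lossy step in point 3 can be reproduced without a console. The sketch below uses ISO-8859-1 as a stand-in for an 8-bit default encoding, which is an assumption: the actual default on a given Windows system will differ. It encodes a filename and decodes it again. LossyEncoding and roundTrip are hypothetical names.

```java
import java.nio.charset.Charset;

// Hypothetical demonstration class; ISO-8859-1 is only a stand-in for
// whatever 8-bit encoding the operating system uses by default.
class LossyEncoding {

  // Encodes s in the named charset and decodes the bytes again, roughly
  // what happens to text that passes through an 8-bit output encoding.
  static String roundTrip(String s, String charsetName) {
    Charset cs = Charset.forName(charsetName);
    return new String(s.getBytes(cs), cs);
  }

  public static void main(String[] args) {
    String filename = "\u044F\u4E10\u4E20.txt";
    System.out.println("default charset: " + Charset.defaultCharset());
    // Unmappable characters are replaced during encoding: ???.txt
    System.out.println(roundTrip(filename, "ISO-8859-1"));
    // UTF-8 can represent every character, so nothing is lost: true
    System.out.println(roundTrip(filename, "UTF-8").equals(filename));
  }
}
```

The String itself is never damaged here. The '?' characters appear only once the text is forced through an encoding that has no bytes for them.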

To get Java to emit the characters, either:

  1. Install a MUI and switch to the settings that allow the characters you want (you might still need to use chcp to switch encodings)
  2. Switch the console to a Unicode TrueType font that includes the characters and use native methods (WriteConsoleW) to emit the text.
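For the chcp route mentioned in option 1, one commonly used approach is to wrap System.out in a PrintStream with an explicit encoding instead of relying on the default. This is only a sketch under stated assumptions: the console has been switched to the UTF-8 code page with chcp 65001 and uses a TrueType font that contains the characters. Whether the console then renders them correctly varies between Windows versions and has not been verified here. Utf8Out and wrap are hypothetical names.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper; assumes the console was switched with "chcp 65001".
class Utf8Out {

  // Wraps out in an auto-flushing PrintStream that encodes text as UTF-8
  // rather than in the platform default encoding.
  static PrintStream wrap(OutputStream out) {
    try {
      return new PrintStream(out, true, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // Every Java platform is required to support UTF-8.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    PrintStream out = wrap(System.out);
    out.println("\u044F\u4E10\u4E20.txt");
  }
}
```

This only controls the bytes Java writes. It does not replace the WriteConsoleW route in option 2, which bypasses code pages altogether.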

You'll probably have better luck displaying the characters under Swing. You can use an app like this to test the fonts available to Swing to see if they render your characters:

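import java.awt.Container;
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.awt.GridLayout;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.util.Vector;

import javax.swing.ComboBoxModel;
import javax.swing.DefaultComboBoxModel;
import javax.swing.JComboBox;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

// Shows a sample filename in a label and lets you pick any installed font
// from a combo box to see whether that font can render the characters.
public class FontTest {

  // a Cyrillic and two CJK characters
  private final String filename = "\u044F\u4E10\u4E20.txt";

  // Builds a combo box model holding every installed font at 12pt.
  private ComboBoxModel createModel() {
    GraphicsEnvironment genv = GraphicsEnvironment
        .getLocalGraphicsEnvironment();
    Vector<Font> fonts = new Vector<Font>();
    for (Font font : genv.getAllFonts()) {
      Font newFont = new Font(font.getFontName(), font
          .getStyle(), 12);
      fonts.add(newFont);
    }
    DefaultComboBoxModel model = new DefaultComboBoxModel(
        fonts);
    return model;
  }

  private JFrame createGui() {
    final JLabel label = new JLabel();
    label.setText(filename);

    final JComboBox combo = new JComboBox();
    combo.setEditable(false);
    combo.setModel(createModel());
    // Re-render the label in whichever font is selected.
    combo.addActionListener(new ActionListener() {
      @Override
      public void actionPerformed(ActionEvent e) {
        Font font = (Font) combo.getSelectedItem();
        label.setFont(font);
      }
    });

    // Start with the first font in the list.
    label.setFont((Font) combo.getItemAt(0));

    JFrame frame = new JFrame();
    Container contentPane = frame.getContentPane();
    contentPane.setLayout(new GridLayout(0, 1));
    contentPane.add(label);
    contentPane.add(combo);
    frame.pack();
    frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

    return frame;
  }

  public static void main(String[] args) {
    // Swing components should be created on the event dispatch thread.
    SwingUtilities.invokeLater(new Runnable() {
      @Override
      public void run() {
        JFrame frame = new FontTest().createGui();
        frame.setVisible(true);
      }
    });
  }

}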

Java 6 under XP displays all the characters perfectly using the default JLabel font (Dialog - which is a logical name mapping to something else, so you won't see it in charmap).

McDowell