The following code somehow fails to notice any files with non-ASCII characters in their names (Cyrillic characters, specifically):

for (int path = 1; path < argc; path++) {
  QFileInfo fi(argv[path]);
  if (fi.isDir()) {
    QDir dir(argv[path], "", QDir::LocaleAware, QDir::AllEntries);
    qDebug() << dir.entryList();
    QDirIterator it(QString(argv[path]), QDirIterator::Subdirectories);
    while (it.hasNext()) {
      it.next();
      qDebug() << it.fileInfo().absoluteFilePath();
      /* Processing; irrelevant in the context of the question */
    }
  }
}

What exactly am I doing wrong here? How should I handle QDir and QDirIterator to make them aware of Cyrillic filenames?

The system locale is en_US.UTF-8.

Update: On Windows, everything works correctly.

+1  A: 

Which part is failing: reading the initial directory specified by argv[path], or the iterator? If it's the former, convert the byte string to a QString with QFile::decodeName before using it as a file name. The default char* => QString conversion uses Latin-1, which is not what you want for file names.

Lukáš Lalinský
Both fail. Actually, I only wrote the part with QDir for testing purposes when I saw that QDirIterator (which I actually need in my program) doesn't work.
David Parunakian
A: 

Don't pass argv[path] directly when constructing the QStrings. The implicit conversion treats the bytes as Latin-1, which cannot represent Cyrillic characters. Try using

const QString dirName = QString::fromLocal8Bit( argv[path] );

at the top of your loop and then use dirName everywhere instead of argv[path].
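
To illustrate why the Latin-1 default matters, here is a minimal sketch in plain C++ (no Qt involved). It assumes the file name arrives as UTF-8 bytes, as it would under an en_US.UTF-8 locale. The helper names decodeLatin1 and decodeUtf8 are hypothetical, and the UTF-8 decoder assumes well-formed input and does no validation.

```cpp
#include <cstddef>
#include <string>

// Latin-1 decoding: every input byte becomes exactly one code point.
std::u32string decodeLatin1(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char32_t>(b));
    }
    return out;
}

// Minimal UTF-8 decoding (hypothetical helper; assumes well-formed input).
std::u32string decodeUtf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        int extra;  // number of continuation bytes that follow the lead byte
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else                            { cp = lead & 0x07; extra = 3; }
        ++i;
        // Each continuation byte contributes its low six bits.
        for (; extra > 0 && i < bytes.size(); --extra, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

For the Cyrillic letter Ж (U+0416), whose UTF-8 encoding is the two bytes 0xD0 0x96, decodeUtf8 yields the single code point U+0416, while decodeLatin1 yields two unrelated code points, U+00D0 and U+0096. A path mangled this way no longer matches anything on disk.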

Frerich Raabe
Predictably, it didn't help, because although the directory name is now handled correctly, the names of its entries still aren't, and I haven't yet found a method to fix that.
David Parunakian
@David: What does 'the names of its entries still aren't handled correctly' mean? In your example, you don't handle the directory entries at all.
Frerich Raabe
I do. Constructing a QDirIterator and iterating over it with next() returns names of (supposedly) all files contained in it, including names of files contained in its subdirectories. Well, for some reason it only iterates over files whose names can be represented in latin1, and that is my main problem.
David Parunakian
+1  A: 

Get the command-line parameters from QApplication itself:

QApplication app(argc, argv);

QStringList args = app.arguments();

for(...)

Qt will handle the encoding properly. That only fixes problems with Unicode on the command line, though; I'm not sure whether that is your main problem.

EDIT: fromLocal8Bit() probably doesn't work because the input wasn't in the local encoding, but in UTF-8. So fromUtf8() would work on Linux and OS X (but not on Windows). On Unix-like systems the encoding depends on environment variables (LANG, LC_ALL, and the like). I guess Qt takes all of that into account and converts it properly. You can look at the constructor code for QApplication if you want to know exactly what it does.
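
One plausible explanation, offered as an assumption about Qt's internals rather than something confirmed in this thread: a C or C++ program always starts in the "C" locale, and the locale from the environment (en_US.UTF-8 here) only takes effect once setlocale(LC_ALL, "") has been called. If the application constructor makes that call on Unix, locale-dependent conversions made before it would not see UTF-8. The plain C++ sketch below shows only the standard-library behaviour; the helper names currentLocaleName and adoptEnvironmentLocale are hypothetical.

```cpp
#include <clocale>
#include <string>

// Returns the name of the C locale that is currently active.
// Passing nullptr queries the locale without changing it.
std::string currentLocaleName() {
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "";
}

// Switches to the locale described by the environment (LANG, LC_ALL, ...).
// Returns false if the environment names a locale the system cannot provide.
bool adoptEnvironmentLocale() {
    return std::setlocale(LC_ALL, "") != nullptr;
}
```

Calling currentLocaleName() at the very start of a program returns "C" regardless of the environment. What it returns after adoptEnvironmentLocale() depends on the machine's settings.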

Eugene
Great, it works. :-) However, I do not understand how or why this method works, while QString::fromLocal8Bit(argv[i]) doesn't.
David Parunakian