If I run my C++ application with the following main() method everything is OK:

int main(int argc, char *argv[]) 
{
   cout << "There are " << argc << " arguments:" << endl;

   // Loop through each argument and print its number and value
   for (int i=0; i<argc; i++)
      cout << i << " " << argv[i] << endl;

   return 0;
}

I get what I expect and my arguments are printed out.

However, if I use _tmain:

int _tmain(int argc, char *argv[]) 
{
   cout << "There are " << argc << " arguments:" << endl;

   // Loop through each argument and print its number and value
   for (int i=0; i<argc; i++)
      cout << i << " " << argv[i] << endl;

   return 0;
}

It just displays the first character of each argument.

What is the difference causing this?

+4  A: 

The _T convention indicates that the program should use the character set configured for the application (Unicode, ASCII, MBCS, etc.). You can wrap your string literals in _T( ) to have them stored in the matching format.

 cout << _T( "There are " ) << argc << _T( " arguments:" ) << endl;
Paul Alexander
In fact, MS recommends this approach, afaik. Making your application unicode-aware, they call it... using the _t version of all the string manipulation functions, too.
Deep-B
@Deep-B : And on Windows, this **is** how you make your application unicode-ready (I prefer the term unicode-ready to -aware), if it was based on `char`s before. If your application directly uses `wchar_t` then your application **is** unicode.
paercebal
@Paul Alexander : By the way, if you try to compile with UNICODE, then your code won't compile, as you're outputting wchar_t to a char-based cout where it should have been wcout. See Michael J's answer for an example of defining a "tcout"...
paercebal
+35  A: 

_tmain does not exist in C++. main does.

_tmain is a Microsoft extension.

main is, according to the C++ standard, the program's entry point. It has one of these two signatures:

int main();
int main(int argc, char* argv[]);

Microsoft has added a wmain which replaces the second signature with this:

int wmain(int argc, wchar_t* argv[]);

And then, to make it easier to switch between Unicode (UTF-16) and their multibyte character set, they've defined _tmain which, if Unicode is enabled, is compiled as wmain, and otherwise as main.

As for the second part of your question, the first part of the puzzle is that your main function is wrong. With Unicode enabled, _tmain is compiled as wmain, and wmain should take wchar_t arguments, not char. Since the compiler doesn't enforce this for the main function, you get a program where an array of wchar_t strings is passed to the main function, which interprets them as char strings.

Now, in UTF-16, the character set used by Windows when Unicode is enabled, all the ASCII characters are represented as the pair of bytes \0 followed by the ASCII value.

And since the x86 CPU is little-endian, the order of these bytes is swapped, so that the ASCII value comes first, followed by a null byte.

And in a char string, how is the string usually terminated? Yep, by a null byte. So your program sees a bunch of strings, each one byte long.
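That effect can be reproduced in a small, portable sketch. It assumes a little-endian machine and uses `char16_t` as a stand-in for Windows' 16-bit `wchar_t`; `narrow_view` and `is_little_endian` are hypothetical helper names made up for this illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reads a UTF-16 string the way a char-based main would,
// i.e. byte by byte until the first zero byte.
std::string narrow_view(const char16_t* wide)
{
    assert(wide != nullptr);
    // Inspecting the bytes of an object through a char pointer is permitted.
    const char* bytes = reinterpret_cast<const char*>(wide);
    return std::string(bytes);
}

// Hypothetical helper: true when the low byte of a 16-bit value is stored first.
bool is_little_endian()
{
    const char16_t probe = 1;
    return *reinterpret_cast<const char*>(&probe) == 1;
}
```

On a little-endian machine, narrow_view(u"hello") gives "h": the 'h' is stored as the bytes 0x68 0x00, and that 0x00 ends the char string. On a big-endian machine the zero byte would come first and the result would be empty.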

In general, you have three options when doing Windows programming:

  • Explicitly use Unicode. Define wmain, and for every Windows API function which takes char-related arguments, call the -W version of the function (instead of CreateWindow, call CreateWindowW). Use wchar_t instead of char, and so on.
  • Explicitly disable Unicode. Define main, call CreateWindowA, and use char for strings.
  • Allow both. Define _tmain and call CreateWindow, which resolve to main/wmain and CreateWindowA/CreateWindowW respectively, and use TCHAR instead of char/wchar_t.

The same applies to the string types defined by windows.h: LPCTSTR resolves to either LPCSTR or LPCWSTR, and for every other type that includes char or wchar_t, a -T- version always exists which can be used instead.

Note that all of this is Microsoft-specific. TCHAR is not a standard C++ type; it is a typedef provided by the Windows headers (windows.h and tchar.h). wmain and _tmain are likewise defined by Microsoft only.
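The switching mechanism itself is ordinary conditional compilation. Here is a rough sketch of the idea, not the actual contents of the Microsoft headers: `MY_UNICODE`, `my_tchar`, `MY_T`, `my_tstring`, `tlength` and `greeting` are hypothetical names standing in for the Unicode build switch, TCHAR, _T and friends, so that it builds on any compiler:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for the real names, so the sketch builds anywhere:
// MY_UNICODE plays the role of the Unicode build switch, my_tchar of TCHAR,
// and MY_T of _T.
#if defined(MY_UNICODE)
    typedef wchar_t my_tchar;
    #define MY_T(s) L##s        // MY_T("x") becomes L"x"
#else
    typedef char my_tchar;
    #define MY_T(s) s           // MY_T("x") stays "x"
#endif

// One string type that follows whichever character type was selected.
typedef std::basic_string<my_tchar> my_tstring;

// Length of a zero-terminated string of the selected character type.
std::size_t tlength(const my_tchar* s)
{
    assert(s != nullptr);
    return std::char_traits<my_tchar>::length(s);
}

// Written once, compiled as either narrow or wide code.
my_tstring greeting()
{
    return MY_T("There are ");
}
```

Compiling the same file with and without -DMY_UNICODE (or /DMY_UNICODE) switches every my_tchar and MY_T literal between narrow and wide in one go, which is the effect the -T- names are meant to give you.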

jalf
I wonder whether they provide a tcout too, so that one could just do tcout << argv[n]; and have it resolve to cout in ANSI mode and wcout in Unicode mode? I suspect that could be useful for him in this situation. And +1 of course, nice answer :)
Johannes Schaub - litb
What disadvantage would disabling UNICODE provide?
joshcomley
@Johannes Schaub - litb : AFAIK, they don't provide a tcout (even though they provide cout and wcout). On Visual C++ 2003, I had to define one, as well as define and/or typedef all the other STL-related symbols I wanted to use with TCHAR. I don't know about Visual C++ 2008 or 2010, though.
paercebal
+6  A: 

_tmain is a macro that gets redefined depending on whether or not you compile with Unicode. It is a Microsoft extension and won't work on any other compilers.

The correct declaration is

 int _tmain(int argc, TCHAR *argv[])

If the macro UNICODE is defined, that expands to

int wmain(int argc, wchar_t *argv[])

Otherwise it expands to

int main(int argc, char *argv[])

Your definition goes for a bit of each, and (if you have UNICODE defined) will expand to

 int wmain(int argc, char *argv[])

which is just plain wrong.

std::cout works with narrow (char) strings. You need std::wcout if you are using wide characters.

Try something like this:

#include <iostream>
#include <tchar.h>

// tchar.h keys _tmain, TCHAR and _T off _UNICODE (Visual Studio sets it together with UNICODE)
#if defined(_UNICODE)
    #define _tcout std::wcout
#else
    #define _tcout std::cout
#endif

int _tmain(int argc, TCHAR *argv[]) 
{
   _tcout << _T("There are ") << argc << _T(" arguments:") << std::endl;

   // Loop through each argument and print its number and value
   for (int i=0; i<argc; i++)
      _tcout << i << _T(" ") << argv[i] << std::endl;

   return 0;
}

Or you could just decide in advance whether to use wide or narrow characters. :-)
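If you go the wide-only route, the printing code needs no macros at all. A minimal sketch, assuming wide strings throughout: `format_args` is a hypothetical helper (not part of any Microsoft header) that builds the same output as the loop above into a std::wostringstream, so the result can be checked instead of written straight to the console:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: formats the argument list the way the loop above
// prints it, but with wide characters only.
std::wstring format_args(int argc, const wchar_t* const argv[])
{
    assert(argc >= 0);
    std::wostringstream out;
    out << L"There are " << argc << L" arguments:" << L'\n';

    // Each argument on its own line, prefixed with its index.
    for (int i = 0; i < argc; ++i)
        out << i << L" " << argv[i] << L'\n';

    return out.str();
}
```

Inside a real wmain you would stream the same expressions to std::wcout, or simply write std::wcout << format_args(argc, argv);.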

Michael J
+1 `_tmain` uses `TCHAR` not `char`.
sixlettervariables
A: 
for (int i=0; i<argc; i++)
  _tcout << i << _T(" ") << argv[i] << std::endl;

Why don't we write:

for (int i=0; i<argc; i++)
  _tcout << i << " " << argv[i] << std::endl;

I don't understand two parts of the line above: _T(" ") and std::endl.

In C++, I usually just write " " and endl.

But if I write that here, the compiler reports errors. What is wrong? Thanks

Lam