views:

4477

answers:

8

I'm new to Windows programming and after reading the Petzold book I wonder:

is it still good practice to use the TCHAR type and the _T() function to declare strings or if I should just use the wchar_t and L"" strings in new code?

I will target only Windows 2000 and up and my code will be i18n from the start up.

+2  A: 

Yes, absolutely; at least for the _T macro. I'm not so sure about the wide-character stuff, though.

The reason being is to better support WinCE or other non-standard Windows platforms. If you're 100% certain that your code will remain on NT, then you can probably just use regular C-string declarations. However, it's best to tend towards the more flexible approach, as it's much easier to #define that macro away on a non-windows platform in comparison to going through thousands of lines of code and adding it everywhere in case you need to port some library to windows mobile.

Nik Reiman
WinCE uses 16-bit wchar_t strings just like Win32. We have a large base of code that runs on WinCE and Win32 and we never use TCHAR.
mhenry1384
+12  A: 

If you're wondering if it's still in practice, then yes - it is still used quite a bit. No one will look at your code funny if it uses TCHAR and _T(""). The project I'm working on now is converting from ANSI to unicode - and we're going the portable (TCHAR) route.

However...

My vote would be to forget all the ANSI/UNICODE portable macros (TCHAR, _T(""), and all the _tXXXXXX calls, etc...) and just assume unicode everywhere. I really don't see the point of being portable if you'll never need an ANSI version. I would use all the wide character functions and types directly. Preprend all string literals with a L.

Aardvark
You might write some code you'll want to use somewhere else where you do need an ANSI version, or (as Nick said) Windows might move to DCHAR or whatever, so I still think it's a very good idea to go with TCHAR instead of WCHAR.
arke
+11  A: 

I would still use the TCHAR syntax if I was doing a new project today. There's not much practical difference between using it and the WCHAR syntax, and I prefer code which is explicit in what the character type is. Since most API functions and helper objects take/use TCHAR types (e.g.: CString), it just makes sense to use it. Plus it gives you flexibility if you decide to use the code in an ASCII app at some point, or if Windows ever evolves to Unicode32, etc.

If you decide to go the WCHAR route, I would be explicit about it. That is, use CStringW instead of CString, and casting macros when converting to TCHAR (eg: CW2CT).

That's my opinion, anyway.

Nick
+18  A: 

The short answer: NO.

Like all the others already wrote, a lot of programmers still use TCHARs and the corresponding functions. In my humble opinion the whole concept was a bad idea. UTF-16 string processing is a lot different than simple ASCII/MBCS string processing. If you use the same algorithms/functions with both of them (this is what the TCHAR idea is based on!), you get very bad performance on the UTF-16 version if you are doing a little bit more than simple string concatenation (like parsing etc.). The main reason are Surrogates.

With the sole exception when you really have to compile your application for a system which doesn't support Unicode I see no reason to use this baggage from the past in a new application.

Sascha
+5  A: 

I have to agree with Sascha. The underlying premise of TCHAR / _T() / etc. is that you can write an "ANSI"-based application and then magically give it Unicode support by defining a macro. But this is based on several bad assumptions:

That you actively build both MBCS and Unicode versions of your software

Otherwise, you will slip up and use ordinary char* strings in many places.

That you don't use non-ASCII backslash escapes in _T("...") literals

Unless your "ANSI" encoding happens to be ISO-8859-1, the resulting char* and wchar_t* literals won't represent the same characters.

That UTF-16 strings are used just like "ANSI" strings

They're not. Unicode introduces several concepts that don't exist in most legacy character encodings. Surrogates. Combining characters. Normalization. Conditional and language-sensitive casing rules.

And perhaps most importantly, the fact that UTF-16 is rarely saved on disk or sent over the Internet: UTF-8 tends to be preferred for external representation.

That your application doesn't use the Internet

(Now, this may be a valid assumption for your software, but...)

The web runs on UTF-8 and a plethora of rarer encodings. The TCHAR concept only recognizes two: "ANSI" (which can't be UTF-8) and "Unicode" (UTF-16). It may be useful for making your Windows API calls Unicode-aware, but it's damned useless for making your web and e-mail apps Unicode-aware.

That you use no non-Microsoft libraries

Nobody else uses TCHAR. Poco uses std::string and UTF-8. SQLite has UTF-8 and UTF-16 versions of its API, but no TCHAR. TCHAR isn't even in the standard library, so no std::tcout unless you want to define it yourself.

What I recommend instead of TCHAR

Forget that "ANSI" encodings exist, except for when you need to read a file that isn't valid UTF-8. Forget about TCHAR too. Always call the "W" version of Windows API functions. #define _UNICODE just to make sure you don't accidentally call an "A" function.

Always use UTF encodings for strings: UTF-8 for char strings and UTF-16 (on Windows) or UTF-32 (on Unix-like systems) for wchar_t strings. typedef UTF16 and UTF32 character types to avoid platform differences.

dan04
+2  A: 

Just adding to an old question:

NO

Go start a new CLR C++ project in VS2010. Microsoft themselves use L"Hello World", 'nuff said.

kizzx2
+1 : that is an argument :-)
Stephane Rolland
+2  A: 

The Introduction to Windows Programming article on MSDN says

New applications should always call the Unicode versions (of the API).

The TEXT and TCHAR macros are less useful today, because all applications should use Unicode.

I would stick to wchar_t and L"".

Steven
A: 

IMHO, if there's TCHARs in your code, you're working at the wrong level of abstraction.

Use whatever string type is most convenient for you when dealing with text processing - this will hopefully be something supporting unicode, but that's up to you. Do conversion at OS API boundaries as necessary.

When dealing with file paths, whip up your own custom type instead of using strings. This will allow you OS-independent path separators, will give you an easier interface to code against than manual string concatenation and splitting, and will be a lot easier to adapt to different OSes (ansi, ucs-2, utf-8, whatever).

snemarch