ansaurus

Question

Should I use _T or _TEXT on C++ string literals?

Answer 1

+7 A:

I've never seen anyone use _TEXT() instead of _T().

egrunin 2010-01-15 20:34:17

Answer 2

+11 A:

Commit to Unicode and just use L"My String Literal".

2010-01-15 20:42:35

+1 for commit to Unicode; -1 for senseless dig at Microsoft.

Mark Ransom 2010-01-15 20:46:19

Converting to Unicode is not necessarily possible at the moment you're writing a piece of code. In the meantime, using _T() means nobody will have to change your code when it is converted.

David Thornley 2010-01-15 21:03:40

Nah, there is *never* a good reason to go back.

Hans Passant 2010-01-15 21:08:59

If you are committed to MS, then fabulous. Use all of their fancy macros and language extensions. Using `_T` means *somebody* will have to change your code when it is converted *to any other platform*. Only unsuccessful software is never ported...

2010-01-15 21:11:01

If you're writing _new_ code (even in an old codebase), there's absolutely no excuse to not always use `L""`. Another habit is to explicitly call wide functions, i.e. use `MessageBoxW` rather than `MessageBox` etc - it ensures that code will work the same regardless of how it's compiled. Remember that all MS OSes that didn't support Unicode APIs out of the box are already unsupported. Also, non-Unicode versions are less efficient on any NT OS. Don't forget MSLU / `UnicoWS.dll` for those cases when you need to support 9x, which lets you keep using Unicode even there.

Pavel Minaev 2010-01-15 21:13:01

There's no portability issue here. If he has functions defined via macro which expands either to `FuncA` or `FuncW` depending on `_UNICODE`, we're likely talking about Win32 API calls here already. I'm not aware of any stock C/C++ APIs that do anything like that.

Pavel Minaev 2010-01-15 21:14:19

I was going to say that he's already tied to Win32, which of course has to be addressed if the code is to be ported, but the text macros are an *ADDITIONAL* thing to deal with. Just because one aspect is not portable doesn't mean all bets are off.

2010-01-15 21:19:56

@STingRaySC: `_T` doesn't tie you to Microsoft at all. `#define _T(x) L##x` and you're done.

Mark Ransom 2010-01-15 21:26:23

If we were to port our current codebase to Linux, _T() would be one of the least of our worries. It's the work of a moment to write a replacement macro. Converting from MFC to something else would take far more time.

David Thornley 2010-01-15 21:27:39

@Mark: but it creates extra work for no benefit. Of course it could be defined, but getting that into every translation unit is not necessarily trivial. And then it just becomes cruft.

2010-01-15 21:29:07

@STingRaySC: You should have a porting header. Mark's right, but it should be TEXT not _TEXT, and define to _TEXT on MS, since it's a reserved name. Then on platforms which support Unicode but where the APIs you want to use take UTF-8, you don't need a conversion. If you commit to wide-char strings, your code is less easily portable for use with unknown future OS calls, not more.

Steve Jessop 2010-01-15 21:30:38
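The porting-header idea from the comment above could look roughly like this. It is only a sketch: `PORT_TEXT`, `port_char`, `port_strlen` and `greeting` are hypothetical names made up for illustration, and the non-Windows branch assumes you want plain narrow (e.g. UTF-8) literals there.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical porting header (would normally live in its own file).
// PORT_TEXT avoids the reserved leading-underscore names.
#if defined(_WIN32)
  #include <tchar.h>
  typedef TCHAR port_char;          // char or wchar_t, depending on _UNICODE
  #define PORT_TEXT(x) _TEXT(x)     // defer to the MS macro on MS toolchains
#else
  typedef char port_char;           // assumption: narrow literals elsewhere
  #define PORT_TEXT(x) x
#endif

// Overloads so calling code does not care which character type is active.
inline std::size_t port_strlen(const char* s)    { return std::strlen(s); }
inline std::size_t port_strlen(const wchar_t* s) { return std::wcslen(s); }

// Example literal written once, compiled as either narrow or wide.
inline const port_char* greeting() { return PORT_TEXT("hello"); }
```

Code that only ever touches `PORT_TEXT` and `port_char` then needs no edits when the header's mapping changes.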

Nice sentiment, but doesn't answer the question.

jeffamaphone 2010-01-15 21:31:11

It *DOES* answer the question. No, you should not use _T or _TEXT on C++ string literals!

2010-01-15 21:35:39

Maybe I should ask another question about whether to support both `SomeMethodA` and `SomeMethodW` or strictly `SomeMethodW` (implying that L should be put in front of all strings)?

Arc 2010-01-15 21:37:29

If you are using Unicode (wide) string literals, as you should be, you will never need to use/support the ...A functions.

2010-01-15 21:39:38

-1: unicode != widechar.

Pavel Radzivilovsky 2010-01-15 22:48:29

@Pavel: thanks for the -1. I was trying not to open that can of worms. It's not relevant to the question. It should be a conscious decision to use either chars or wide chars for every string literal. And since MS defines Unicode as UTF-16, wide chars are a necessity.

2010-01-15 22:55:21

@Mark: No, you aren't done if you `#define _T(x) L##x`. The whole point of `TCHAR` is to deal with all the Windows functions being duplicated for ANSI and UTF-16 versions. Other platforms don't have that.

dan04 2010-07-17 19:28:16

@dan04, there may be many things that tie a program to Windows. My point is that `_T` isn't one of them, so there's no specific reason to avoid it.

Mark Ransom 2010-07-17 23:44:38

Answer 3

+5 A:

Here's an interesting read from a well-known and respected source.

Similarly, the _TEXT macro will map to L"..." instead of "...".

What about _T? Okay, I don't know about that one. Maybe it was just to save somebody some typing.

dirkgently 2010-01-15 20:43:23

You left out a key bit: _TEXT("hello") == L"hello" ONLY if UNICODE is defined. And yes, _T("hello") is just an abbreviation.

egrunin 2010-01-15 21:17:40

Both `UNICODE` and `_UNICODE` should always be defined unless you're planning a time travel to the nineties.

Philipp 2010-07-08 08:46:31

Answer 4

+4 A:

Neither. In my experience there are two basic types of string literals, those that are invariant, and those that need to be translated when your code is localized.

It's important to distinguish between the two as you write the code so you don't have to come back and figure out which is which later.

So I use _UT() for untranslatable strings, and ZZT() (or something else that is easy to search on) for strings that will need to be translated. Instances of _T() or _TEXT() in the code are evidence of string literals that have not yet been correctly categorized.

_UT and ZZT are both #defined to _TEXT

John Knoeller 2010-01-15 21:10:27
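A minimal sketch of the two-macro categorization described in this answer. `STANDIN_TEXT` is a hypothetical stand-in for `_TEXT` so the snippet builds without `<tchar.h>`; `log_tag` and `user_prompt` are made-up examples, not from the answer.

```cpp
#include <cstring>

// Stand-in for _TEXT from <tchar.h> (assumption: narrow build).
// In a real Windows build both macros would be #defined to _TEXT.
#define STANDIN_TEXT(x) x

// Names taken from the answer. Note that _UT, like _T, is formally a
// reserved identifier (underscore followed by an uppercase letter).
#define _UT(x) STANDIN_TEXT(x)   // invariant: never translated
#define ZZT(x) STANDIN_TEXT(x)   // needs translation when localized

// An identifier that goes to a log file: stays as-is in every locale.
inline const char* log_tag()     { return _UT("net.socket"); }

// Text shown to the user: easy to find later by searching for ZZT(.
inline const char* user_prompt() { return ZZT("Connection lost. Retry?"); }
```

Both macros expand to the same thing; the only difference is what they tell the next reader (or a search tool) about the string.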

You should have a better system for localizing the strings than search/replace.

Mark Ransom 2010-01-15 21:28:27

I like that idea. I've had enough trouble figuring out which strings are displayed to the user (and hence should be translated) and which go to log files (which should not be).

David Thornley 2010-01-15 21:28:45

@Mark: What makes you think I don't?

John Knoeller 2010-01-15 21:29:05

I miss the days of ZZT... I digress.

Arc 2010-01-15 21:33:14

Answer 5

+8 A:

A simple grep of the SDK shows us that the answer is that it doesn't matter—they are the same. They both turn into __T(x).

C:\...\Visual Studio 8\VC>findstr /spin /c:"#define _T(" *.h 
crt\src\tchar.h:2439:#define _T(x)       __T(x) 
include\tchar.h:2390:#define _T(x)       __T(x)

C:\...\Visual Studio 8\VC>findstr /spin /c:"#define _TEXT(" *.h 
crt\src\tchar.h:2440:#define _TEXT(x)    __T(x) 
include\tchar.h:2391:#define _TEXT(x)    __T(x)

And for completeness:

C:\...\Visual Studio 8\VC>findstr /spin /c:"#define __T(" *.h 
crt\src\tchar.h:210:#define __T(x)     L ## x 
crt\src\tchar.h:889:#define __T(x)      x 
include\tchar.h:210:#define __T(x)     L ## x 
include\tchar.h:858:#define __T(x)      x

However, technically, for C++ you should be using TEXT() instead of _TEXT(), but it (eventually) expands to the same thing too.

jeffamaphone 2010-01-15 21:21:52
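The macro chain in the findstr output can be imitated in a few lines. This is a sketch, not the contents of tchar.h: the `DEMO_*` names are hypothetical stand-ins (for `_UNICODE`, `__T`, `_T` and `_TEXT`) so the snippet compiles on any toolchain.

```cpp
#include <cwchar>
#include <type_traits>

// Stand-in for _UNICODE; set to 0 to get the narrow expansion instead.
#define DEMO_UNICODE 1

#if DEMO_UNICODE
  #define DEMO_T_IMPL(x) L ## x   // plays the role of __T(x) in a wide build
#else
  #define DEMO_T_IMPL(x) x        // plays the role of __T(x) in a narrow build
#endif

// Both public spellings forward to the same helper, as in the header.
#define DEMO_T(x)    DEMO_T_IMPL(x)
#define DEMO_TEXT(x) DEMO_T_IMPL(x)

// With DEMO_UNICODE set, both spellings yield a wide literal of the same type.
static_assert(std::is_same<decltype(DEMO_T("hello")),
                           decltype(L"hello")>::value,
              "short form expands to a wide literal");
static_assert(std::is_same<decltype(DEMO_TEXT("hello")),
                           decltype(DEMO_T("hello"))>::value,
              "long and short forms have identical types");

inline const wchar_t* demo_short() { return DEMO_T("hello"); }
inline const wchar_t* demo_long()  { return DEMO_TEXT("hello"); }
```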

Interesting. Not only are they the same, they are defined in the same header. I had the mistaken assumption that _TEXT was Win32 and _T was ATL.

Max Lybbert 2010-01-15 21:28:06

@Max: They are also defined, oddly, in MAPINls.h and MAPIWin.h, but I'm not sure why, and I excluded those from the results above for simplicity. But yeah, they're not ATL things, they're Win32 things. It's possible the ATL documentation uses it because they liked it better? ATL is all about saving you typing, right? :)

jeffamaphone 2010-01-15 21:29:58

Answer 6

+3 A:

From Raymond Chen:

TEXT vs. _TEXT vs. _T, and UNICODE vs. _UNICODE

The plain versions without the underscore affect the character set the Windows header files treat as default. So if you define UNICODE, then GetWindowText will map to GetWindowTextW instead of GetWindowTextA, for example. Similarly, the TEXT macro will map to L"..." instead of "...".

The versions with the underscore affect the character set the C runtime header files treat as default. So if you define _UNICODE, then _tcslen will map to wcslen instead of strlen, for example. Similarly, the _TEXT macro will map to L"..." instead of "...".

What about _T? Okay, I don't know about that one. Maybe it was just to save somebody some typing.

Short version: _T() is a lazy man's _TEXT()
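To make the "C runtime" half of the quote concrete, here is a rough imitation of what a single switch does to a character type, a literal macro and a function name at once. All `sketch_*` / `SKETCH_*` names are hypothetical; the real ones are `TCHAR`, `_TEXT`, `_tcslen` and `_UNICODE`.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Stand-in for _UNICODE.
#define SKETCH_UNICODE 1

#if SKETCH_UNICODE
  typedef wchar_t sketch_tchar;            // like TCHAR in a wide build
  #define SKETCH_TEXT(x) L##x              // like _TEXT
  #define sketch_tcslen  std::wcslen       // like _tcslen -> wcslen
#else
  typedef char sketch_tchar;
  #define SKETCH_TEXT(x) x
  #define sketch_tcslen  std::strlen       // like _tcslen -> strlen
#endif

// The same source line compiles against either character set.
inline std::size_t title_length() {
    const sketch_tchar* title = SKETCH_TEXT("Untitled");
    return sketch_tcslen(title);
}
```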

Note: You need to be aware of what code-page your source code text editor is using when you write:

_TEXT("Some string containing Çontaining");
TEXT("€xtended characters.");

The bytes the compiler sees depend on the code page of your editor.

Ian Boyd 2010-01-15 21:23:29
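One way to sidestep the code-page issue in the note above is to keep the source file pure ASCII and spell non-ASCII characters as universal character names. A small sketch (the variable names are made up):

```cpp
// \u00C7 is LATIN CAPITAL LETTER C WITH CEDILLA, \u20AC is the EURO SIGN.
// The file contains only ASCII bytes, so the editor's code page no longer
// affects what the compiler sees.
constexpr wchar_t kCedilla[] = L"\u00C7ontaining";
constexpr wchar_t kEuro[]    = L"\u20ACxtended characters.";

// Both characters are in the Basic Multilingual Plane, so each occupies a
// single wchar_t whether wchar_t is 16 bits (Windows) or 32 bits.
static_assert(kCedilla[0] == 0x00C7, "first code unit is U+00C7");
static_assert(kEuro[0]    == 0x20AC, "first code unit is U+20AC");
```

The same literals can be wrapped in `_TEXT(...)` or `TEXT(...)`, but how the escapes are encoded in a narrow build then depends on the compiler's execution character set.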

Answer 7

+2 A:

These macros are a holdover from the days when an application might have actually wanted to compile both a Unicode and ANSI version.

There is no reason to do this today - this is all vestigial. Microsoft is stuck with supporting every possible configuration forever, but you aren't. If you are not compiling to both ANSI and Unicode (and no one is, let's be honest) just go with L"text".

And yes, in case it wasn't clear by now: _T == _TEXT

Terry Mahaffey 2010-01-15 22:24:33

Answer 8

A:

Use neither, and also please don't use the L"..." crap. Use UTF-8 for all strings, and convert them just before passing to microsoft APIs.

Pavel Radzivilovsky 2010-01-15 22:46:57
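For illustration, "convert just before passing to the API" amounts to a UTF-8 to UTF-16 step at the boundary. On Windows this would typically be done with `MultiByteToWideChar` and `CP_UTF8`; the portable sketch below hand-rolls the decoding instead so it can be compiled anywhere. `utf8_to_utf16` is a hypothetical helper, and it assumes well-formed input (no validation).

```cpp
#include <cstddef>
#include <string>

// Decodes UTF-8 into UTF-16 code units.
// Assumption: the input is well-formed UTF-8; malformed sequences are
// not detected in this sketch.
inline std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte gives the sequence length and the top payload bits.
        if (lead < 0x80)      { cp = lead;        len = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1F; len = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0F; len = 3; }
        else                  { cp = lead & 0x07; len = 4; }
        // Each continuation byte contributes six more bits.
        for (std::size_t k = 1; k < len && i + k < in.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The cost being argued about in the comments is exactly this loop, run once per API call.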

The worst possible option on Windows: it will require millions of unnecessary string conversions. Always use UTF-16.

Philipp 2010-07-08 08:47:38

Disagree categorically. See, for example, http://stackoverflow.com/questions/1049947/should-utf-16-be-considered-harmful for why UTF-16 is bad on Windows.

Pavel Radzivilovsky 2010-07-08 09:04:27
