I recently got bit by a subtle bug.

const char * int2str[] = {
   "zero", // 0
   "one",  // 1
   "two"   // 2
   "three",// 3
   nullptr };

assert( int2str[1] == "one"_s ); // passes
assert( int2str[2] == "two"_s ); // fails

If you have godlike code-review powers, you'll notice I forgot the comma after "two", so "two" and "three" were silently concatenated into "twothree".

After the considerable effort it took to find that bug, I have to ask: why would anyone ever want this behavior?

I can see how this might be useful for macro magic, but then why is this a "feature" in a modern language like Python?

Have you ever used string literal concatenation in production code?
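
For comparison, the same missing-comma slip is just as silent in Python, which also joins adjacent string literals. A minimal sketch, using a hypothetical list that mirrors the table in the question:

```python
# Hypothetical lookup table mirroring the int2str example in the question.
int2str = [
    "zero",   # 0
    "one",    # 1
    "two"     # 2  <- missing comma, so this literal merges with the next one
    "three",  # 3
]

print(len(int2str))  # 3
print(int2str[2])    # twothree
```

No error or warning is raised by the interpreter itself; the list simply ends up one element short.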

A: 

I certainly have in both C and C++. Offhand, I don't see much relationship between its utility and how "modern" the language is.

Jerry Coffin
I don't mean that older languages can't be used in a modern way. I was referring to when the language spec was originally written. C# has definitely learned from Java and avoided its warts. The same is true for Java/C# learning from C++.
caspin
+2  A: 

So that you can split long string literals across lines.

And yes, I've seen it in production code.

+5  A: 

Cases where this can be useful:

  • Generating strings including components defined by the preprocessor (this is perhaps the largest use case in C, and it's one I see very, very frequently).
  • Splitting string constants over multiple lines

To provide a more concrete example for the former:

// in version.h
#define MYPROG_NAME "FOO"
#define MYPROG_VERSION "0.1.2"

// in main.c
puts("Welcome to " MYPROG_NAME " version " MYPROG_VERSION ".");
Charles Duffy
+15  A: 

It's a great feature that allows you to combine preprocessor strings with your strings.

// Here we define the correct printf modifier for time_t
#ifdef TIME_T_LONG
    #define TIME_T_MOD "l"
#elif defined(TIME_T_LONG_LONG)
    #define TIME_T_MOD "ll"
#else
    #define TIME_T_MOD ""
#endif

// And here we merge the modifier into the rest of our format string
printf("time is %" TIME_T_MOD "u\n", time(0));
R Samuel Klatchko
+1, that's the best technical reason. The system defines several of those types of things as well - my answer has an example.
Carl Norum
The best technical reason, if you ignore the fact that you really shouldn't be using the preprocessor to do this sort of thing in the first place...
@STingRaySC, what about `PRIx32` or `PRIuLEAST32` and friends? http://www.opengroup.org/onlinepubs/9699919799/basedefs/inttypes.h.html
Carl Norum
@STingRaySC - while I agree that there are better ways to do this in C++, his question is also tagged as C (where this is very useful).
R Samuel Klatchko
@Carl: Of course, if you're going to use those. But that doesn't mean that because the library author decided to use the preprocessor, it's a good decision. @R Samuel: In this case, it is not necessary even in C. There is no need for those string-literals to be compile-time constants.
@STingRaySC - yes, there is; look at how printf is implemented, particularly on tiny embedded targets.
Charles Duffy
@STingRaySC, what do you mean "library author"? It's part of the standard. Well, C99 anyway. Section 7.8.1.
Carl Norum
@Carl: Cripes... part of the standard **library**, no? Someone *authored* it, no?
There is a *very* good reason for `printf` (and friends') format strings to be compile-time constants - the compiler can tell you if your argument types don't match the format strings.
caf
@Charles: I don't understand your comment. `printf` requires compile-time constant arguments on some platforms? I doubt it.
@STingRaySC: it might not be necessary, but it's a common use. I'd like to see a pointer to a simple example of an alternative solution to this problem that doesn't use the preprocessor for comparison.
Michael Burr
@STingRaySC - no, it's not necessary, but it's much easier to write this way than to merge at runtime.
R Samuel Klatchko
@R Samuel: Much easier? `printf("time is %" + time_t_mod + "u\n", time(0));` -- you're right... that was tough!
@STingRaySC: 2 problems with what you suggested: 1) it won't work in C, and 2) you won't get compile time checking that some compilers provide (as mentioned by caf). Not to mention, the preprocessor version is really just as readable and maintainable - there's very little difference.
Michael Burr
@Michael: You're right. I forgot you can't do that in C. I concede. But, since the question is aimed at "modern" languages in general, I strongly disagree that this is a good answer, as it addresses an esoteric, archaic usage of string literal concatenation. I think the ability to split string literals across lines is the most relevant answer...
@STingRaySC - rather than being implemented as a single library call, printf can get optimized down to a series of calls specific to the formats included in that string -- which is why it takes only a constant string for its first argument! If you're compiling for a tiny embedded platform, not needing to have a do-it-all print-everything function with tons of code you'll never use linked in can be a huge win (and do remember that embedded space is one of the markets C still dominates, so there are lots of folks this is important to).
Charles Duffy
+16  A: 

Sure, it's the easy way to make your code look good:

char *someGlobalString = "very long "
                         "so broken "
                         "onto multiple "
                         "lines";

The best reason, though, is for weird printf formats, like type forcing:

uint64_t num = 5;
printf("Here is a number:  %" PRIX64 ", what do you think of that?", num);

There are a bunch of those defined, and they can come in handy if you have type size requirements. Check them all out at this link. A few examples:

PRIo8 PRIoLEAST16 PRIoFAST32 PRIoMAX PRIoPTR
Carl Norum
@Carl - you should modify your printf example to have a `uint64_t` argument.
R Samuel Klatchko
Thanks for that! Fixed.
Carl Norum
+2  A: 

I'm not sure about other programming languages, but for example C# doesn't allow you to do this (and I think this is a good thing). As far as I can tell, most of the examples that show why this is useful in C++ would still work if you could use some special operator for string concatenation:

string someGlobalString = "very long " +
                          "so broken " +
                          "onto multiple " +
                          "lines"; 

This may not be as comfortable, but it is certainly safer. In your motivating example, the code would be invalid unless you added either , to separate elements or + to concatenate strings...

Tomas Petricek
That would not be valid. At least one of those strings would have to have a cast to std::string before that would compile. Also, the question is tagged with C.
Billy ONeal
@BillyONeal: The question is tagged with Python/C++ and it asks why "modern languages such as Python" allow this, so I thought I would post one counter-example. And I wanted to show that you don't need the feature (in general) to support things like line-breaks and macro expansion.
Tomas Petricek
@Tomas Petricek: Hmm.. what was I smoking? +1
Billy ONeal
+3  A: 

From the Python lexical analysis reference, section 2.4.2:

This feature can be used to reduce the number of backslashes needed, to split long strings conveniently across long lines, or even to add comments to parts of strings

http://docs.python.org/reference/lexical_analysis.html
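
The "comments on parts of strings" use from the quote above can be sketched roughly as follows. The pattern and the name `identifier` are illustrative assumptions, not taken from this thread:

```python
import re

# Each adjacent literal carries its own comment; the parser joins them
# into the single pattern "[A-Za-z_][A-Za-z0-9_]*" before re.compile runs.
identifier = re.compile(
    "[A-Za-z_]"       # first character: letter or underscore
    "[A-Za-z0-9_]*"   # remaining characters: letters, digits or underscores
)

print(identifier.pattern)                        # [A-Za-z_][A-Za-z0-9_]*
print(identifier.fullmatch("foo_1") is not None) # True
print(identifier.fullmatch("1foo") is not None)  # False
```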

Dan McCormick
Wouldn't a ''' string do the same?
caspin
@Caspin - a triple-quoted string (raw or not) will include all newline characters and whitespace between the quotes. Separate string literals will only be concatenated, with nothing added between them.
JimB
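
The difference described in the comment above can be seen in a small sketch (the variable names are hypothetical):

```python
# A triple-quoted literal keeps the line break and the indentation
# that appear between the quotes.
triple = """very long
    so broken"""

# Adjacent literals are joined with nothing inserted between them.
adjacent = ("very long "
            "so broken")

print(repr(triple))    # 'very long\n    so broken'
print(repr(adjacent))  # 'very long so broken'
```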