views:

390

answers:

3

I want multi-line strings in Java, so I'm looking for a simple preprocessor that converts C-style continued lines into a single line, with a literal '\n' at each join.

Before:

    System.out.println("convert trailing backslashes\
this is on another line\
\
\
above are two blank lines\
But don't convert non-trailing backslashes, like: \"\t\" and \'\\\'");

After:

     System.out.println("convert trailing backslashes\nthis is on another line\n\n\nabove are two blank lines\nBut don't convert non-trailing backslashes, like: \"\t\" and \'\\\'");

I thought sed would do it well, but sed is line-based, so replacing the '\' and the newline that follows it (effectively joining the two lines) does not come naturally in sed. I adapted sredden79's one-liner into the following. It works and it's clever, but it's not clear:

sed ':a { $!N; s/\\\n/\\n/; ta }'

The substitution replaces an escaped literal backslash followed by a newline with an escaped literal backslash followed by n. :a is a label, and ta branches to that label if the substitution found a match. $ means the last line, and $! is the opposite (i.e. every line but the last). N appends the next line to the pattern space (thus making the \n character visible to the substitution).
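
For comparison, the same substitution can be sketched in Java itself. This is only an illustration, not part of the sed approach: the class name JoinContinuations and the method join are made up here, and the sketch assumes the whole source fits in memory as one String with \n line endings.

```java
// Hypothetical helper; the names are illustrative and not from the question.
class JoinContinuations {

    /** Replaces every backslash-newline pair with a backslash followed by the letter n. */
    static String join(String source) {
        // "\\\n" is the Java literal for backslash + newline;
        // "\\n" is the Java literal for backslash + the letter n.
        return source.replace("\\\n", "\\n");
    }

    public static void main(String[] args) {
        // Two physical lines joined by a trailing backslash.
        String before = "println(\"first\\\nsecond\");";
        System.out.println(join(before));
    }
}
```

Like the sed command, this does not try to tell a continuation backslash from an escaped backslash that happens to end a line.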

EDIT: Here's a variation that keeps compiler error line numbers etc. accurate: it turns each extended line into "..."+\n (and handles the first and last lines of the String correctly):

sed ':a { $!N; s/\\\n/\\n"+\n"/; ta }'

giving:

    System.out.println("convert trailing backslashes\n"+
"this is on another line\n"+
"\n"+
"\n"+
"above are two blank lines\n"+
"But don't convert non-trailing backslashes, like: \"\t\" and \'\\\'");

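A rough Java equivalent of this line-preserving variant, again only as a sketch: SplitContinuations and split are hypothetical names, and the input is assumed to use \n line endings.

```java
// Hypothetical helper; the names are illustrative and not from the question.
class SplitContinuations {

    /**
     * Turns each backslash-newline pair into: backslash, n, closing quote,
     * plus sign, a real newline, and an opening quote. The number of physical
     * lines stays the same, so compiler line numbers still match the source.
     */
    static String split(String source) {
        return source.replace("\\\n", "\\n\"+\n\"");
    }

    public static void main(String[] args) {
        String before = "println(\"first\\\nsecond\");";
        System.out.println(split(before));
    }
}
```

Because every real newline in the input is still a real newline in the output, the line count is unchanged.
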
EDIT: Actually, it would be better to have Perl/Python-style multi-line strings, which start and end with a special code on one line (""" for Python, I think).
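
One way such a delimiter-based preprocessor might look, as a minimal sketch under loud assumptions: the delimiter is """, the opening """ is the last thing on its line, the closing """ is the first thing on its line, and everything in between is taken literally. TripleQuoteBlocks and convert are hypothetical names.

```java
// Hypothetical preprocessor sketch; not an existing tool or Java feature.
class TripleQuoteBlocks {

    /** Rewrites """-delimited blocks as concatenated string literals, one per line. */
    static String convert(String source) {
        StringBuilder out = new StringBuilder();
        boolean inBlock = false;
        for (String line : source.split("\n", -1)) {
            if (!inBlock && line.endsWith("\"\"\"")) {
                // Opening line: swap the delimiter for an empty literal and a plus sign.
                out.append(line, 0, line.length() - 3).append("\"\"+");
                inBlock = true;
            } else if (inBlock && line.startsWith("\"\"\"")) {
                // Closing line: swap the delimiter for an empty literal.
                out.append("\"\"").append(line.substring(3));
                inBlock = false;
            } else if (inBlock) {
                // Body line: escape backslashes and quotes, then append a literal \n.
                String escaped = line.replace("\\", "\\\\").replace("\"", "\\\"");
                out.append('"').append(escaped).append("\\n\"+");
            } else {
                out.append(line);
            }
            out.append('\n');
        }
        out.setLength(out.length() - 1); // drop the newline added after the last piece
        return out.toString();
    }
}
```

Each input line produces exactly one output line, so compiler line numbers stay accurate here too.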

Is there a simpler, saner, clearer way (maybe not using sed)?

+4  A: 

Is there a simpler, saner, clearer way?

Forget the pre-processor, live with the limitation, complain about it (so that it will maybe be fixed in Java 7 or 8), and use an IDE to ease the pain.

Other alternatives (too troublesome I suppose, but still better than messing with the compilation process):

  • use a JVM-based language that does support here-docs
  • externalize the string into a resource file
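
The resource-file option might look roughly like this. It is a sketch, not a prescribed pattern: ResourceText and load are hypothetical names, the text is assumed to be UTF-8, and InputStream.readAllBytes needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for loading a multi-line string kept in a resource file.
class ResourceText {

    /** Reads the named classpath resource, resolved relative to owner, as UTF-8 text. */
    static String load(Class<?> owner, String name) throws IOException {
        try (InputStream in = owner.getResourceAsStream(name)) {
            if (in == null) {
                // getResourceAsStream returns null when the resource is missing.
                throw new IOException("resource not found: " + name);
            }
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

A call such as ResourceText.load(MyClass.class, "query.sql") would then return the file's contents, line breaks included; MyClass and query.sql are placeholder names.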
Thilo
+1 use the IDE/emacs
Lachlan Roche
Thanks. 1. The IDE idea is cool, but it doesn't help with editing (it's a pain to edit, add, and move between lines of multi-line concatenated strings, which is what I used to use). 2. Externalizing into a resource file is what I do now, but I think it's simpler and more manageable to keep the string together with the source code it relates to. 3. A whole new JVM language just to solve this does seem troublesome, but it would already be debugged and have tool support, syntax highlighting etc., so your idea has an intriguing elegance! Of course, one can view the sed script as a "JVM language" itself.
13ren
+1  A: 

A Perl script to do what you asked for:

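# Read stdin (or the files named on the command line) one line at a time.
while (<>) {
    chomp;                # drop the line's trailing newline
    print $_;
    if (/\\$/) {
        print "n";        # line ended in a backslash: complete the literal \n
    } else {
        print "\n";       # ordinary line: put the newline back
    }
}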
Lachlan Roche
+3  A: 

A Perl one-liner:

perl -0777 -pe 's/\\\n/\\n/g'

This will read either stdin or the file(s) named after it on the command line and write the output to stdout.

If you're using an editor that supports filtering, like vi or emacs, just filter your text through the above command and you're done.

If you're using Windows and have to worry about \r:

C:\> perl -0777 -pe "s/\\\r?\n/\\n/g"

although I think win32 Perl handles \r itself so this may be unnecessary.
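
The same \r-tolerant pattern can be sketched with java.util.regex. The class and method names are hypothetical; the regex itself is the one from the command above (backslash, optional \r, then \n).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; accepts both Unix (\n) and Windows (\r\n) line endings.
class JoinContinuationsCrLf {

    // Regex: a literal backslash, an optional carriage return, then a line feed.
    private static final Pattern CONTINUATION = Pattern.compile("\\\\\\r?\\n");

    static String join(String source) {
        // quoteReplacement keeps the backslash in the replacement literal.
        return CONTINUATION.matcher(source)
                .replaceAll(Matcher.quoteReplacement("\\n"));
    }
}
```

Matcher.quoteReplacement is used because a bare backslash in a replaceAll replacement string is treated as an escape character.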

The -0777 option is a special case of the -0 (that's a zero) option, which defines the line or record separator. Here it means that we don't want any separator, so Perl reads the entire file in as a single string.

The -pe option combines -p (loop over the input record by record, which with -0777 means the whole file at once, and print the result) and -e (the next argument is (a line of) the program to execute).
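
To illustrate what reading the whole input as one string amounts to, here is a small stdin-to-stdout filter sketched in Java. SlurpFilter and filter are hypothetical names, UTF-8 is assumed, and InputStream.readAllBytes needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical filter: slurp all input, substitute once, write the result.
class SlurpFilter {

    static void filter(InputStream in, OutputStream out) throws IOException {
        // Read everything up front, so newlines are ordinary characters in the string.
        String whole = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        String joined = whole.replace("\\\n", "\\n");
        out.write(joined.getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        filter(System.in, System.out);
    }
}
```

Because the input is one string, the substitution can match across what used to be line boundaries, which is the point of -0777.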

Adrian Pronk
13ren
@13ren: You're right, the /x modifier is not needed in this case. I had it there from my testing. I've edited the answer and removed it.
Adrian Pronk