I've already posted to two forums (CodeRanch and Nabble) and no one has responded with an answer, so Stack Overflow, you are my last hope. All I'm trying to do is delete the "\n" from a file using an Ant task. There are simply a bunch of empty lines in a file and I don't want them there anymore. Here is the code I used:

    <replaceregexp file="${outputFile}"
     match="^[ \t\n]+$"
     replace=""
     byline="true"/>

It's not picking up the regular expression and I've tried a hundred different ways and I can't figure it out. Any ideas?

A: 

From the documentation:

Similar to regexp type mappers this task needs a supporting regular expression library and an implementation of org.apache.tools.ant.util.regexp.Regexp

Do you have these?

ennuikiller
+1  A: 

I think your issue is with using the start-of-line/end-of-line characters. What you are looking for is blank lines, but what your regular expression appears to search for is a start-of-line followed by one or more spaces, tabs, and/or newlines followed by end-of-line. Shouldn't it be start-of-line followed by end-of-line with nothing in between? So match="^$".

^ The above assumes the regex engine you're using is treating ^$ as start/end of line instead of start/end of the entire input string. If it's the entire input string, your regex would only match empty files, not files that contain some content but also some blank lines.

Brian Schroth
Yes I've tried what you said in the first paragraph and still nothing ... I'm using the regex tasks from the Ant build tool. I believe however that what you said in the second paragraph may be right ... that it's matching the start/end of the string which is why I'm not getting what I want. I've tried the regex a hundred different ways and was not sure if someone had a solution to this same problem.
Christopher Dancy
If that is indeed the problem, set the multiline flag (I believe it is flags="m", but check the docs at http://ant.apache.org/manual/OptionalTasks/replaceregexp.html).
Brian Schroth
Yes I've tried that as well and still nothing. However the docs do give the option for single line and multi-line and still they do not work for me.
Christopher Dancy
did you try with flags="m" AND match="^$"?
Brian Schroth
Yes I have and still nothing :(
Christopher Dancy
Perhaps verify that it's even doing the regex check that you think it is: use the most basic regex possible on a test data file that contains a match and see if you can get that to work before attempting to get this specific regex to work. You probably also need the "g" flag to replace all matches instead of only one. Also, the "byline" property might be screwing it up (though I doubt it, just throwing out ideas).
Brian Schroth
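
The suggestions in this comment thread can be combined into one task call, sketched below. This is an untested sketch under a stated assumption, not a confirmed fix: as I understand the task, with byline="true" each line is handed to the regex without its line terminator, so an empty line can match but its line break is written back out unchanged. With byline="false" the file is processed as a single string, flags="gm" makes ^ match at the start of every line and replaces every occurrence, and the pattern consumes the line break itself. The project name, target name, and the output.txt value are placeholders.

```xml
<project name="strip-blank-lines" default="strip" basedir=".">

  <!-- Placeholder value: point this at the file you want to clean up -->
  <property name="outputFile" value="output.txt"/>

  <target name="strip">
    <!-- byline="false": treat the whole file as one string so the regex can see line breaks
         flags="gm":     g = replace every match, m = let ^ match at each line start
         The pattern matches optional spaces/tabs followed by a line break (\n or \r\n) -->
    <replaceregexp file="${outputFile}"
                   match="^[ \t]*\r?\n"
                   replace=""
                   flags="gm"
                   byline="false"/>
  </target>

</project>
```

A trailing line that contains only whitespace and has no final line break would not be matched by this pattern.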
A: 

The correct regex for a blank line is

^[ \t]*$\r?\n

You might have to escape the backslashes in the string, like so:

match="^[ \\t]*$\\r?\\n"
Tim Pietzcker
I tried this as well and still nothing. I believe this has to be a problem with the way Ant handles newline characters. I've tried iterating over the contents of the file with a "\n" delimiter and it still gives me only one string, while the file is clearly 10 lines long with only 3 of those having content. The code I used for this is as follows: <loadfile property="lines" srcFile="${tempFile}"/> <ac:foreach list="${lines}" delimiter="\n" param="rule_line" target="outputLinesToFile"/>
Christopher Dancy
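
A possible reason the loop in this comment sees a single string is that "\n" inside an XML attribute is just a backslash followed by the letter n, not a newline character. A hedged sketch of an alternative is to pass Ant's built-in line.separator property as the delimiter. It assumes ant-contrib is already loaded under the ac prefix as in the comment, and reuses the commenter's own tempFile property and outputLinesToFile target; none of this was confirmed in the thread.

```xml
<!-- Load the file contents into a property, as in the comment above -->
<loadfile property="lines" srcFile="${tempFile}"/>

<!-- Assumption: ant-contrib's foreach is available under the "ac" namespace prefix.
     ${line.separator} expands to the platform's real line-break characters,
     whereas a literal "\n" in the attribute is never turned into a newline. -->
<ac:foreach list="${lines}"
            delimiter="${line.separator}"
            param="rule_line"
            target="outputLinesToFile"/>
```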
+3  A: 

You may use FilterChain (specifically IgnoreBlank TokenFilter) to do precisely what you need:

<copy file="${input.file}" toFile="${output.file}">
  <filterchain>
    <ignoreblank/>
  </filterchain>
</copy>

ignoreblank will also remove lines consisting entirely of white space, but looking at your regular expression it seems that is what you want.

Alexander Pogrebnyak
Yes this does work! Thank you Alexander! My boss is still going to kick my @$$ for not having this done yesterday but at least it will work now. Thanks again.
Christopher Dancy
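
The original question edits ${outputFile} in place, while the copy task writes to a second file. One possible in-place variant is to filter into a temporary file and then move it over the original. This is a minimal sketch: the target name and the .tmp suffix are made up for illustration, and outputFile is the property from the question.

```xml
<target name="strip-blank-lines">
  <!-- Write a filtered copy next to the original; the .tmp suffix is an arbitrary choice -->
  <copy file="${outputFile}" tofile="${outputFile}.tmp" overwrite="true">
    <filterchain>
      <!-- Drops empty lines and lines consisting only of whitespace -->
      <ignoreblank/>
    </filterchain>
  </copy>

  <!-- Replace the original file with the filtered copy -->
  <move file="${outputFile}.tmp" tofile="${outputFile}"/>
</target>
```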
A: 

Try the following regular expression:

<replaceregexp file="${outputFile}"
    match="^\s*\n"
    replace=""
    byline="true" />

That should remove all empty lines and those containing only whitespace.

sirlancelot
A: 

Have you tried flags="g"?

g - change globally

For reference: http://ant.apache.org/manual/OptionalTasks/replaceregexp.html

James Kwan