In C#, what's the best way to remove blank lines (i.e., lines that contain only whitespace) from a string? I'm happy to use a Regex if that's the best solution.

EDIT: I should add I'm using .NET 2.0.

+1  A: 
string corrected = 
    System.Text.RegularExpressions.Regex.Replace(input, @"\n+", "\n");
Adam Robinson
If the line contains whitespace chars to be removed you could change @"\n+" to @"\n\s?\n+"
Nick Gotch
+8  A: 

Using LINQ:

var result = string.Join("\r\n",
                 multilineString.Split(new string[] { "\r\n" }, ...None)
                                .Where(s => !string.IsNullOrWhiteSpace(s)));

If you're dealing with large inputs and/or inconsistent line endings you should use a StringReader and do the above old-school with a foreach loop instead.

dtb
there's no IsNullOrWhitespace method ;)
Thomas Levesque
@Thomas Levesque: orly? http://msdn.microsoft.com/en-us/library/system.string.isnullorwhitespace.aspx
dtb
my mistake... it's new in .NET 4.0, and I only have the local help for 3.5
Thomas Levesque
This doesn't produce a single string as a result (it produces an enumeration of non-empty lines). I'm not sure that really answers the question completely.
Michael Petito
@Michael Petito: note the `string.Join` in the first line which concatenates the enumeration of non-empty lines back together.
dtb
@Michael: `string.Join` produces a single string.
Adam Robinson
Ah indeed it is hidden up there. In that case you need a .ToArray() unless you're using .NET 4.0. In my opinion this is far less readable than a regex and I'm not sure what you'd really gain in this approach.
Michael Petito
BTW, the OP is using .NET 2.0, so no LINQ... (unless he's using VS2008 + LinqBridge)
Thomas Levesque
@Thomas Levesque: That's why I upvoted your answer :-) The requirement was added after I posted my answer.
dtb
But it's LINQ...must mark up to the top!
Chris S
When did LINQ become the new regex?
Dinah
I recently used Linq to defrost my freezer. Why do something the old way when Linq is so cool?
Ash
+1  A: 
char[] delimiters = new char[] { '\r', '\n' };
string[] lines = value.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
string result = string.Join(Environment.NewLine, lines);
Ben Hoffstein
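One caveat with the split approach above: RemoveEmptyEntries drops only zero-length entries, so a line made up of spaces or tabs survives. Below is a minimal sketch of a variant that also filters whitespace-only lines. It avoids LINQ, so it should compile on .NET 2.0. The class and method names are hypothetical, not from the answer.

```csharp
using System;
using System.Collections.Generic;

static class BlankLineDemo
{
    // Hypothetical helper: split on CR/LF, then keep only lines
    // that still have content after trimming.
    public static string RemoveBlankLines(string value)
    {
        char[] delimiters = new char[] { '\r', '\n' };
        string[] lines = value.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);

        List<string> kept = new List<string>();
        foreach (string line in lines)
        {
            if (line.Trim().Length > 0)
                kept.Add(line); // the line itself is kept untrimmed
        }

        // Note: the result uses Environment.NewLine, whatever the input used.
        return string.Join(Environment.NewLine, kept.ToArray());
    }

    static void Main()
    {
        Console.WriteLine(RemoveBlankLines("a\r\n   \r\n\r\nb"));
    }
}
```

Like the original, this still builds an intermediate array, so the StringReader answers are probably a better fit for very large inputs.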
+7  A: 

off the top of my head...

string result = Regex.Replace(input, @"\s*(\n)", "$1");

turns this:

fdasdf
asdf
[tabs]

[spaces]  

asdf


into this:

fdasdf
asdf
asdf
Sky Sanders
What?! no love for the elegant regex? I am crushed.
Sky Sanders
There are a few different ways to write this regex but I think the regex approach is most readable.
Michael Petito
+1. Elegant indeed. It will also remove tabs and spaces from the end of an otherwise non-blank line, but that's probably a good thing. You don't need the `Multiline` option, though.
Alan Moore
@Alan - you are right. It was a quick riff that satisfied the requirements. Thanks for the heads up.
Sky Sanders
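To illustrate the side effect Alan Moore mentions, here is a small sketch (class and method names are hypothetical) that runs the `\s*(\n)` pattern on a string with trailing spaces and a tab-only line. Because `\s` also matches `\r`, input with `\r\n` line endings should come out with plain `\n` endings, which may or may not be what you want.

```csharp
using System;
using System.Text.RegularExpressions;

static class TrailingWhitespaceDemo
{
    // Hypothetical wrapper around the pattern from the answer above.
    public static string Collapse(string input)
    {
        return Regex.Replace(input, @"\s*(\n)", "$1");
    }

    static void Main()
    {
        // Trailing spaces after "a" and the tab-only line are both removed.
        Console.WriteLine(Collapse("a  \n\t\nb") == "a\nb");      // True

        // CRLF input is normalized to LF as a side effect.
        Console.WriteLine(Collapse("a\r\n\r\nb") == "a\nb");      // True
    }
}
```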
+11  A: 
string outputString;
using (StringReader reader = new StringReader(originalString))
using (StringWriter writer = new StringWriter())
{
    string line;
    while((line = reader.ReadLine()) != null)
    {
        if (line.Trim().Length > 0)
            writer.WriteLine(line);
    }
    outputString = writer.ToString();
}
Thomas Levesque
+1 This one is nice since it should scale well for large strings.
Fredrik Mörk
Shouldn't this really be `if (line.Trim().Length > 0) writer.WriteLine(line)`? The OP did not request that all lines be trimmed in the output string.
Dan Tao
@Dan, good catch ! I fixed it
Thomas Levesque
+10  A: 

If you also want to remove lines that contain only whitespace (tabs, spaces), try:

string fix = Regex.Replace(original, @"^\s*$\n", string.Empty, RegexOptions.Multiline);
Chris Schmich
looks good to me.
Sky Sanders
`\s+` instead of `\s*` would be better I think
Salman A
@Salman Chris' rx is correct, as is my lonely, unappreciated answer. ;-(
Sky Sanders
@Salman A: `\s+` would not work on totally empty lines, e.g. `"foo\n\nbar"`.
Chris Schmich
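The `\s*` versus `\s+` point in the comments can be checked with a short sketch. The class and method names are hypothetical; the patterns are the ones discussed above.

```csharp
using System;
using System.Text.RegularExpressions;

static class StarVersusPlusDemo
{
    // Pattern from the answer: zero or more whitespace characters on the line.
    public static string WithStar(string s)
    {
        return Regex.Replace(s, @"^\s*$\n", string.Empty, RegexOptions.Multiline);
    }

    // Suggested variant: requires at least one whitespace character.
    public static string WithPlus(string s)
    {
        return Regex.Replace(s, @"^\s+$\n", string.Empty, RegexOptions.Multiline);
    }

    static void Main()
    {
        string input = "foo\n\nbar";
        Console.WriteLine(WithStar(input) == "foo\nbar"); // True: empty line removed
        Console.WriteLine(WithPlus(input) == input);      // True: empty line left in place
    }
}
```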
A: 

Here's another option: use the StringReader class. Advantages: one pass over the string, creates no intermediate arrays.

public static string RemoveEmptyLines(this string text) {
    var builder = new StringBuilder();

    using (var reader = new StringReader(text)) {
        while (reader.Peek() != -1) {
            string line = reader.ReadLine();
            if (!string.IsNullOrWhiteSpace(line))
                builder.AppendLine(line);
        }
    }

    return builder.ToString();
}

Note: the IsNullOrWhiteSpace method is new in .NET 4.0. If you don't have that, it's trivial to write on your own:

public static bool IsNullOrWhiteSpace(string text) {
    return string.IsNullOrEmpty(text) || text.Trim().Length < 1;
}
Dan Tao
@Adam: Ha, wow, very stupid statement I made there. I meant no intermediate *arrays*, as the `string.Split` method would (thanks).
Dan Tao
A: 

I'll go with:

  public static string RemoveEmptyLines(string value) {
    using (StringReader reader = new StringReader(value)) {
      StringBuilder builder = new StringBuilder();
      string line;
      while ((line = reader.ReadLine()) != null) {
        if (line.Trim().Length > 0)
          builder.AppendLine(line);
      }
      return builder.ToString();
    }
  }
Julien Lebosquain
A: 

Try this.

string s = "Test1" + Environment.NewLine + Environment.NewLine + "Test 2";
Console.WriteLine(s);

string result = s.Replace(Environment.NewLine, String.Empty);
Console.WriteLine(result);
dretzlaff17
This will completely not work.
SLaks
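A quick sketch of why that is: replacing every Environment.NewLine with an empty string removes all line breaks, not just the blank lines, so the remaining lines run together. The class and method names here are hypothetical.

```csharp
using System;

static class ReplaceNewLineDemo
{
    // Same call as in the answer above, wrapped in a hypothetical helper.
    public static string StripNewLines(string s)
    {
        return s.Replace(Environment.NewLine, String.Empty);
    }

    static void Main()
    {
        string s = "Test1" + Environment.NewLine + Environment.NewLine + "Test 2";
        Console.WriteLine(StripNewLines(s)); // prints "Test1Test 2" on a single line
    }
}
```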
A: 
s = Regex.Replace(s, @"^[^\n\S]*\n", "", RegexOptions.Multiline);

[^\n\S] matches any character that's not a linefeed or a non-whitespace character--so, any whitespace character except \n. But most likely the only characters you have to worry about are space, tab and carriage return, so this should work too:

s = Regex.Replace(s, @"^[ \t\r]*\n", "", RegexOptions.Multiline);

And if you want it to catch the last line, without a final linefeed:

s = Regex.Replace(s, @"^[ \t\r]*(?:\n|\z)", "", RegexOptions.Multiline);
Alan Moore