I want to use the Split function on a string but keep the delimiting sequence as the first characters of each element in the resulting string array. I am using this to split HTML at every instance of a URL so that I can run regex patterns on the URLs of a website. Are there any overloads of the Split function that do this, or do I have to write my own function?

Thanks!

+3  A: 

There is no built-in method for doing that. If you are splitting on a single pattern, though, you can write it as a pair of extension methods:

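// Extension methods must be static and live in a static class.
public static class StringSplitExtensions {
  public static IEnumerable<string> SplitAndKeepPrefix(this string source, string delimiter) {
    return SplitAndKeepPrefix(source, delimiter, StringSplitOptions.None);
  }

  public static IEnumerable<string> SplitAndKeepPrefix(this string source, string delimiter, StringSplitOptions options) {
    // Split with None so the text before the first delimiter is always element 0,
    // even when it is empty; only elements after it get the prefix.
    var split = source.Split(new[] { delimiter }, StringSplitOptions.None);
    var parts = split.Take(1).Concat(split.Skip(1).Select(x => delimiter + x));
    // Apply RemoveEmptyEntries afterwards so it cannot shift which element comes first.
    return (options & StringSplitOptions.RemoveEmptyEntries) != 0 ? parts.Where(x => x.Length != 0) : parts;
  }
}

IEnumerable<string> result = htmlStr.SplitAndKeepPrefix("<a");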

EDIT

Updated to not prefix every string :)

JaredPar
It doesn't look like you're handling the first element properly. The string "Check out <a href='http://example.com'>this link</a>!" would result in elements "<a Check out " and "<a href='http://example.com'>this link</a>!", right?
Matthew Maravillas
Also, it wouldn't work if you have more than one possible delimiter.
Meidan Alon
@Meidan, I explicitly called out that this is a single-pattern answer.
JaredPar
@Matthew yes it would :)
JaredPar
+3  A: 
 public static class SplitExtensions
 {
  // Splits on any of the delimiters; each piece ends with the delimiter that terminated it.
  public static string[] SplitAndKeepDelimiters(this string Original, string[] Delimiters, StringSplitOptions Options)
  {
   var strings = EnumSplitAndKeepDelimiters(Original, Delimiters);

   if (Options == StringSplitOptions.RemoveEmptyEntries)
   {
    return strings.Where((s) => s.Length != 0).ToArray();
   }
   else
   {
    return strings.ToArray();
   }
  }

  static IEnumerable<string> EnumSplitAndKeepDelimiters(this string Original, string[] Delimiters)
  {
   int currIndex = 0;

   while (currIndex < Original.Length)
   {
    // Find the earliest occurrence of any delimiter at or after currIndex.
    // Null or empty delimiters are skipped before searching.
    var delimiterIndex = Delimiters
     .Where((d) => !string.IsNullOrEmpty(d))
     .Select((d) => new { Source = d, Index = Original.IndexOf(d, currIndex, StringComparison.Ordinal) })
     .Where((d) => d.Index != -1)
     .OrderBy((d) => d.Index)
     .FirstOrDefault();

    // Cut just after that delimiter, or at the end of the string if none is left.
    int nextIndex = (delimiterIndex != null) ? delimiterIndex.Index + delimiterIndex.Source.Length : Original.Length;
    yield return Original.Substring(currIndex, nextIndex - currIndex);
    currIndex = nextIndex;
   }
  }
 }
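
Note that the method above leaves each delimiter at the end of the piece it terminates, whereas the question asks for it at the start of the next piece. A minimal sketch of a prefix-keeping variant for several delimiters might look like the following. The class and method names (PrefixSplitSketch, SplitKeepingPrefix) are hypothetical, the comparison is assumed to be ordinal, and when two delimiters match at the same position the one listed first is assumed to win.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixSplitSketch
{
    // Hypothetical helper, not part of the answer above: every piece after the
    // first starts with the delimiter that introduced it.
    public static IEnumerable<string> SplitKeepingPrefix(string source, params string[] delimiters)
    {
        int start = 0;       // where the current piece begins
        int searchFrom = 0;  // where to look for the next delimiter

        while (true)
        {
            // Find the earliest occurrence of any non-empty delimiter.
            int bestIndex = -1;
            int bestLength = 0;
            foreach (string d in delimiters)
            {
                if (string.IsNullOrEmpty(d)) continue;
                int i = source.IndexOf(d, searchFrom, StringComparison.Ordinal);
                if (i != -1 && (bestIndex == -1 || i < bestIndex))
                {
                    bestIndex = i;
                    bestLength = d.Length;
                }
            }

            if (bestIndex == -1)
            {
                // No delimiter left: the rest of the string is the last piece.
                yield return source.Substring(start);
                yield break;
            }

            // Emit the text before this delimiter; the delimiter itself
            // becomes the start of the next piece.
            yield return source.Substring(start, bestIndex - start);
            start = bestIndex;
            searchFrom = bestIndex + bestLength;
        }
    }
}
```

As with String.Split, the first piece is an empty string when the input begins with a delimiter; filter it out if that is not wanted.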
Meidan Alon
A: 

As far as I know, this is not possible with the default Split method. You could write an extension method to solve your problem, or simply iterate through the string[] and place the delimiter in front of each string (except the first, which is the text before any delimiter).

I would go for the extension method :)
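
The "iterate and prepend" idea can be sketched roughly as below. The class and method names (PrefixLoopSketch, SplitAndPrepend) are made up for illustration, and the sketch assumes a single, non-empty delimiter.

```csharp
using System;

public static class PrefixLoopSketch
{
    // Hypothetical helper: split on one delimiter, then put the delimiter
    // back in front of every element that followed an occurrence of it.
    public static string[] SplitAndPrepend(string source, string delimiter)
    {
        string[] parts = source.Split(new[] { delimiter }, StringSplitOptions.None);

        // Element 0 is the text before the first delimiter, so it gets no prefix.
        for (int i = 1; i < parts.Length; i++)
        {
            parts[i] = delimiter + parts[i];
        }
        return parts;
    }
}
```

Starting the loop at index 1 is what keeps the leading text from being prefixed; turning this into an extension method only requires adding `this` to the first parameter.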

Henk
A: 

The answer is no; you'll have to roll your own version.

Information on the String.Split API can be found on MSDN.

Gavin Miller