I'm splitting a string by three different characters but I want the output to include the characters I split by. Is there any easy way to do this?
result = originalString.Split(separator);
for(int i = 0; i < result.Length - 1; i++)
result[i] += separator;
(EDIT - this is a bad answer - I misread his question and didn't see that he was splitting by multiple characters.)
(EDIT - a correct LINQ version is awkward, since the separator shouldn't get concatenated onto the final string in the split array.)
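For the single-separator case, a LINQ version could look roughly like the sketch below. The indexed `Select` overload is what lets the last element skip the concatenation. The class and method names (`LinqSplitDemo`, `SplitKeepSingle`) are made up for illustration, and it assumes a single `char` separator.

```csharp
using System;
using System.Linq;

public static class LinqSplitDemo
{
    // Hypothetical helper: re-attaches the separator to every piece except the last.
    public static string[] SplitKeepSingle(string originalString, char separator)
    {
        string[] parts = originalString.Split(separator);
        return parts
            .Select((part, i) => i < parts.Length - 1 ? part + separator : part)
            .ToArray();
    }

    public static void Main()
    {
        // Prints: a,|b,|c
        Console.WriteLine(string.Join("|", SplitKeepSingle("a,b,c", ',')));
    }
}
```

This still only handles one separator, so it does not answer the multi-character question on its own.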
Recently I wrote an extension method to do this:
public static class StringExtensions
{
public static IEnumerable<string> SplitAndKeep(this string s, string separator)
{
string[] obj = s.Split(new string[] { separator }, StringSplitOptions.None);
for (int i = 0; i < obj.Length; i++)
{
string result = i == obj.Length - 1 ? obj[i] : obj[i] + separator;
yield return result;
}
}
}
Regex.Split looks like it might be able to do what you want.
I'd try:
string[] parts = Regex.Split(originalString, @"(?<=[.,;])");
(if the split chars were , . and ;)
(?<=PATTERN) is a positive lookbehind. It should match at any position where the preceding text fits PATTERN, so there should be a match (and a split) after each occurrence of any of the characters.
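To make the lookbehind behaviour concrete, here is a small sketch. It also shows a second option: when the split pattern contains a capturing group, Regex.Split includes the captured text in the result, so the separators come back as their own elements. The class and method names (`LookbehindSplitDemo`, `SplitAfter`, `SplitAround`) and the sample input are just for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindSplitDemo
{
    // Zero-width split after each of , . ; so every piece keeps its trailing separator.
    public static string[] SplitAfter(string input)
    {
        return Regex.Split(input, @"(?<=[.,;])");
    }

    // Splits on the separators themselves; the capturing group makes
    // Regex.Split return each separator as a separate array element.
    public static string[] SplitAround(string input)
    {
        return Regex.Split(input, @"([.,;])");
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("|", SplitAfter("a,b;c.d")));   // a,|b;|c.|d
        Console.WriteLine(string.Join("|", SplitAround("a,b;c.d")));  // a|,|b|;|c|.|d
    }
}
```

One thing to watch: if the input ends with a separator, both variants return a trailing empty string, so filter that out if you don't want it.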
using System.Collections.Generic;
using System.Text.RegularExpressions;
namespace ConsoleApplication9
{
class Program
{
static void Main(string[] args)
{
string input = @"This;is:a.test";
char sep0 = ';', sep1 = ':', sep2 = '.';
string pattern = string.Format("[{0}{1}{2}]|[^{0}{1}{2}]+", sep0, sep1, sep2);
Regex regex = new Regex(pattern);
MatchCollection matches = regex.Matches(input);
List<string> parts=new List<string>();
foreach (Match match in matches)
{
parts.Add(match.ToString());
}
}
}
}
Iterate through the string character by character (which is what regex does anyway). When you find a splitter, spin off a substring.
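Rough sketch:
using System;
using System.Collections.Generic;
static List<string> SplitKeepingSplitters(string toSplit, char[] splitChars)
{
    List<string> afterSplit = new List<string>();
    int hold = 0;
    for (int counter = 0; counter < toSplit.Length; counter++)
    {
        // Found a splitter: spin off everything since the last one.
        // The splitter itself stays at the start of the next piece.
        if (Array.IndexOf(splitChars, toSplit[counter]) >= 0 && counter > hold)
        {
            afterSplit.Add(toSplit.Substring(hold, counter - hold));
            hold = counter;
        }
    }
    // Don't lose whatever follows the final splitter.
    if (hold < toSplit.Length)
        afterSplit.Add(toSplit.Substring(hold));
    return afterSplit;
}
That's a sketch rather than tested code, so choose whatever names suit you and double-check the boundaries for off-by-one errors.
But that will do what you're asking.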
This seems to work, but it hasn't been tested much.
public static string[] SplitAndKeepSeparators(string value, char[] separators, StringSplitOptions splitOptions)
{
List<string> splitValues = new List<string>();
int itemStart = 0;
for (int pos = 0; pos < value.Length; pos++)
{
for (int sepIndex = 0; sepIndex < separators.Length; sepIndex++)
{
if (separators[sepIndex] == value[pos])
{
// add the section of string before the separator
// (unless it's empty and we are discarding empty sections)
if (itemStart != pos || splitOptions == StringSplitOptions.None)
{
splitValues.Add(value.Substring(itemStart, pos - itemStart));
}
itemStart = pos + 1;
// add the separator
splitValues.Add(separators[sepIndex].ToString());
break;
}
}
}
// add anything after the final separator
// (unless it's empty and we are discarding empty sections)
if (itemStart != value.Length || splitOptions == StringSplitOptions.None)
{
splitValues.Add(value.Substring(itemStart, value.Length - itemStart));
}
return splitValues.ToArray();
}
Building on BFree's answer: I had the same goal, but I wanted to split on an array of characters, as the original Split method does, and I also have multiple splits per string (it seems that BFree only has 1 split per string?). Here is the code I came up with:
public static IEnumerable<string> SplitAndKeep(this string s, char[] delims)
{
int start = 0;
int index = 0;
while ((index = s.IndexOfAny(delims, start)) != -1)
{
// Text before the delimiter, then the delimiter itself.
yield return s.Substring(start, index - start);
yield return s.Substring(index, 1);
start = index + 1;
}
if (start < s.Length)
{
yield return s.Substring(start);
}
}
public static class String_Ext
{
public static string[] SplitOnGroups(this string str, string pattern)
{
var matches = Regex.Matches(str, pattern);
var partsList = new List<string>();
for (var i = 0; i < matches.Count; i++)
{
var groups = matches[i].Groups;
for (var j = 0; j < groups.Count; j++)
{
var group = groups[j];
partsList.Add(group.Value);
}
}
return partsList.ToArray();
}
}
var parts = "abcde \tfgh\tikj\r\nlmno".SplitOnGroups(@"\s+|\S+");
for (var i = 0; i < parts.Length; i++)
Print(i + "|" + Translate(parts[i]) + "|");
result:
0|abcde|
1| \t|
2|fgh|
3|\t|
4|ikj|
5|\r\n|
6|lmno|