Is there any built-in method to remove similar characters in a string?
Examples:
aaaabbbccc -> abc
aabbccaa -> abc
Thanks
I would have thought you want to look at using Regular Expressions. For C# .NET this is a useful site...
No.
But it can easily be done in a loop. Remember that you need to build a new string, because you cannot edit a char in an existing string (you can call string.Remove, but that is very likely going to be slow and mess up your loop).
Basically:
// Start at 1 so that text[i - 1] is always valid.
for (int i = 1; i < text.Length; i++)
{
    if (text[i] == text[i - 1])
    {
        // Do something, both chars are the same
    }
}
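A fuller sketch of the loop idea above, in case it helps. The helper name CollapseRuns is made up for illustration and is not part of the framework. Note that it only collapses adjacent repeats, so it gives abc for the first example in the question but abca for the second one.

```csharp
using System;
using System.Text;

public static class LoopExample
{
    // Hypothetical helper: copies each character unless it equals the one right before it.
    public static string CollapseRuns(string text)
    {
        if (string.IsNullOrEmpty(text))
            return text;

        // Strings are immutable, so the result is built in a StringBuilder.
        var sb = new StringBuilder(text.Length);
        sb.Append(text[0]);
        for (int i = 1; i < text.Length; i++)
        {
            if (text[i] != text[i - 1])
                sb.Append(text[i]);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(CollapseRuns("aaaabbbccc")); // abc
        Console.WriteLine(CollapseRuns("aabbccaa"));   // abca
    }
}
```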
Use the Regex class:
Regex.Replace( "aaabbcc", @"(\w)\1+", "$1" )
will result in
abc
For more information, look here.
EDIT:
Since you edited your question:
Regex.Replace( "acaabbccbaa", @"(\w)(?<=\1.+)", "" )
will result in
acb
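This pattern uses a positive lookbehind to identify chars that have already occurred earlier in the string and replaces them with "". Note that \w only matches word characters (letters, digits and underscore), so other characters are left untouched.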
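To compare the two patterns above side by side, here is a small sketch. The class and method names (RegexExample, CollapseRuns, RemoveRepeats) are only illustrative; the patterns are the ones from this answer. One assumption to keep in mind: `.` does not match a newline by default, so the second pattern is meant for single-line input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexExample
{
    // Collapses each run of the same word character into a single character.
    public static string CollapseRuns(string s)
    {
        return Regex.Replace(s, @"(\w)\1+", "$1");
    }

    // Removes every word character that already appeared earlier in the string.
    public static string RemoveRepeats(string s)
    {
        return Regex.Replace(s, @"(\w)(?<=\1.+)", "");
    }

    public static void Main()
    {
        Console.WriteLine(CollapseRuns("aabbccaa"));  // abca
        Console.WriteLine(RemoveRepeats("aabbccaa")); // abc
    }
}
```

Only the second pattern matches the aabbccaa -> abc example from the question, because the first one looks at adjacent characters only.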
You could use a HashSet and build an extension method for this:
// Extension methods must be declared inside a static class.
static string RemoveDuplicateChars(this string s)
{
    HashSet<char> set = new HashSet<char>();
    StringBuilder sb = new StringBuilder(s.Length);
    foreach (var c in s)
    {
        // Add returns false if the char is already in the set.
        if (set.Add(c))
        {
            sb.Append(c);
        }
    }
    return sb.ToString();
}
or using Enumerable.Distinct, simply:
Console.WriteLine(new string("aaabbbccaddcacc".Distinct().ToArray()));
Does something like this solve your problem?
string distinct = new string("aaaabbbccc".Distinct().ToArray());
It's a little ugly, but you could wrap it into an extension method:
public static string UniqueChars(this string original)
{
    return new string(original.Distinct().ToArray());
}
Hope this helps.
Since you specifically asked about removing "similar" characters, you may want to try something like this:
using System.Globalization;
....
private string RemoveDuplicates(string text)
{
    StringBuilder result = new StringBuilder();
    string previousTextElement = string.Empty;
    TextElementEnumerator textElementEnumerator = StringInfo.GetTextElementEnumerator(text);
    textElementEnumerator.Reset();
    while (textElementEnumerator.MoveNext())
    {
        string textElement = (string)textElementEnumerator.Current;
        if (string.Compare(previousTextElement, textElement, CultureInfo.InvariantCulture,
                CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace |
                CompareOptions.IgnoreWidth) != 0)
        {
            result.Append(textElement);
            previousTextElement = textElement;
        }
    }
    return result.ToString();
}