ansaurus

Question

how to remove duplicate characters in a string?

Answer 1

+4 A:

It will do the job

string removedupes(string s)
{
    string newString = string.Empty;
    List<char> found = new List<char>();
    foreach(char c in s)
    {
       if(found.Contains(c))
          continue;

       newString+=c.ToString();
       found.Add(c);
    }
    return newString;
}

I should note this is criminally inefficient.

I think I was delirious on first revision.

Quintin Robinson 2009-02-26 01:47:26

am I guessing correctly that you intentionally left the inefficiencies as an exercise to the reader, or do you want suggestions on making this work faster?

SquareCog 2009-02-26 01:49:20

Indeed you are correct, if it is homework then the OP can filter through and create one that isn't terrible. It also serves as a baseline for understanding what is happening. I don't need suggestions on improvements, thanks though.

Quintin Robinson 2009-02-26 01:56:48

Answer 2

+4 A:

For arbitrary length strings of byte-sized characters (not for wide characters or other encodings), I would use a lookup table, one bit per character (32 bytes for a 256-bit table). Loop through your string, only output characters that don't have their bits turned on, then turn the bit on for that character.

string removedupes(string s)
{
    string t;
    byte[] found = new byte[256];
    foreach(char c in s)
    {
        if(!found[c]) {
            t.Append(c);
            found[c]=1;
        }
    }
    return t;
}

I am not good with C#, so I don't know the right way to use a bitfield instead of a byte array.

If you know that your strings are going to be very short, then other approaches would offer better memory usage and/or speed.

Sparr 2009-02-26 01:48:26

I think this will be significantly faster than Quintin Robinson's approach, but will use significantly more memory for short strings.

Sparr 2009-02-26 01:54:39

But significantly less memory for medium or long strings, if a bit array is used.

Sparr 2009-02-26 03:03:45

Your heart is in the right place, but your logic is a bit off. It should be if(found[c]){t+=c; found[c] = 1;} No else block needed. Your current code won't do the trick.

BFree 2009-02-26 03:22:29

Answer 3

+3 A:

It sounds like homework to me, so I'm just going to describe at a high level.

Loop over the string, examining each character
Check if you've seen the character before
- if you have, remove it from the string
- if you haven't, note that you've now seen that character

Chad Birch 2009-02-26 01:49:11

Answer 4

A:

I think this article on .NET perls covers it very well.

-John

John T 2009-02-26 01:50:44

Answer 5

+6 A:

A Linq approach:

public static string RemoveDuplicates(string input)
{
    return new string(input.ToCharArray().Distinct().ToArray());
}

CMS 2009-02-26 02:14:51

@CMS ahh simplicity

msarchet 2010-07-24 17:20:57

Answer 6

A:

char *remove_duplicates(char *str) { char *str1, *str2;

if(!str)
    return str;

str1 = str2 = str;

while(*str2)            
{   
    if(strchr(str, *str2)<str2)
    {
        str2++;
        continue;
    }

    *str1++ = *str2++;      
}
*str1 = '\0';

return  str;

}

mbm 2010-07-24 17:12:30

Answer 7

A:

char* removeDups(const char* str)
{
 char* new_str = (char*)malloc(256*sizeof(char));
 int i,j,current_pos = 0,len_of_new_str;
 new_str[0]='\0';

 for(i=0;i<strlen(str);i++)
{
 len_of_new_str = strlen(new_str);
for(j=0;j<len_of_new_str && new_str[j]!=str[i];j++)
   ;
  if(j==len_of_new_str)
   {
     new_str[len_of_new_str] = str[i];
     new_str[len_of_new_str+1] = '\0';
   }
}
  return new_str;
}

Hope this helps

Tracy 2010-10-24 15:54:41

ansaurus

tags:

views:

answers:

how to remove duplicate characters in a string?

related questions