ansaurus

Question

Write a function that compares two strings and returns a third string containing only the letters that appear in both

Answer 1

A:

Looks OK to me, but for god sakes man - use some descriptive variable names!!

JustAPleb 2010-01-15 10:12:20

Updated variable name...Thanks

Pritam Karmakar 2010-01-15 10:16:03

Answer 2

+6 A:

That's fine for a first approach, but you can make a few improvements, and there's a small error.

If b contains a character in a that's already in c, you'll repeat it.
To avoid repeats, you might consider using a Set to store the characters, since a Set won't have repeats.
Assembling strings with += concatenation is usually inefficient; consider using a StringBuilder or an analogous string-assembly class.
Your variable names aren't very descriptive.
If a or b are empty, you don't have to do any work at all! Just return an empty string.

You can think about some more sophisticated improvements, too, by imagining how your algorithm scales if you started to use huge strings. For example, one approach might be that if one string is much longer than the other, you can sort the longer one and remove duplicates. Then you can do a binary search on the characters of the shorter string very quickly.

John Feminella 2010-01-15 10:13:57

I have updated my code as per your comments...thak u so much

Pritam Karmakar 2010-01-15 10:30:44

an example in LINQ could be also interesting!

serhio 2010-01-15 10:36:07

hi, to upgrade your performance, you can even check the smallest string and loop through his chars. that way you'll save the number of loops.

Sem Dendoncker 2010-01-15 10:36:20

To scale for huge strings, using a hashset or an array lookup is better IMO.

ChrisW 2010-01-15 10:44:07

HashSet, not Set - don't get Java on us :P

Chris S 2010-01-15 10:48:22

Also I'm guessing the task isn't to compare kb of string data, so performance is premature

Chris S 2010-01-15 10:49:35

@ChrisS » It's a homework problem, so I was trying to encourage a little prodding and self-introspection. :) (Also, there wasn't a C# tag initially and it wasn't clear what language it was in.)

John Feminella 2010-01-15 10:51:02

Answer 3

+1 A:

Seems fine. You could do a pair of optimizations depending on the language you are using:

you can collect the letters of b into some ordered structure (to make the search for it faster), and if it doesn't repeat... better (a set).
you can use some kind of StrignBuilder (if it's Java or .Net) to not recreate a string with each concatenation inside the loop

Anyway this are optimizations that are good for large, large strings... so, I don't know if their apropiate to your use, or to the intended homework.

helios 2010-01-15 10:14:08

helios 2010-01-15 10:15:15

Answer 4

+1 A:

Depends on how long are the input strings, what is the letter and how output should look (duplicates) there are a few other approaches.

For example:

If letters are just [A-Z] characters and each letter should appear only once in output string you can build separate string (or table of chars) 'ABC...XZ' (let's call it letters) and run for each loop over letters and check both input strings against each letter from letters.

This gives 26 loop iterations and no more then 52 calls of Contains() method for each input strings -- no matter how long input strings are.

Grzegorz Gierlik 2010-01-15 10:24:38

However each Contains call may still be expensive, because you're invoking it on a lengthy string value.

ChrisW 2010-01-15 12:40:24

Answer 5

+13 A:

For an alternative solution, you can view the strings as enumerables and use Intersect() like this:

    public static string Common(string first, string second)
    {
        return new string((first.Intersect(second)).ToArray());
    }

Brian Rasmussen 2010-01-15 10:29:11

+1. It never occurred to me you could use a string directly as an enumerable like so without calling .ToCharArray()

Winston Smith 2010-01-15 11:13:55

Answer 6

+1 A:

To improve on John Feminella's last suggestion, quicker than a binary search (for any sufficiently long string) would be a lookup in a hashset; or a lookup in a 256-element array of booleans, if those are ASCII or UTF8 characters instead of UTF16 characters.

Instantiate an empty hashset, or an empty array of booleans; name this variable found.
For each char in the first string, either add the char to the hashset (but beware of duplicate characters in the first string), or set the corresponding element in the found array to true; this will take linear O(n) time.
For each char in the second string, test whether the char exists in the hashset or whether the corresponding element in the 'found' array is true: if found then add the character to the return string, and also remove the character from the hashset or the clear the boolean element in the array, so that it won't be found again (to beware of duplicate characters in the second string); this will take linear O(n) time.

ChrisW 2010-01-15 10:39:27

Answer 7

A:

void get_common_letters_as_string( char *string_one, char *string_two, char **string_common )
{
  char
    *a,
    *b,
    *c;

  *string_common = malloc( strlen(string_one) + strlen(string_two) + 1 );

  qsort( string_one, strlen(string_one), sizeof(char), compare_function );
  qsort( string_two, strlen(string_two), sizeof(char), compare_function );

  a = string_one;
  b = string_two;
  c = *string_common;

  // BX : loop over all string_one    
  while( *a != '\0' )
  {
    // BX : if current char of one and two match, copy to common
    if( *a == *b )
      *c++ = *a;

    // BX : now loop over all matching chars in string two
    while( *a == *b and *b != '\0' )
      b++;

    // BX : now move along one char in string one
    a++;
  }

  *c = '\0';

  return;
}

Blank Xavier 2010-01-15 10:47:43

-1 for just giving source code in reply to a homework question. If the OP uses it then he'd be plagiarising; and the teacher might be searching the internet for students' answers, to find whether they're plagiarising. It's better IMO to answer using pseudo-code.

ChrisW 2010-01-15 10:52:58

On the other hand, actual C code teaches people C in a way psuedo code never can. Is our main concern teaching people, or is our main concern merely catching them if they cheat?

Blank Xavier 2010-01-15 10:55:01

Yes. Still, to learn it they need to read and write it, and if you write it for them then they won't/can't. If you were concerned about his style (instead of his algorithm), you could have pointed to some existing code with a better style. Sorry to criticize your answer, I just think there's is (or ought to be) this special/unusual consideration when the question is a homework question.

ChrisW 2010-01-15 11:01:33

In general I agree with you; but in this case, the guy already provided a working solution.

Blank Xavier 2010-01-15 13:38:32

Answer 8

A:

Using LINQ:

a.ToCharArray().Intersect<char>(b.ToCharArray())

However this is case sensitive.

Kniganapolke 2010-01-15 11:06:45

ansaurus

tags:

views:

answers:

Write a function that compares two strings and returns a third string containing only the letters that appear in both

related questions