views:

443

answers:

8

I was asked this one when I applied for my current job.

+6  A: 

Sort each element (removing whitespace) and compare against the previous. If they are all the same, they're all anagrams.

Cody Brocious
Remove punctuation as well
dmckee
punctuation I can understand, for words with an apostrohphe, but spaces? I don't know many words with spaces in them... I think for the sake of a simple exercise like this you can safely assume that the words contain only letters.
ninesided
When presented as puzzles, anagrams are often spread over whole phrases. So you write the routine to be robust.
dmckee
+18  A: 

Put all the letters in alphabetical order in the string (sorting algorithm) and then compare the resulting string.

Adam Davis
Yep, that's pretty much what I came up with... Got the job too!
jqs
There's an alternative algorithm involving counting the characters in each word. It's faster, but it'll be more expensive for Unicode words.
Franci Penov
I had considered that, but then you have to compare the resulting letter count arrays, hashes, or otherwise - for short anagrams my algorithm is likely faster, but for larger anagrams chances are good yours would be faster. Would be an interesting test...
Adam Davis
You wouldn't need to compare all the maps - just get one map for the first word and then for all the others see if they _match_ the same map, i.e. iterate their letters and decrement them in a copy of the map. In the end you have found all the letters and the map counts must be zeroed.
Daniel Daranas
I understand what you're saying. But the map has 26 positions. Once you've done the increment/decrement you have to go through 26 comparisons to 0 to verify the map matches. I'd have to do more research to find out which method requires more comparisons - although comparisons to 0 are cheaper...
Adam Davis
I think that method may be faster than a sorting and string comparisons as I consider it more... Maybe I'll have to do those tests...
Adam Davis
Especially if you cast the array as integers before comparing to 0 - then you've only got 7 integer comparisons, or 4 if you're on a 64 bit machine.
Adam Davis
A: 

For each word, compute a hash of the letters of that word (sorted however you like, probably ascending alphabetically). For each word with the same hash, they're an anagram.

Randolpho
This is very risky. For any hash algorithm, there may exist different words with like hashes.
Paul Brinkley
While that's technically true, it's very unlikely. If by "words" he meant dictionary words, any half-decent hashing algorithm will be perfect. TRWTF is that this will take ages, because instead of comparing words, you're hashing, then comparing hashes. Now you need an algorithm for the latter :)
Robert Grant
That's true, but it's extremely unlikely.
Randolpho
Very true, Robert. I dunno what I was thinking when I suggested hashing. Actually, I do: I was thinking of using a hashtable to compartmentalize the words with the sorted character string as the key. Sometimes I over complicate things. :D
Randolpho
+1  A: 

Sort the letters and compare (letter by letter, string compare, ...) is the first things that comes to mind.

Austin Salonen
+2  A: 

The following algorithm should work:

  1. Sort the letters in each word.

  2. Sort the sorted lists of letters in each list.

  3. Compare each element in each list for equality.

Jekke
Once the lists of letters are sorted you could compare the first to the last instead of comparing each one. If the first is the same as the last, then they are all equal.
Eric Ness
+3  A: 

Interestingly enough, Eric Lippert's Fabulous Adventures In Coding Blog dealt with a variation on this very problem on February 4, 2009 in this post.

Grant Wagner
+7  A: 

Good thing we all live in the C# reality of in-place sorting of short words on quad core machines with oozles of memory. :-)

However, if you happen to be memory constrained and can't touch the original data and you know that those words contain characters from the lower half of the ASCII table, you could go for a different algorithm that counts the occurrence of each letter in each word instead of sorting.

You could also opt for that algorithm if you want to do it in O(N) and don't care about the memory usage (a counter for each Unicode char can be quite expensive).

Franci Penov
A: 
  1. compare length (if not equal, not a chance)
  2. make a bit vector of the length of the strings
  3. for each char in the first string find occurrences of it in the second
  4. set the bit for the first unset occurrence
  5. if you can find one stop with fail
BCS