What is an easy way to tell if a list of words are anagrams of each other?

tags:

interview-questions

views:

443

answers:

+5 Q:

What is an easy way to tell if a list of words are anagrams of each other?

I was asked this one when I applied for my current job.

+6 A:

Sort each element (removing whitespace) and compare against the previous. If they are all the same, they're all anagrams.

Cody Brocious 2009-02-06 20:54:51

Remove punctuation as well

dmckee 2009-02-06 20:57:50

punctuation I can understand, for words with an apostrohphe, but spaces? I don't know many words with spaces in them... I think for the sake of a simple exercise like this you can safely assume that the words contain only letters.

ninesided 2009-02-06 21:02:53

When presented as puzzles, anagrams are often spread over whole phrases. So you write the routine to be robust.

dmckee 2009-02-07 00:45:11

+18 A:

Put all the letters in alphabetical order in the string (sorting algorithm) and then compare the resulting string.

Adam Davis 2009-02-06 20:54:56

Yep, that's pretty much what I came up with... Got the job too!

jqs 2009-02-06 21:03:23

There's an alternative algorithm involving counting the characters in each word. It's faster, but it'll be more expensive for Unicode words.

Franci Penov 2009-02-06 21:42:26

I had considered that, but then you have to compare the resulting letter count arrays, hashes, or otherwise - for short anagrams my algorithm is likely faster, but for larger anagrams chances are good yours would be faster. Would be an interesting test...

Adam Davis 2009-02-06 21:50:55

You wouldn't need to compare all the maps - just get one map for the first word and then for all the others see if they _match_ the same map, i.e. iterate their letters and decrement them in a copy of the map. In the end you have found all the letters and the map counts must be zeroed.

Daniel Daranas 2009-02-10 13:28:34

I understand what you're saying. But the map has 26 positions. Once you've done the increment/decrement you have to go through 26 comparisons to 0 to verify the map matches. I'd have to do more research to find out which method requires more comparisons - although comparisons to 0 are cheaper...

Adam Davis 2009-02-10 16:39:39

I think that method may be faster than a sorting and string comparisons as I consider it more... Maybe I'll have to do those tests...

Adam Davis 2009-02-10 16:42:38

Especially if you cast the array as integers before comparing to 0 - then you've only got 7 integer comparisons, or 4 if you're on a 64 bit machine.

Adam Davis 2009-02-10 16:44:33

For each word, compute a hash of the letters of that word (sorted however you like, probably ascending alphabetically). For each word with the same hash, they're an anagram.

Randolpho 2009-02-06 20:55:25

This is very risky. For any hash algorithm, there may exist different words with like hashes.

Paul Brinkley 2009-02-06 21:11:11

While that's technically true, it's very unlikely. If by "words" he meant dictionary words, any half-decent hashing algorithm will be perfect. TRWTF is that this will take ages, because instead of comparing words, you're hashing, then comparing hashes. Now you need an algorithm for the latter :)

Robert Grant 2009-02-06 21:59:30

That's true, but it's extremely unlikely.

Randolpho 2009-02-06 22:00:30

Very true, Robert. I dunno what I was thinking when I suggested hashing. Actually, I do: I was thinking of using a hashtable to compartmentalize the words with the sorted character string as the key. Sometimes I over complicate things. :D

Randolpho 2009-02-06 22:02:20

+1 A:

Sort the letters and compare (letter by letter, string compare, ...) is the first things that comes to mind.

Austin Salonen 2009-02-06 20:55:42

+2 A:

The following algorithm should work:

Sort the letters in each word.
Sort the sorted lists of letters in each list.
Compare each element in each list for equality.

Jekke 2009-02-06 20:57:15

Once the lists of letters are sorted you could compare the first to the last instead of comparing each one. If the first is the same as the last, then they are all equal.

Eric Ness 2009-02-06 22:03:30

+3 A:

Interestingly enough, Eric Lippert's Fabulous Adventures In Coding Blog dealt with a variation on this very problem on February 4, 2009 in this post.

Grant Wagner 2009-02-06 21:02:00

+7 A:

Good thing we all live in the C# reality of in-place sorting of short words on quad core machines with oozles of memory. :-)

However, if you happen to be memory constrained and can't touch the original data and you know that those words contain characters from the lower half of the ASCII table, you could go for a different algorithm that counts the occurrence of each letter in each word instead of sorting.

You could also opt for that algorithm if you want to do it in O(N) and don't care about the memory usage (a counter for each Unicode char can be quite expensive).

Franci Penov 2009-02-06 21:30:04

compare length (if not equal, not a chance)
make a bit vector of the length of the strings
for each char in the first string find occurrences of it in the second
set the bit for the first unset occurrence
if you can find one stop with fail

BCS 2009-02-18 08:10:41

ansaurus

tags:

views:

answers:

What is an easy way to tell if a list of words are anagrams of each other?

related questions