tags:

views:

134

answers:

5

Does anyone have any ideas, links or algorithms for solving an anagram with PHP and MySQL. If anyone has a good English Dictionary that would be appreciated also.

I am looking to achieve something similar to this:

http://www.ssynth.co.uk/~gay/anagram.html

The guy explains how he did it here http://www.ssynth.co.uk/~gay/anagabout.html ... From what he is saying a language like PHP might not be suitable... Will it be a problem?

Thanks..

A: 

From your link...

store all the words in a tree structure

Databases are very bad at storing hierarchical data like this, so I wouldn't recommend MySQL. You may be able to do some "clever" things with indexes and LIKE clauses, but I would expect that to be rather kludgey.

PHP has everything you need to do the coding for this, but there are probably better alternatives. Perl is known for its ability to do text manipulation. I'm not sure about scripting languages like Python or Ruby.

haydenmuhl
What if I had a table with column's 'word' (e.g cat), 'length' (e.g 3) and A-Z (e.g c=1 a=1 t=1). That way the anagram 'atc' I could do a query like 'SELECT word FROM dictionary WHERE c <= 1 AND a <= 1 AND t <= 1 AND length <= 3' and it would return cat...
Pablo
What's wrong with storing hierarchical data in a database? It forces you to plan out how to represent the data in a manner where it's searchable.
symcbean
@Pablo, I like the solution it's creative but you are going to be wasting a lot of storage.
Tom Gullen
@symcbean, The problem with storing hierarchical data in a relational database is that querying that data to an arbitrary depth is difficult and inefficient.
haydenmuhl
@haydenmuhl: Not it's not. Some DBMS explicitly support the 'simple' relational hierarchy model (see my post and http://www.oracle.com/technology/oramag/oracle/05-may/o35asktom.html) very efficiently to an arbitary depth. And there are lots of other ways of modeling a tree in a relational database, e.g. http://www.sitepoint.com/article/hierarchical-data-database/
symcbean
@symcbean, The OP is using MySQL, not Oracle.
haydenmuhl
A: 

From what he is saying a language like PHP might not be suitable

How do you get that from the details he's published?

If anyone has a good English Dictionary...

There's one in the pspell extension although given the nature of the algorithm presented it may be more efficient to push most of the logic (and the dictionary) into the database - IIRC pspell uses a custom format albeit a documented one

symcbean
A: 

What exactly looks wrong in the algorithm Pablo suggested? I was going to suggest the same ;)

What if I had a table with column's 'word' (e.g cat), 'length' (e.g 3) and A-Z (e.g c=1 a=1 t=1). That way the anagram 'atc' I could do a query like 'SELECT word FROM dictionary WHERE c <= 1 AND a <= 1 AND t <= 1 AND length <= 3' and it would return cat

Redirect upvoting (if any) to his comment please.

Also there is a similiar question: http://stackoverflow.com/questions/55210/algorithm-to-generate-anagrams

Also you need to check Google: http://www.google.ru/search?q=anagram+solving+algorithms&amp;ie=utf-8&amp;oe=utf-8&amp;aq=t&amp;rls=org.mozilla:ru:official&amp;client=firefox

FractalizeR
A: 

I would have a table {letter} {word} {count} and for each word, store it along with each of it's component letters, and how many times the letter appears in the word. Then a search for an anagrams starts with searching for a set of letters, and finding the intersection between the sets of words each letter is associated with. For example

Input: rat Table:

T tar 1
A tar 1
R tar 1
C cat 1
A cat 1
T cat 1
C car 1
A car 1
R car 1

Results for each letter

R car tar
A cat car tar
T cat tar

And then you join each query with an intersection!

Breton
A: 

You could use a Trie datastructure to loop through every combination of character sequence (and obviously stop the current node if there are no subnodes).

This would produce a full list of all possible solutions in a fairly efficient way. With a limited starting character set I think it would work well.

At each node you can select the count of matching words, and when it is suitably small enough, load that into an array for comparison so you don't need to run a million selects.

Tom Gullen