I am trying to create a query using a db on OpenOffice where a string is entered in the query, and all permutations of the string are searched in the database and the matches are displayed. My database has fields for a word and its definition, so if I am looking for GOOD I will get its definition as well as the definition for DOG.
Basically, you can't easily do permutations in single SQL statement. You can easily do them in another language though, for example here's how to do it in C#: http://msdn.microsoft.com/en-us/magazine/cc163513.aspx
You'll need a third column as well. In this column you'll have the word - but with the letters sorted in alphabetical order. For example, you'll have the word APPLE and in the next column the word AELPP. You would sort the word your looking for - and run a some SQL code like
WHERE sorted_words = 'my_sorted_word'
for the word apple, you would get something like this:
unsorted sorted
AELPP APPLE
AELPP PEPLA
AELPP APPEL
Now, you also wanted - correct me if I'm wrong, but you want all the words that can be made with **any combination ** of the letters, meaning APPLE also returns words like LEAP and PEA.
To do this, you would have to use some programming language - you would have to write a function that preformed the above recursively, for example - for the word AELLP you have
- ELLP
- ALLP
- AELP and so forth.. (each time subtracting one letter in every combination, and then two letters in every combination possible ect..)
Ok, corrected version that I think handles all situations. This will work in MS SQL Server, so you may need to adjust it for your RDBMS as far as using the local table and the REPLICATE function. It assumes a passed parameter called @search_string. Also, since it's using VARCHAR instead of NVARCHAR, if you're using extended characters be sure to change that.
One last point that I'm just thinking of now... it will allow duplication of letters. For example, "GOOD" would find "DODO" even though there is only one "D" in "GOOD". It will NOT find words of greater length than your original word though. In other words, while it would find "DODO", it wouldn't find "DODODO". Maybe this will give you a starting point to work from though depending on your exact requirements.
DECLARE @search_table TABLE (search_string VARCHAR(4000))
DECLARE @i INT
SET @i = 1
WHILE (@i <= LEN(@search_string))
BEGIN
INSERT INTO @search_table (search_string)
VALUES (REPLICATE('[' + @search_string + ']', @i)
SET @i = @i + 1
END
SELECT
word,
definition
FROM
My_Words
INNER JOIN @search_table ST ON W.word LIKE ST.search_string
The original query before my edit, just to have it here:
SELECT
word,
definition
FROM
My_Words
WHERE
word LIKE REPLICATE('[' + @search_string + ']', LEN(@search_string))
Suppose you have a auxiliary Numbers table with integer numbers.
DECLARE @s VARCHAR(5);
SET @s = 'ABCDE';
WITH Subsets AS (
SELECT CAST(SUBSTRING(@s, Number, 1) AS VARCHAR(5)) AS Token,
CAST('.'+CAST(Number AS CHAR(1))+'.' AS VARCHAR(11)) AS Permutation,
CAST(1 AS INT) AS Iteration
FROM dbo.Numbers WHERE Number BETWEEN 1 AND 5
UNION ALL
SELECT CAST(Token+SUBSTRING(@s, Number, 1) AS VARCHAR(5)) AS Token,
CAST(Permutation+CAST(Number AS CHAR(1))+'.' AS VARCHAR(11)) AS
Permutation,
s.Iteration + 1 AS Iteration
FROM Subsets s JOIN dbo.Numbers n ON s.Permutation NOT LIKE
'%.'+CAST(Number AS CHAR(1))+'.%' AND s.Iteration < 5 AND Number
BETWEEN 1 AND 5
--AND s.Iteration = (SELECT MAX(Iteration) FROM Subsets)
)
SELECT * FROM Subsets
WHERE Iteration = 5
ORDER BY Permutation
Token Permutation Iteration
----- ----------- -----------
ABCDE .1.2.3.4.5. 5
ABCED .1.2.3.5.4. 5
ABDCE .1.2.4.3.5. 5
(snip)
EDBCA .5.4.2.3.1. 5
EDCAB .5.4.3.1.2. 5
EDCBA .5.4.3.2.1. 5
(120 row(s) affected)