Hello community,
I'm learning to program with Ruby on Rails and I've reached my hardest task yet. I have two tables, each containing a list of business categories (I.E. Thai Food Restaurant, Plumber, Office Supply Store). These two tables come from two different APIs where I'm essentially acting as the middle-man between them. They list more or less the same types of categories, but often times they will phrase it differently (I.E. auto body repair and painting VS automobile body repair and painting).
My first goal was to design the model for this task. I decided upon a "many to many" and tested it by manually mapping two rows. Done. Tested. Awesome.
My second goal is to write an algorithm to match rows from the two tables to each other, giving precedence based on similarity. I'm guessing a lot of the work can be done in MySQL. Here is my pseudo-code so far:
for each row in table 1
split phrase up into words by spaces
SELECT name, id FROM joined_table
SELECT name, id FROM table2 AS word1 WHERE name LIKE '% word1 %'
SELECT name, id FROM table2 AS word2 WHERE name LIKE '% word2 %'
SELECT name, id FROM table2 AS word3 WHERE name LIKE '% word3 %'
JOIN word1, word2, word3 WHERE word1.id == word2.id
OR word2.id == word3.id
order by count of matches of each word
insert relationships into map table
end
I've never designed a search algorithm before, so any help you could lend is much appreciated. I'm having fun figuring this out, but I thought I'd reach out and get some advice from the pro's out there.
Cheers!
Update: A co-worker recommended I check out a website called mechanical turk, which turned out to be the most cost-effective way of mapping the categories. All I had to do was build a simple form and it wound costing approximately $3.00 per thousand matches.