I am trying to return the top ten entities that are associated with articles after doing a search on the articles.
Lets say I have the following set up:
CREATE TABLE IF NOT EXISTS `articles` (
`id` char(36) NOT NULL,
`html_content` text NOT NULL,
`featured` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
FULLTEXT KEY `html_content` (`html_content`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `entities` (
`id` char(36) NOT NULL,
`title` varchar(255) NOT NULL,
`article_count` int(11) NOT NULL,
PRIMARY KEY (`id`),
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
// Pivot table as the articles -> entities is a many-to-many Relationship
CREATE TABLE IF NOT EXISTS `articles_entities` (
`id` char(36) NOT NULL,
`article_id` char(36) NOT NULL,
`entity_id` char(36) NOT NULL,
PRIMARY KEY (`id`),
KEY `article_id` (`article_id`),
KEY `entity_id` (`entity_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
A little background, The search returns approx 15K articles and growing daily, each article can have upwards of 20 entities associated with it. Everything I try causes the script to take ages to complete.
Lets say I am searching for all articles that have the word 'google' in them, and I want to return the 10 most associated entities. How would I go about this?
I am using cakePHP, but any solution is ok.
Thanks in advance!