I have a list of tags stored in a StringListProperty().
The datastore contains around 1 million entries, and each entry has around 20 distinct values in its list.
e.g.
a = [ 'ab', 'bc', 'ca', 'x', ....]
b = ['x', 'm', 'a', .... ]
I am using Google App Engine, so I am constrained in how I can run batch jobs (each request is allowed only 30 seconds).
Here is my question:
Given a list a, I want to find all lists that share the most elements with a, in descending order of the number of common elements.
How can I do this with App Engine?
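To make the desired result concrete, here is a brute-force, in-memory sketch in Python of the ranking I am after. The function name `rank_by_common_tags` and the sample `entries` dict are made up for illustration and are not part of my actual model; scanning every entry like this is exactly what will not fit in the 30-second limit with 1 million entries.

```python
def rank_by_common_tags(query_tags, entries):
    """Return (key, common_count) pairs, most shared tags first.

    `entries` is a hypothetical dict mapping an entry key to its tag list.
    """
    query = set(query_tags)
    scored = []
    for key, tags in entries.items():
        common = len(query & set(tags))  # number of tags shared with the query
        if common:                       # skip entries with nothing in common
            scored.append((key, common))
    # Descending by shared-tag count; ties broken by key for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Illustrative data only.
entries = {
    'url1': ['ab', 'bc', 'x'],
    'url2': ['x', 'm', 'a'],
    'url3': ['q'],
}
a = ['ab', 'bc', 'ca', 'x']
ranked = rank_by_common_tags(a, entries)
print(ranked)  # [('url1', 3), ('url2', 1)]
```

What I need is a way to get the same ordering out of the datastore without loading every entry.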
Update:
I am storing tags for URLs, e.g. [shopping, shop, social-shopping, ....]
Basically, I want to find URLs with similar content by
(1) matching their tags and (2) looking at the frequency of tags per URL to decide which URLs are "more" closely related.