similarity

Efficiently updating cosine similarity scores

My iPhone application is using a SQLite database with the following schema: items(id, name, ...) -> this table contains 50 records tags(id, name) -> this table contains 50 records item_tags(id, item_id, tag_id, user_id) similarities(id, item1_id, item2_id, score) The items, tags, item_tags and similarities tables are populated with p...

Calculating similarity between drawn lines

I need an algorithm to calculate, numerically, the degree of similarity between two drawn lines. The lines are drawn using a mouse, and are stored as a set of cartesian coordinates before being filtered and smoothed using separate algorithms. For example, within the following diagram: Lines A and B are clearly similar, but B and C are...

Find duplicate PDFs

I'm looking for a utility that will help me find duplicate PDFs. The problem: I have a 1000s of PDF files. Some are duplicates. They are not easy to detect due differing files names and small differences in file size. Is there a utility/algorithm/library that can help me find the duplicates or show me files that are very similar (or...

Measuring the similarity between two binary files???

Hi Guys, I have two G729 encoded files, i took the pcm version of them. i want to measure the similarity between these two files. these files are binary files so how one can measure the similarity between binary files, i wrote a code in C that takes patterns from the first one and search for similar ones in the second one, but i want to ...

Searching for similar groupings; including diff and score (ie. Similar Recipes)

I'm trying to find the best way to determine how similar a group of items (in this example; ingredients in a guacamole recipe) is to all groups of items (recipes in a table; linked to another table of ingredients). For instance; I have the following guacamole recipe: 3 Avocados 1 Vine-Ripened Tomatoes 1 Red Onion 3 Jalapenos 1 Sea Salt...

similarity metric to compare 2 sets of 2D points?

I am using a software tool SentiWordNet that can map an English word to a pair of numbers showing how positive and negative the word is. The pair of numbers p and n satisfies the 3 conditions. p ≥ 0 n ≥ 0 p + n ≤ 1 One can process each word of, say, a movie review and then find out the overall positiveness/negativeness. I was t...

How to make colours on one screen look the same as another

Given two seperate computers, how could one ensure that colours are being projected roughly the same on each screen? IE, one screen might have 50% brightness more than another, so colours appear duller on one screen. One artist on one computer might be seeing the pictures differently to another, it's important they are seeing the same ...

passing text through a dictionary in Python

I currently have python code that compares two texts using the cosine similarity measure. I got the code here. What I want to do is take the two texts and pass them through a dictionary (not a python dictionary, just a dictionary of words) first before calculating the similarity measure. The dictionary will just be a list of words, alt...

n-gram sentence similarity with cosine similarity measurement

Hi all, I have been working on a project about sentence similarity. I know it has been asked many times in SO, but I just want to know if my problem can be accomplished by the method I use by the way that I am doing it, or I should change my approach to the problem. Roughly speaking, the system is supposed to split all sentences of an ar...