nearest-neighbor

How to match Tagged items based on "similarity"

I have a real question. I have a database with the schema as follows: item (id, description, other junk); tag (id, name); item2tag (item_id, tag_id, count). Basically, each item is tagged as up to 10 things, with varying counts. There are 50,000 items and 50,000 tags, and about 500,000 entries in item2tag. I'd like to find, given one...

SQL efficient nearest neighbour query

I'm having trouble coming up with an efficient SQL query to handle the following situation: Assume we have a table with two columns, groupId : int and value : float. The table is huge (several million rows). There are a varying number of "values" per "groupId", say something between 100 and 50,000. All float values are greater or equal t...

Algorithm for similarity (of topic) of news items

I want to determine the similarity of the content of two news items, similar to Google news but different in the sense that I want to be able to determine what the basic topics are and then determine which topics are related. So if an article was about Saddam Hussein, then the algorithm might recommend something about Donald Rumsfeld's business...

Nearest neighbor on a unit sphere, with roughly evenly distributed points.

I'm writing a program that implements SCVT (Spherical Centroidal Voronoi Tessellation). I start with a set of points distributed over the unit sphere (I have an option for random points or an equal-area spiral). There will be from several hundred to maybe 64K points. I then need to produce probably several million random sample point...

How to find the previous and next record using a single query in MySQL?

Hello, I have a database, and I want to find out the previous and next record ordered by ID, using a single query. I tried to do a union but that does not work. :( SELECT * FROM table WHERE `id` > 1556 LIMIT 1 UNION SELECT * FROM table WHERE `id` <1556 ORDER BY `product_id` LIMIT 1 Any ideas? Thanks a lot. ...
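One possible approach, sketched here rather than taken from the question, is to give each half of the UNION its own ORDER BY and LIMIT. The example uses Python's sqlite3 module as a stand-in for MySQL, and the `products` table, its columns, and the `prev_and_next` helper are all hypothetical. In SQLite each half has to be wrapped in a subquery; MySQL accepts parenthesized SELECT statements on either side of UNION instead.

```python
import sqlite3

def prev_and_next(conn, current_id):
    """Return the rows just before and just after current_id, ordered by id."""
    sql = """
        SELECT * FROM (SELECT id, name FROM products
                       WHERE id < ? ORDER BY id DESC LIMIT 1)
        UNION ALL
        SELECT * FROM (SELECT id, name FROM products
                       WHERE id > ? ORDER BY id ASC LIMIT 1)
    """
    return conn.execute(sql, (current_id, current_id)).fetchall()

# Hypothetical sample data, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1550, "a"), (1556, "b"), (1560, "c"), (1600, "d")],
)
print(prev_and_next(conn, 1556))
```

The "previous" half sorts descending so that LIMIT 1 picks the closest smaller id rather than the smallest one. At either end of the table only one row comes back.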

getting nearest result within IFNULL

I've got a maintenance script which dumps a bunch of data from one db to another. I'm trying to get the data as SELECT id, IFNULL(rank1,(SELECT rank2 FROM table WHERE rank1 IS NOT NULL and rank2<rank2 of current row ORDER BY rank2 LIMIT 1)) FROM table What I'm attempting to get with the rank2 So, I believe I h...

Drawing a line in PostGIS using Nearest Neighbour Method

This is a cross post from an email I sent to the PostGIS mailing list. My endeavor to create a line between a point and its projected location on a line has been long, but I'm almost there. As of yesterday, and before including any nearest neighbor analysis, I got the results shown in this image: As you can see, each point in...

Data structure for fast line queries?

I know that I can use a KD-Tree to store points and iterate quickly over a fraction of them that are close to another given point. I'm wondering whether there is something similar for lines. Given a set of lines L in 3D (to be stored in that data structure) and another "query line" q, I'd like to be able to quickly iterate through all l...

Slow Postgres Query

I'm new to Postgres and SQL. I created the following script that draws a line from a point to a projected point on the nearest line. It works fine on a small data set (5 to 10 points with the same number of lines); however, with 60 points and 2,000 lines, the query takes about 12 hours. It is based on a nearest neighbour function p...

Nearest-neighbor interpolation algorithm in MATLAB

I am trying to write my own function for scaling up an input image by using the Nearest-neighbor interpolation algorithm. The bad part is I am able to see how it works but cannot find the algorithm itself. I will be grateful for any help. Here's what I tried for scaling up the input image by a factor of 2: function output = nearest(inp...
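As a rough illustration of the idea (in Python rather than MATLAB, and not the asker's code), nearest-neighbor upscaling by an integer factor maps every output pixel back to the source pixel it falls on. The function name `nearest_upscale` and the list-of-rows image representation are assumptions made for this sketch.

```python
def nearest_upscale(image, factor):
    """Scale a 2D image (list of rows) up by an integer factor.

    Each output pixel (r, c) copies the source pixel at
    (r // factor, c // factor), i.e. its nearest source sample.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    return [
        [image[r // factor][c // factor] for c in range(cols * factor)]
        for r in range(rows * factor)
    ]

small = [[1, 2],
         [3, 4]]
for row in nearest_upscale(small, 2):
    print(row)
```

With a factor of 2, every source pixel simply becomes a 2x2 block in the output.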

How to find k nearest neighbors to the median of n distinct numbers in O(n) time?

I can use the median of medians selection algorithm to find the median in O(n). Also, I know that after the algorithm is done, all the elements to the left of the median are less than the median and all the elements to the right are greater than the median. But how do I find the k nearest neighbors to the median in O(n) time? If the med...
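One way to approach this, sketched under assumptions, is to run selection twice: once to find the median, and once more on the absolute differences to find the k-th smallest distance, then collect everything at or below that distance. The sketch uses a randomized quickselect (expected O(n)) in place of median of medians, so it is not worst-case linear. It assumes distinct numbers and k < n, takes the lower median when n is even, and the function names are hypothetical.

```python
import random

def quickselect(values, k):
    """Return the k-th smallest value (0-based) in expected O(n) time."""
    values = list(values)
    while True:
        pivot = random.choice(values)
        lows = [v for v in values if v < pivot]
        equal = [v for v in values if v == pivot]
        if k < len(lows):
            values = lows
        elif k < len(lows) + len(equal):
            return pivot
        else:
            k -= len(lows) + len(equal)
            values = [v for v in values if v > pivot]

def k_nearest_to_median(numbers, k):
    """Return k elements closest to the (lower) median, excluding the median."""
    median = quickselect(numbers, (len(numbers) - 1) // 2)
    dists = [abs(x - median) for x in numbers]
    # Index 0 of the sorted distances is the median itself (distance 0),
    # so the k-th neighbor sits at index k.
    threshold = quickselect(dists, k)
    closer = [x for x in numbers if x != median and abs(x - median) < threshold]
    ties = [x for x in numbers if x != median and abs(x - median) == threshold]
    return closer + ties[: k - len(closer)]

print(sorted(k_nearest_to_median([9, 1, 8, 2, 7, 3, 6, 4, 5], 2)))
```

Each step is a single selection or a single linear pass, so the whole procedure stays linear in expectation. Swapping in a median of medians selection would make the bound worst-case.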

nearest neighbor - k-d tree - wikipedia proof.

On the Wikipedia entry for k-d trees, an algorithm is presented for doing a nearest neighbor search on a k-d tree. What I don't understand is the explanation of step 3.2. How do you know there isn't a closer point just because the difference between the splitting coordinate of the search point and the current node is greater than the d...

is KNN valuable if most ratings are a 5 / passive filtering recommendations

I've been looking at building a 'people who like x, also like y' type recommendation system, and was looking at using Vogoo, but after looking through their code it seems there is a lot of nearest neighbor based on ratings. Over the last few weeks I've seen a few articles stating that most people either don't rate at all, or rate a 5 h...

How does space partitioning algorithm for finding nearest neighbors work?

Space partitioning is one of the algorithms for finding the nearest neighbor. How does it work? Suppose I have a 2D set of points (x and y coordinates), and I am given a point (a,b). How would this algorithm find the nearest neighbor? ...
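As one concrete instance of space partitioning, a k-d tree splits the points alternately on x and y, and a query descends toward its own side first, visiting the other side only if the splitting line is closer than the best match found so far. The following is a minimal 2D sketch, assuming points are (x, y) tuples; the `build` and `nearest` names and the dict-based nodes are illustrative choices, not a reference implementation.

```python
import math

def build(points, depth=0):
    """Build a 2D k-d tree; each node splits on x or y, alternating by depth."""
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build(points[:mid], depth + 1),
        "right": build(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return the stored point closest to query."""
    if node is None:
        return best
    if best is None or math.dist(query, node["point"]) < math.dist(query, best):
        best = node["point"]
    axis = node["axis"]
    diff = query[axis] - node["point"][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # The far side can only hold a closer point if the splitting line
    # itself is nearer than the current best distance.
    if abs(diff) < math.dist(query, best):
        best = nearest(far, query, best)
    return best

tree = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(nearest(tree, (9, 2)))
```

The pruning test is what saves work: whenever the query is farther from the splitting line than from the current best point, the entire subtree on the other side is skipped.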

How To Rotate Image By Nearest Neighbor Interpolation Using Matlab

My Plain Code without interpolation: im1 = imread('lena.jpg');imshow(im1); [m,n,p]=size(im1); thet = rand(1); m1=m*cos(thet)+n*sin(thet); n1=m*sin(thet)+n*cos(thet); for i=1:m for j=1:n t = uint16((i-m/2)*cos(thet)-(j-n/2)*sin(thet)+m1/2); s = uint16((i-m/2)*sin(thet)+(j-n/2)*cos(thet)+n1/2); if t~=0...

nearest neighbor mapping of 1D index for 2D array into a smaller 2D array

This is in C. I have two 2D arrays, ArrayA and ArrayB, that sample the same space. ArrayB samples a different attribute than ArrayA, and less frequently, so it is smaller than ArrayA. Just to try to define some variables: ArrayA: SizeAX by SizeAY, indexed by indexA for a position posAX, posAY; ArrayB: SizeBX by SizeAY, indexed by indexB...

Is k-d tree efficient for kNN search. k nearest neighbors search

I have to implement a k nearest neighbors search for 10-dimensional data in a kd-tree. The problem is that my algorithm is very fast for k=1, but as much as 2000x slower for k>1 (k=2,5,10,20,100). Is this normal for kd trees, or am I doing something wrong? ...

How to model coarse grained geolocation to correct market

Use case example: Client A comes to request sales information, enters their zip code, and is directed to Representative X. Since there is an effectively infinite number of zip codes, there will not be an agent assigned to every single zip code, so the lookup would then move out to the county level, and then so on into a region of counties and fin...

How can I extend this SQL query to find the k nearest neighbors?

I have a database full of two-dimensional data - points on a map. Each record has a field of the geometry type. What I need to be able to do is pass a point to a stored procedure which returns the k nearest points (k would also be passed to the sproc, but that's easy). I've found a query at http://blogs.msdn.com/isaac/archive/2008/10/23/...

Efficient method for finding KNN of all nodes in a KD-Tree

I'm currently attempting to find the K nearest neighbors of all nodes of a balanced KD-Tree (with K=2). My implementation is a variation of the code from the Wikipedia article, and it's decently fast at finding the KNN of any single node: O(log N). The problem is that I need to find the KNN of each node, which comes to about O(N log N) if I ...