I have a real question.
I have a database with the schema as follows:
item
id
description
other junk
tag
id
name
item2tag
item_id
tag_id
count
Basically, each item is tagged with up to 10 tags, with varying counts. There are 50,000 items and 50,000 tags, and about 500,000 entries in item2tag. I'd like to find, given one...
I'm having trouble coming up with an efficient SQL query to handle the following situation:
Assume we have a table with two columns
groupId : int
value : float
The table is huge (several million rows). There is a varying number of "values" per "groupId", say somewhere between 100 and 50,000. All float values are greater or equal t...
I want to determine the similarity of the content of two news items, similar to Google News but different in the sense that I want to be able to determine what the basic topics are and then determine which topics are related.
So if an article was about Saddam Hussein, then the algorithm might recommend something about Donald Rumsfeld's business...
I'm writing a program that implements SCVT (Spherical Centroidal Voronoi Tessellation). I start with a set of points distributed over the unit sphere (I have an option for random points or an equal-area spiral). There will be from several hundred to maybe 64K points.
I then need to produce probably several million random sample point...
Hello,
I have a database, and I want to find out the previous and next record ordered by ID, using a single query. I tried to do a union but that does not work. :(
SELECT * FROM table WHERE `id` > 1556 LIMIT 1
UNION
SELECT * FROM table WHERE `id` <1556 ORDER BY `product_id` LIMIT 1
Any ideas?
Thanks a lot.
...
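One way to approach the previous/next lookup, as a minimal sketch rather than a definitive answer: each branch of the UNION needs its own ORDER BY and LIMIT, so each branch is wrapped in a subquery. The sketch below uses Python's built-in sqlite3 module so it can be run as-is. The table name products, the helper prev_and_next and the sample rows are hypothetical, and it assumes both branches should be ordered by id (the original query mixes id and product_id). The same subquery pattern should carry over to other databases, though the exact syntax may differ.

```python
import sqlite3


def prev_and_next(conn, table, current_id):
    """Return (previous_id, next_id) around current_id; either may be None.

    `table` is interpolated into the SQL, so it must be a trusted name.
    """
    # Each branch is a subquery so that its ORDER BY / LIMIT applies to that
    # branch only, not to the combined result of the UNION ALL.
    sql = f"""
        SELECT * FROM (SELECT 'prev' AS side, id FROM {table}
                       WHERE id < ? ORDER BY id DESC LIMIT 1)
        UNION ALL
        SELECT * FROM (SELECT 'next' AS side, id FROM {table}
                       WHERE id > ? ORDER BY id ASC LIMIT 1)
    """
    rows = dict(conn.execute(sql, (current_id, current_id)).fetchall())
    return rows.get("prev"), rows.get("next")


# Hypothetical sample data, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1550, "a"), (1556, "b"), (1560, "c"), (1571, "d")],
)

print(prev_and_next(conn, "products", 1556))  # (1550, 1560)
```

Note the ORDER BY id DESC in the "previous" branch: without it, LIMIT 1 returns an arbitrary smaller id rather than the closest one.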
I've got a maintenance script which dumps a bunch of data from one db to another.
I'm trying to get the data as
SELECT id, IFNULL(rank1,(SELECT rank2
FROM table
WHERE rank1 IS NOT NULL and
rank2<rank2 of current row
ORDER BY rank2 LIMIT 1)) FROM table
What I'm attempting to get with the rank2
So, I believe I h...
This is a cross post from an email I sent to the PostGIS mailing list
My endeavor to create a line between a point and its projected
location on a line has been long, but I'm almost there. As of yesterday, and
before including any nearest neighbor analysis, I got the results shown in
this image:
As you can see, each point in...
I know that I can use a KD-Tree to store points and iterate quickly over a fraction of them that are close to another given point. I'm wondering whether there is something similar for lines.
Given a set of lines L in 3D (to be stored in that data structure) and another "query line" q, I'd like to be able to quickly iterate through all l...
I'm new to Postgres and SQL. I created the following script that draws a line from a point to a projected point on the nearest line. It works fine on a small data set (5 to 10 points with the same number of lines); however, on 60 points with 2,000 lines, the query takes about 12 hours. It is based on a nearest neighbour function p...
I am trying to write my own function for scaling up an input image by using the Nearest-neighbor interpolation algorithm. The bad part is I am able to see how it works but cannot find the algorithm itself. I will be grateful for any help.
Here's what I tried for scaling up the input image by a factor of 2:
function output = nearest(inp...
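For reference, the core of nearest-neighbor upscaling can be sketched in a few lines. This is an illustration in Python rather than the truncated MATLAB function above, and it assumes a grayscale image stored as a list of rows and an integer scale factor; the name nearest_upscale is hypothetical. Each output pixel (i, j) simply copies the source pixel at (i // factor, j // factor).

```python
def nearest_upscale(image, factor):
    """Scale a 2D image (list of equal-length rows) up by an integer factor."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    rows, cols = len(image), len(image[0])
    output = []
    for i in range(rows * factor):
        # Integer division maps output row i back to its source row.
        src_row = image[i // factor]
        # Same mapping for columns: output column j copies source column j // factor.
        output.append([src_row[j // factor] for j in range(cols * factor)])
    return output


small = [[1, 2],
         [3, 4]]
for row in nearest_upscale(small, 2):
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```

For a non-integer factor the same idea applies, but the source index would have to be computed by dividing by the factor and rounding down, clamped to the image bounds.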
I can use the median of medians selection algorithm to find the median in O(n). Also, I know that after the algorithm is done, all the elements to the left of the median are less than the median and all the elements to the right are greater than the median. But how do I find the k nearest neighbors to the median in O(n) time?
If the med...
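One possible approach, sketched under stated assumptions: run selection a second time, this time on the absolute distances to the median, to find the k-th smallest distance, then collect everything at or below that cutoff. The sketch swaps in a randomized quickselect (expected O(n)) for brevity; substituting median of medians for pivot choice would give the worst-case O(n) bound. It also assumes the median itself is not counted as one of its own neighbors, and that ties at the cutoff may be broken arbitrarily. The function names are hypothetical.

```python
import random


def quickselect(values, k, key=lambda v: v):
    """Return the element with the k-th smallest key (0-based). Expected O(n)."""
    items = list(values)
    while True:
        pivot = key(random.choice(items))
        lower = [v for v in items if key(v) < pivot]
        equal = [v for v in items if key(v) == pivot]
        if k < len(lower):
            items = lower
        elif k < len(lower) + len(equal):
            return equal[0]
        else:
            k -= len(lower) + len(equal)
            items = [v for v in items if key(v) > pivot]


def k_nearest_to_median(values, k):
    """Return k elements closest in value to the (lower) median."""
    median = quickselect(values, (len(values) - 1) // 2)
    others = list(values)
    others.remove(median)  # assumption: the median is not its own neighbor
    if k <= 0:
        return []
    if k >= len(others):
        return others

    def dist(v):
        return abs(v - median)

    # Second selection pass: the k-th smallest distance is the cutoff.
    cutoff = dist(quickselect(others, k - 1, key=dist))
    closer = [v for v in others if dist(v) < cutoff]
    ties = [v for v in others if dist(v) == cutoff]
    # Fill the remaining slots from the elements exactly at the cutoff.
    return closer + ties[: k - len(closer)]


data = [1, 9, 4, 7, 5, 20, 3]
print(sorted(k_nearest_to_median(data, 3)))  # [3, 4, 7]
```

Every step is a single pass or a selection over the array, so the total stays linear in expectation.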
On the wikipedia entry for k-d trees, an algorithm is presented for doing a nearest neighbor search on a k-d tree. What I don't understand is the explanation of step 3.2. How do you know there isn't a closer point just because the difference between the splitting coordinate of the search point and the current node is greater than the d...
I've been looking at building a 'people who like x, also like y' type recommendation system, and was looking at using Vogoo, but after looking through their code it seems to rely heavily on nearest-neighbor calculations based on ratings.
Over the last few weeks I've seen a few articles stating that most people either don't rate at all, or rate a 5 h...
For finding the nearest neighbor, Space Partitioning is one of the algorithms. How does it work?
Suppose I have a 2D set of points (x and y coordinates), and I am given a point (a,b). How would this algorithm find out the nearest neighbor?
...
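To make the space-partitioning idea concrete, here is a minimal sketch of a 2D k-d tree with a nearest-neighbor search. It is an illustration under stated assumptions, not a reference implementation: the tree is stored as nested tuples, the functions build and nearest are hypothetical names, and the sample points are made up. The comment at the pruning step also shows why a subtree can be skipped safely: every point on the far side of the splitting line is at least as far away as the line itself.

```python
import math


def build(points, depth=0):
    """Build a 2D k-d tree as nested tuples: (point, left, right)."""
    if not points:
        return None
    axis = depth % 2  # alternate between splitting on x and on y
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (
        points[mid],
        build(points[:mid], depth + 1),
        build(points[mid + 1:], depth + 1),
    )


def nearest(node, target, depth=0, best=None):
    """Return (distance, point) for the stored point closest to target."""
    if node is None:
        return best
    point, left, right = node
    d = math.dist(point, target)
    if best is None or d < best[0]:
        best = (d, point)
    axis = depth % 2
    diff = target[axis] - point[axis]
    # Descend first into the side of the split that contains the target.
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Every point on the far side is at least |diff| away from the target,
    # so that side is searched only if |diff| is smaller than the best
    # distance found so far.
    if abs(diff) < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
print(nearest(tree, (9, 2))[1])  # (8, 1)
```

Building the tree by sorting at every level costs O(n log^2 n); that is fine for a sketch, and a median-selection step would bring it down if construction time matters.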
My Plain Code without interpolation:
im1 = imread('lena.jpg');imshow(im1);
[m,n,p]=size(im1);
thet = rand(1);
m1=m*cos(thet)+n*sin(thet);
n1=m*sin(thet)+n*cos(thet);
for i=1:m
for j=1:n
t = uint16((i-m/2)*cos(thet)-(j-n/2)*sin(thet)+m1/2);
s = uint16((i-m/2)*sin(thet)+(j-n/2)*cos(thet)+n1/2);
if t~=0...
This is in C.
I have two 2D arrays, ArrayA and ArrayB, that sample the same space. ArrayB samples a different attribute than ArrayA and does so less frequently, so it is smaller than ArrayA.
Just to try to define some variables:
ArrayA: SizeAX by SizeAY, indexed by indexA for a position posAX, posAY
ArrayB: SizeBX by SizeAY, indexed by indexB...
I have to implement k nearest neighbors search for 10 dimensional data in kd-tree.
But the problem is that my algorithm is very fast for k=1, but as much as 2000x slower for k>1 (k=2,5,10,20,100).
Is this normal for kd-trees, or am I doing something wrong?
...
Use case example: Client A comes to request sales information, enters their zip code, and is directed to Representative X.
Since there is an effectively infinite number of zip codes, there will not be an agent assigned to every single zip code, which would then move out to the county level, and then so on into a region of counties and fin...
I have a database full of two-dimensional data - points on a map. Each record has a field of the geometry type. What I need to be able to do is pass a point to a stored procedure which returns the k nearest points (k would also be passed to the sproc, but that's easy). I've found a query at http://blogs.msdn.com/isaac/archive/2008/10/23/...
I'm currently attempting to find K Nearest Neighbor of all nodes of a balanced KD-Tree (with K=2).
My implementation is a variation of the code from the Wikipedia article, and it's decently fast at finding the KNN of any node: O(log N).
The problem lies with the fact that I need to find KNN of each node. Coming up with about O(N log N) if I ...