k-means

K-means clustering: What's wrong? (PHP)

Hello! I was looking for a way to calculate dynamic market values in a soccer manager game. I asked this question here and got a very good answer from Alceu Costa. I tried to code this algorithm (90 elements, 5 clusters) but it doesn't work correctly: in the first iteration, a high percentage of the elements change their cluster. From ...

Matlab: K-means clustering

I have a matrix A (369x10) which I want to cluster into 19 clusters. I use [idx ctrs]=kmeans(A,19), which yields idx (369x1) and ctrs (19x10). I get the point up to here: all the rows of A are clustered into 19 clusters. Now I have an array B (49x10). I want to know which of the given 19 clusters each row of B corresponds to ...
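A minimal sketch of one common way to place new rows into existing clusters: assign each row to the centroid with the smallest Euclidean distance. It is written in plain Python rather than Matlab, and the function name `assign_to_nearest` and the tiny data set are illustrative assumptions that stand in for `ctrs` and `B`.

```python
import math

def assign_to_nearest(points, centroids):
    """Return, for each point, the index of the closest centroid (Euclidean)."""
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical data: two centroids (in place of ctrs) and three new rows (in place of B).
centroids = [(0.0, 0.0), (10.0, 10.0)]
new_rows = [(1.0, 2.0), (9.0, 8.0), (4.0, 4.0)]
labels = assign_to_nearest(new_rows, centroids)
print(labels)
```

This uses the same distance measure as plain k-means, so a new row lands in the cluster whose centre it would have been attracted to during clustering. The centroids themselves are not updated.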

Python k-means algorithm

I am looking for a Python implementation of the k-means algorithm with examples to cluster and cache my database of coordinates. ...

OpenCV K-Means (kmeans2)

Hi, I'm using OpenCV's K-means implementation to cluster a large set of 8-dimensional vectors. They cluster fine, but I can't find any way to see the prototypes created by the clustering process. Is this even possible? OpenCV only seems to give access to the cluster indexes (or labels). If not, I guess it'll be time to make my own imple...

How do I determine k when using k-means clustering?

I've been studying k-means clustering, and one thing that's not clear is how you choose the value of k. Is it just a matter of trial and error, or is there more to it? ...

mahout lucene document clustering howto?

I'm reading that I can create Mahout vectors from a Lucene index that can be used to apply the Mahout clustering algorithms. http://cwiki.apache.org/confluence/display/MAHOUT/Creating+Vectors+from+Text I would like to apply the K-means clustering algorithm to the documents in my Lucene index, but it is not clear how I can apply this algorit...

OpenCV's clustering function cvKMeans2() - what is the type of the cluster centers in the array?

I'm using the function cvKMeans2() from the OpenCV library for clustering. It has an optional parameter: centers - The optional output array of the cluster centers. The same parameter is also in the function kmeans(). I want to get information about the clusters, but I haven't found what the type of the cluster centers in that array is, so I can't read them. Thanks...

Finding the spread of each cluster from Kmeans

Hello, I'm trying to detect how well an input vector fits a given cluster centre. I can find the best match quite easily (the centre with the minimum Euclidean distance to the input vector is the best), however, I now need to work out how good a match that is. To do this I need to find the spread (standard deviation?) of the vectors which ...
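One possible measure of spread, sketched here under the assumption that it is taken over the vectors assigned to each centre, is the root-mean-square distance of a cluster's members to its centre. The function name `cluster_spread` and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def cluster_spread(points, labels, centres):
    """Root-mean-square distance of each cluster's members to its centre."""
    squared_sum = defaultdict(float)
    count = defaultdict(int)
    for p, lab in zip(points, labels):
        squared_sum[lab] += math.dist(p, centres[lab]) ** 2
        count[lab] += 1
    return {lab: math.sqrt(squared_sum[lab] / count[lab]) for lab in count}

# Hypothetical data: two points around centre 0, one point exactly on centre 1.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
labels = [0, 0, 1]
centres = [(1.0, 0.0), (10.0, 10.0)]
spread = cluster_spread(points, labels, centres)
print(spread)
```

Dividing an input vector's distance to its best centre by that cluster's spread gives a rough, scale-free indication of how typical the match is. Other definitions (per-dimension standard deviation, for instance) are equally defensible.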

What is the difference between "k means" and "fuzzy c means" objective functions?

I am trying to see if the performance of both can be compared based on the objective functions they work on? ...

how to implement k-means for simple grouping in java

Hi all, I would like to know a simple k-means algorithm in Java. I want to use k-means only for grouping a one-dimensional array, not a multi-dimensional one. For example, before grouping the array consists of 2,4,7,5,12,34,18,25; if we want four groups then we get group 1: 2,4,5 group 2: 7,12 group 3: 18,25 group 4: 34 ...
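A minimal sketch of one-dimensional k-means (Lloyd's iteration), written in Python rather than Java. The function name `kmeans_1d` and the seeding rule (centres taken at evenly spaced positions in the sorted data) are assumptions made for illustration. K-means results depend on the seeding, so other starting centres can give other groupings.

```python
def kmeans_1d(values, k, max_iter=100):
    """Group scalar values into k clusters using Lloyd's iteration."""
    data = sorted(values)
    # Assumed seeding: pick centres at evenly spaced positions in the sorted data.
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each value joins the group of its nearest centre.
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        # Update step: each centre moves to the mean of its group
        # (an empty group keeps its previous centre).
        new_centres = [sum(g) / len(g) if g else centres[j]
                       for j, g in enumerate(groups)]
        if new_centres == centres:
            break
        centres = new_centres
    return groups, centres

groups, centres = kmeans_1d([2, 4, 7, 5, 12, 34, 18, 25], 4)
print(groups)  # [[2, 4, 5], [7, 12], [18, 25], [34]]
```

With this particular seeding the sketch reproduces the grouping given in the question for the sample array.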

OpenCV's cvKMeans2 - choosing clusters

Hi, I'm using cvKMeans2 for clustering, but I'm not sure how it works in general, specifically the part about choosing clusters. I thought that it sets the initial positions of the clusters from the given samples. That would mean that at the end of the clustering process every cluster has at least one sample -> in the output array of cluster labels there will be the full ra...

K-Means Algorithm and java code

Hi all, I need to group objects according to their size. I found k-means algorithms in Java, but they mostly classify according to two or more features and the results are not satisfactory for me. I only want to group objects based on one feature. Pseudocode or code would be helpful, too. Than...

C# - Data Clustering approach

Hi all, I am writing a program in C# in which I have a set of 200 points displayed on an image. However, the points tend to cluster in various regions, and I am looking to find a way to "cluster." In other words, maybe draw a circle/ellipse around the clustered points. Has anyone seen any way to do this? I have heard about K-means clu...

clustering on very large sparse matrix?

Hello again, I am trying to do some (k-means) clustering on a very large matrix. The matrix is approximately 500000 rows x 4000 cols yet very sparse (only a couple of "1" values per row). I want to get around 2000 clusters. I have two questions: - Can someone recommend an open source platform or tool for doing that (maybe using k-means...

implementing complex algorithms on database stored information

Hey, I'm trying to figure out the best practice for implementing a complex algorithm on stored information in a relational DB. Specifically: I want to implement a variation of the k-means algorithm (a document clustering algorithm) on a large MS SQL Server database containing TFxIDF vectors of many documents (these vectors are used as ...

How to cluster time series data using K-means algorithm?

Hi, I am wondering how I can cluster time series data. I understand how it works if each datum is a point, but I do not know how to cluster if each datum is a time series of size 1xM, where M is the data length. I am especially unsure how to compute the new mean of a cluster for time series data. My X matrix will be NxM where N is the number of time ser...

MATLAB kMeans does not always converge to the global minimum

I wrote a k-Means clustering algorithm in MATLAB, and I thought I'd try it against MATLAB's built-in kmeans(X,k). However, for the very easy four-cluster setup (see picture), MATLAB's kMeans does not always converge to the optimum solution (left) but instead to (right). The one I wrote does not always do that either, but should not the built-in f...

Online k-means clustering

Is there an online version of the k-Means clustering algorithm? By online I mean that every data point is processed serially, one at a time as it enters the system, hence saving computing time when used in real time. I have written one myself with good results, but I would really prefer to have something "standardized" to refer to, sin...
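A minimal sketch of a sequential (MacQueen-style) update, which is one commonly described online variant of k-means: each arriving point moves only its nearest centre, by a step of 1/n where n is the number of points that centre has absorbed. The class name `OnlineKMeans`, the choice to count each initial centre as one observation, and the sample points are assumptions made for illustration.

```python
import math

class OnlineKMeans:
    """Sequential k-means: each new point updates only its nearest centre."""

    def __init__(self, initial_centres):
        self.centres = [list(c) for c in initial_centres]
        # Assumption: each initial centre counts as one observation already seen.
        self.counts = [1] * len(self.centres)

    def update(self, point):
        """Assign the point to its nearest centre, move that centre, return its index."""
        j = min(range(len(self.centres)),
                key=lambda i: math.dist(point, self.centres[i]))
        self.counts[j] += 1
        rate = 1.0 / self.counts[j]
        # Running-mean update: centre += (point - centre) / count
        self.centres[j] = [c + rate * (x - c)
                           for c, x in zip(self.centres[j], point)]
        return j

model = OnlineKMeans([(0.0, 0.0), (10.0, 10.0)])
assigned = [model.update(p) for p in [(2.0, 0.0), (10.0, 12.0), (4.0, 0.0)]]
print(assigned, model.centres)
```

Because the step size is 1/n, each centre stays equal to the running mean of the points assigned to it, so no past data needs to be stored. A fixed step size could be used instead if older points should be forgotten over time.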

Using a smoother with the L Method to determine the number of K-Means clusters

Has anyone tried to apply a smoother to the evaluation metric before applying the L-method to determine the number of k-means clusters in a dataset? If so, did it improve the results? Or did it allow a lower number of k-means trials and hence a much greater increase in speed? Which smoothing algorithm/method did you use? The "L-Method" is deta...