machine-learning

Cluster and rank blogs by logical categories

What kind of algorithm would be good to cluster and rank blogs in logical communities (tech, entertainment, etc.)? An algorithm to cluster and rank blog posts would be even better. Accepted answers are algorithms, pseudo-code, Java code or links to explanations of particular algorithms. Update: So, it seems I would like something i...

Support vector machines - separating hyperplane question

From what I've seen, it seems like the separating hyperplane must be in the form x.w + b = 0. I don't quite understand this notation. From what I understand, x.w is an inner product, so its result will be a scalar. How can it be that you can represent a hyperplane by a scalar + b? I'm quite confused by this. Also, even if it was x + b ...
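A minimal sketch of how the notation in the question can be read, assuming a 2-D input: the expression x.w + b is a scalar for any single point x, and the hyperplane is the set of all points x for which that scalar equals zero. The values of `w` and `b` and the helper name `decision_value` are made up for illustration.

```python
def decision_value(w, x, b):
    """Return the scalar w.x + b for a single point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Hypothetical 2-D example: w and b are illustrative values, not from the question.
w = [1.0, -1.0]
b = 0.0

on_plane = decision_value(w, [2.0, 2.0], b)   # zero: the point lies on the hyperplane
one_side = decision_value(w, [3.0, 1.0], b)   # positive: one side of the hyperplane
other_side = decision_value(w, [1.0, 3.0], b) # negative: the other side
```

The sign of the scalar tells which side of the hyperplane a point falls on, which is how a linear classifier would typically use it.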

Question About VC Dimension

Suppose I have the input space (1, 2, ..., 999) and a concept class C with 10 concepts: C0, C1, C2, ..., C9. An input is an element of Ci if it contains the digit i. For example, the number 123 is an element of C1, C2 and C3. What is the VC Dimension of this concept class C? ...
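The membership rule described in the question can be sketched as follows. This only encodes the definition of the concepts (it does not compute the VC dimension); the function names are hypothetical.

```python
def in_concept(n, i):
    """True if the integer n belongs to concept Ci, i.e. n contains the digit i."""
    return str(i) in str(n)

def concepts_containing(n):
    """Return the sorted list of indices i such that n is an element of Ci."""
    return sorted({int(d) for d in str(n)})

labels_123 = concepts_containing(123)  # the question's own example
```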

What problems have you solved using artificial neural networks?

I'd like to know about specific problems you - the SO reader - have solved using artificial neural network techniques and what libraries/frameworks you used if you didn't roll your own. Questions: What problems have you used artificial neural networks to solve? What libraries/frameworks did you use? I'm looking for first-hand exper...

Approximating function with Neural Network

I am trying to approximate the sine() function using a neural network I wrote myself. I have tested my neural network on a simple OCR problem already and it worked, but I am having trouble applying it to approximate sine(). My problem is that during training my error converges on exactly 50%, so I'm guessing it's completely random. I am...

What programs should I learn to be able to do computational modeling?

I've got some free time and I'm looking to learn a programming language or two that I can use for computational modeling (I'm in cognitive science & psychology). I'm not sure if I'll end up doing neural nets, machine learning, AI, or something altogether different, so I'm just looking for a good, broad base to start with, like a nudge in...

Algorithms to find stuff a user would like based on other users' likes

I'm thinking of writing an app to classify movies in an HTPC based on what the family members like. I don't know statistics or AI, but the stuff here looks very juicy. I wouldn't know where to start. Here's what I want to accomplish: Compose a set of samples from each user's likes, rating each sample attribute separately. For examp...

How to test the quality of a probabilities estimator?

I created a heuristic (an ANN, but that's not important) to estimate the probabilities of an event (the results of sports games, but that's not important either). Given some inputs, this heuristic tells me the probabilities of the event. Something like: given these inputs, team B has a 65% chance to win. I have a large set of i...

machine learning libraries in C#

Hello. Are there any machine learning libraries in C#? I'm after something like WEKA. Thank you. ...

How to set output size in Matlab newff method

Hi. Summary: I'm trying to classify some images depending on the angles between body parts. I assume that the human body consists of 10 parts (as rectangles), find the center of each part, and calculate the angle of each part with reference to the torso. I have three action categories: Handwave, Walking, Running. My goal is to fi...

categorizing friends in social networks

I'm facing the following problem: let's say u is a social network user and as such has a list of friends, F(u). A partition is a function F->G, where G is a set of groups such as high school, university, work, etc. I need to come up with an algorithm to partition F: the input is F and also F(f) for every f in F (the list of friends for eac...

MATLAB: help needed with Self-Organizing Map (SOM) clustering

I'm trying to cluster some images depending on the angles between body parts. The features extracted from each image are: angle1: torso - torso; angle2: torso - upper left arm; ...; angle10: torso - lower right foot. Therefore the input data is a matrix of size 1057x10, where 1057 stands for the number of images, and 10 stands for angle...

How do I efficiently estimate a probability based on a small amount of evidence?

I've been trying to find an answer to this for months (to be used in a machine learning application). It doesn't seem like it should be a terribly hard problem, but I'm a software engineer, and math was never one of my strengths. Here is the scenario: I have a (possibly) unevenly weighted coin and I want to figure out the probability o...

Help with Perceptron

Here is my perceptron implementation in ANSI C: #include <stdio.h> #include <stdlib.h> #include <math.h> float randomFloat() { srand(time(NULL)); float r = (float)rand() / (float)RAND_MAX; return r; } int calculateOutput(float weights[], float x, float y) { float sum = x * weights[0] + y * weights[1]; return (sum >...

Know any good C++ support vector machine (SVM) libraries?

Hey everyone, do you know of any good C++ SVM libraries out there? I tried libsvm (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) but so far I'm not flabbergasted (no documentation, or close to none). I have also heard of SVMLight and TinySVM. Have you tried them? Any new players? Thanks! JC ...

How to detect tabular data from a variety of sources

In an experimental project I am playing with I want to be able to look at textual data and detect whether it contains data in a tabular format. Of course there are a lot of cases that could look like tabular data, so I was wondering what sort of algorithm I'd need to research to look for common features. My first thought was to write a ...

How to change the default parameters for newfit() in MATLAB?

I am using net = newfit(in,out,lag(j),{'tansig','tansig'}); to generate a new neural network. The default value of the number of validation checks is 6. I am training a lot of networks and this is taking a lot of time. I guess it doesn't matter if my results are a bit less accurate if they can be made considerably faster. How can ...

Competitive Learning in Neural Networks

I am playing with some neural network simulations. I'd like to get two neural networks sharing the input and output nodes (with other nodes being distinct and part of two different routes) to compete. Are there any examples/standard algorithms I should look at? Is this an appropriate question for this site? Right now I'm using a thresho...

libsvm model file format

According to this FAQ the model format in libsvm should be straightforward. And in fact it is, when I call just svm-train. As an example, the first SV for the a1a dataset is 1 3:1 11:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1 On the other hand, if I use the easy.py script, my first SV ends up being: 512 1:-1 2:-1...
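For readers unfamiliar with the `index:value` notation in the line quoted above, here is a small sketch of how such a line could be split into a leading number and a sparse feature map. The helper `parse_sparse_line` is hypothetical and not part of libsvm; what the leading number means in a model file is not assumed here.

```python
def parse_sparse_line(line):
    """Split 'v i1:x1 i2:x2 ...' into (v, {i1: x1, i2: x2, ...})."""
    tokens = line.split()
    leading = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)
    return leading, features

# The first SV line quoted in the question for the a1a dataset.
line = "1 3:1 11:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1"
leading, features = parse_sparse_line(line)
```

Only the indices that appear are stored, so any feature not listed is implicitly zero.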

Unsupervised classification methods available

Hi all, I'm doing research that involves "unsupervised classification". Basically I have a trainSet and I want to cluster the data into X classes in an unsupervised way. The idea is similar to what k-means does. Let's say Step1) featureSet is a [1057x10] matrix and I want to cluster it into 88 clusters. Step2) Use previously cal...