classification

SVM Classification - minimum number of input sets for each class

I'm trying to build an app that detects images on web pages which are advertisements. Once I detect them, I won't allow them to be displayed on the client side. From the help that I got here on Stack Overflow, I concluded that an SVM is the best approach for my aim. So, I have coded the SVM and SMO myself. The dataset which I have got from ...

Inter-rater agreement (Fleiss' Kappa, Krippendorff's Alpha etc) Java API?

I am working on building a Question Classification/Answering corpus as a part of my masters thesis. I'm looking at evaluating my expected answer type taxonomy with respect to inter-rater agreement/reliability, and I was wondering: Does anybody know of any decent (preferably free) Java API(s) that can do this? I'm reasonably certain all ...

Recommended anomaly detection technique for simple, one-dimensional scenario?

I have a scenario where I have several thousand instances of data. The data itself is represented as a single integer value. I want to be able to detect when an instance is an extreme outlier. For example, with the following example data: a = 10, b = 14, c = 25, d = 467, e = 12. Here d is clearly an anomaly, and I would want to perform a spec...
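One possible way to flag such a value is sketched below. It is not taken from the question: it assumes a median-based rule (the modified z-score, with the conventional constants 0.6745 and 3.5), chosen here because a single extreme value barely moves the median. The function name `mad_outliers` is hypothetical.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`.

    Assumption: the modified z-score 0.6745 * |x - median| / MAD, where MAD
    is the median absolute deviation from the median.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

# Example data from the question.
data = {"a": 10, "b": 14, "c": 25, "d": 467, "e": 12}
print(mad_outliers(list(data.values())))
```

On the question's sample, only 467 is flagged. The threshold is a tunable assumption and should be checked against the real data.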

Classification of grocery list

I am looking to import my weekly grocery list into a spreadsheet. I know how to do that; however, is there a standardized way to classify/decode the item codes and map them to actual items? Any pointers to any resources? Thanks in advance. ...

Java network classification

What is network classification? Can you give me an example? Is it possible to implement network classification in Java, and is there an appropriate library? ...

How to use a cross validation test with MATLAB?

I would like to use 10-fold Cross-validation to evaluate a discretization in MATLAB. I should first consider the attributes and the class column. ...
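For readers unfamiliar with the procedure, the splitting step of 10-fold cross-validation can be sketched as follows. This is a language-neutral illustration in Python, not MATLAB code; the helper `k_fold_indices` is hypothetical. It only produces the train/test index sets, and the discretization and classifier would be fitted on each training set.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i, test in enumerate(folds):
        # Training set = every fold except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

for train, test in k_fold_indices(25, k=10):
    print(len(train), len(test))
```

Each sample lands in exactly one test fold, so every row (attributes plus class column) is used for evaluation once.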

Detecting unknown class in a Bayes classifier

If you have a Bayes classifier trained for a set of classes, how do you detect whether the output is significant enough to choose a class? This would be useful for detecting samples which can't be assigned to a class. I have tried testing whether the class probability is above mean+2*stddev of the probabilities of all the classes, but I don't think it will...
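The mean+2*stddev test described in the question can be written out as below. This is a sketch only: it assumes the classifier's output is available as a dictionary of class probabilities and that the population standard deviation is meant. The function name `confident_class` and the class labels are hypothetical.

```python
from statistics import mean, pstdev

def confident_class(probs, k=2.0):
    """Return the top class if its probability exceeds mean + k * stddev
    of all class probabilities, otherwise None (treated as 'unknown')."""
    values = list(probs.values())
    cutoff = mean(values) + k * pstdev(values)
    best = max(probs, key=probs.get)
    return best if probs[best] > cutoff else None

# Ten hypothetical classes, one of them clearly dominant.
peaked = {f"class_{i}": 0.01 for i in range(9)}
peaked["class_9"] = 0.91
print(confident_class(peaked))

# Three classes: the rule rejects even a very confident prediction.
print(confident_class({"x": 0.98, "y": 0.01, "z": 0.01}))
```

One caveat under these assumptions: with the population standard deviation, no value among n can lie more than sqrt(n-1) deviations above the mean, so with k=2 the test cannot pass when there are five or fewer classes.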

Choose the right classification algorithm. Linear or non-linear?

Hi, I find this question a little tricky. Maybe someone knows an approach to answering it. Imagine that you have a dataset (training data) and you don't know what it is about. Which features of the training data would you look at in order to choose a classification algorithm for this data? Can we say anything whether we should ...

Optimizing SMO with RBFKernel (C and gamma)

There are two parameters while using RBF kernels with Support Vector Machines: C and γ. It is not known beforehand which C and γ are the best for one problem; consequently some kind of model selection (parameter search) must be done. The goal is to identify good (C, γ) so that the classifier can accurately predict unknown data (i.e., testin...
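The parameter search mentioned above is often run as a grid search over exponentially spaced values of C and γ. The sketch below assumes a caller-supplied `score(C, gamma)` function (hypothetical; in practice it would be something like cross-validation accuracy of the SMO-trained model) and assumes exponent ranges of 2^-5 to 2^15 for C and 2^-15 to 2^3 for γ. Both ranges are illustrative defaults, not values taken from the question.

```python
import math
from itertools import product

def grid_search(score, c_exps=range(-5, 16, 2), gamma_exps=range(-15, 4, 2)):
    """Return (best_score, C, gamma) over a log2-spaced grid.

    `score` is any callable taking (C, gamma) and returning a number
    where higher is better.
    """
    best = None
    for c_exp, g_exp in product(c_exps, gamma_exps):
        c, gamma = 2.0 ** c_exp, 2.0 ** g_exp
        s = score(c, gamma)
        if best is None or s > best[0]:
            best = (s, c, gamma)
    return best

# Stand-in score for demonstration only: peaks at C = 2^3, gamma = 2^-5.
def toy_score(c, gamma):
    return -((math.log2(c) - 3) ** 2 + (math.log2(gamma) + 5) ** 2)

print(grid_search(toy_score))
```

A coarse grid like this is usually followed by a finer grid around the best cell, since each evaluation requires retraining the classifier.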

What algorithms are suitable for this simple machine learning problem?

I have what I think is a simple machine learning question. Here is the basic problem: I am repeatedly given a new object and a list of descriptions about the object. For example: new_object: 'bob' new_object_descriptions: ['tall','old','funny']. I then have to use some kind of machine learning to find previously handled objects that h...
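One simple baseline for comparing description lists is set overlap. The sketch below assumes Jaccard similarity (shared descriptions divided by all distinct descriptions) as the measure; that choice, the helper names, and the previously seen objects are all hypothetical and not taken from the question.

```python
def jaccard(a, b):
    """Jaccard similarity of two description lists, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def most_similar(new_descriptions, seen, top_n=3):
    """Return the names of the `top_n` previously seen objects whose
    descriptions overlap most with `new_descriptions`."""
    ranked = sorted(
        seen.items(),
        key=lambda item: jaccard(new_descriptions, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_n]]

# Hypothetical previously handled objects.
seen = {
    "alice": ["tall", "young", "funny"],
    "carl": ["short", "old", "serious"],
    "dana": ["tall", "old", "funny"],
}
print(most_similar(["tall", "old", "funny"], seen, top_n=2))
```

Here `dana` scores 1.0, `alice` 0.5, and `carl` 0.2, so the first two are returned.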

Issues in Convergence of Sequential minimal optimization for SVM

I have been working on Support Vector Machines for about 2 months now. I have coded an SVM myself, and for the SVM optimization problem I have used Sequential Minimal Optimization (SMO) by John Platt. Right now I am at the stage where I am going to run a grid search to find the optimal C value for my dataset. (Please find details of my project...

What is the best way to generate fake data for a classification problem?

I'm working on a project and I have a subset of a user's keystroke timing data. This means that the user makes n attempts, and I will use the recorded timing data from those attempts in various kinds of classification algorithms, so that future login attempts can be verified as being made by the user or by some other person. (Simply i can say that th...

SVM Visualization in MATLAB

How do I visualize the SVM classification once I perform SVM training in Matlab? ...

How to engineer features for machine learning

Do you have any advice or reading on how to engineer features for a machine learning task? Good input features are important even for a neural network. The chosen features will affect the needed number of hidden neurons and the needed number of training examples. The following is an example problem, but I'm interested in feature engineer...

Using Artificial Intelligence (AI) to predict Stock Prices

Given a set of data very similar to the Motley Fool CAPS system, where individual users enter BUY and SELL recommendations on various equities. What I would like to do is show each recommendation and, I guess, somehow rate it (1-5) as to whether it was a good predictor <5> (i.e. correlation coefficient = 1) of the future stock price (or eps or what...

I want a machine to learn to categorize short texts

Hello, I have a ton of short stories about 500 words long and I want to categorize them into one of, let's say, 20 categories: Entertainment, Food, Music, etc. I can hand-classify a bunch of them, but I want to implement machine learning to guess the categories eventually. What's the best way to approach this? Is there a standard appro...

Beginner's resources/introductions to classification algorithms.

Hi, everybody. I am entirely new to the topic of classification algorithms, and need a few good pointers about where to start some "serious reading". I am right now in the process of finding out whether machine learning and automated classification algorithms could be a worthwhile thing to add to some application of mine. I already sca...

Probability and Neural Networks

Is it good practice to use sigmoid or tanh output layers in neural networks directly to estimate probabilities? That is, the probability of a given input occurring is the output of the sigmoid function in the NN. EDIT: I wanted to use a neural network to learn and predict the probability of a given input occurring. You may consider the input as Stat...

Java text classification problem

Hello, I have a set of Book objects. The class Book is defined as follows: class Book { String title; ArrayList<tags> taglist; } where title is the title of the book (for example: Javascript for dummies) and taglist is a list of tags (for our example: Javascript, jquery, "web dev", ..). As I said, I have a set of books talking about di...

machine learning and code generator from strings

The problem: Given a set of hand-categorized strings (or a set of ordered vectors of strings), generate a categorize function to categorize more input. In my case, that data (or most of it) is not natural language. The question: are there any tools out there that will do that? I'm thinking of some kind of reasonably polished, download, ...