questions about machine-learning | ansaurus

machine-learning

Get recall (sensitivity) and precision (PPV) values of a multi-class problem in PyML

I am using PyML for SVM classification. However, I noticed that when I evaluate a multi-class classifier using leave-one-out (LOO), the results object does not report the sensitivity and PPV values. Instead they are 0.0:

from PyML import *
from PyML.classifiers import multi
mc = multi.OneAgainstRest(SVM())
data = VectorDataSet('iris.data', labelsColum...

machine-learning
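For the precision/recall question above, the per-class values can always be recomputed by hand from the true and predicted labels. The following is only a minimal sketch in plain Python: it does not use PyML, it says nothing about why PyML reports 0.0, and the function name and the sample labels are made up for illustration.

```python
from collections import Counter

def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)}, treating each class one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    predicted = Counter(y_pred)  # how often each class was predicted
    actual = Counter(y_true)     # how often each class really occurs
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    scores = {}
    for c in labels:
        precision = true_pos[c] / predicted[c] if predicted[c] else 0.0  # PPV
        recall = true_pos[c] / actual[c] if actual[c] else 0.0           # sensitivity
        scores[c] = (precision, recall)
    return scores

# Hypothetical labels for illustration only, not real iris results.
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]
scores = per_class_precision_recall(y_true, y_pred)
for label, (precision, recall) in scores.items():
    print(label, round(precision, 3), round(recall, 3))
```

With leave-one-out, `y_pred` would be the list of held-out predictions collected over all folds.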

A question about classifiers in Machine Learning

Hi all, I am taking an intro to AI class, and the teacher mentioned that the accuracy of the ZeroR classifier is a helpful baseline for interpreting other classifiers. I searched online about this but still couldn't get my head around it. Could anyone give me some idea of what that means? Thanks in advance. ...

artificial-intelligence

machine-learning
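As a rough illustration of the ZeroR question above: ZeroR ignores all features and always predicts the most frequent class in the training data, so its accuracy is the score any useful classifier has to beat. A minimal sketch, with made-up labels and a hypothetical function name:

```python
from collections import Counter

def zero_r_accuracy(train_labels, test_labels):
    """Predict the majority training class for every test item."""
    majority = Counter(train_labels).most_common(1)[0][0]
    correct = sum(1 for y in test_labels if y == majority)
    return majority, correct / len(test_labels)

# Hypothetical, imbalanced label sets for illustration.
train = ["spam"] * 2 + ["ham"] * 8
test = ["ham"] * 7 + ["spam"] * 3
majority, baseline = zero_r_accuracy(train, test)
print(majority, baseline)
```

On this toy data a classifier that reports 70% accuracy has learned nothing beyond the class distribution, which is why the baseline helps when interpreting other results.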

Alternatives to (or ways to speed up) the Acts_As_Recommendable plugin for Ruby on Rails

Hi all, I am currently using the Acts_as_recommendable plugin available here. It uses the Pearson correlation coefficient to find recommendations, which is pretty much exactly what I want. The problem, however, is scale. With more than 2000 or so items, the plugin slows considerably (with 5000 items, I see load times of about a min...

machine-learning

recommendation-engine
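For reference, the Pearson correlation coefficient mentioned in the question above can be sketched in a few lines. This is not the plugin's code; the function name and the rating vectors are assumptions for illustration, and returning 0.0 for a constant vector is just one possible convention.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length rating vectors."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat constant vectors as uncorrelated
    return cov / math.sqrt(var_x * var_y)

# Hypothetical ratings given by the same users to two items.
item_a = [1, 2, 3]
item_b = [2, 4, 6]
print(pearson(item_a, item_b))
```

Computing this for every pair of items is quadratic in the number of items, which is consistent with the slowdown described.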

Using R and Weka: how can I use meta-algorithms along with the n-fold evaluation method?

Here is an example of my problem:

library(RWeka)
iris <- read.arff("iris.arff")

Perform n-fold evaluation to obtain the proper accuracy of the classifier:

m <- J48(class~., data=iris)
e <- evaluate_Weka_classifier(m, numFolds = 5)
summary(e)

The results provided here are obtained by building the model with a part of the dataset and testing it with ...

machine-learning
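To make the n-fold idea in the question above concrete, here is a generic sketch of how the data is split so that every sample is tested exactly once by a model that never saw it during training. It is plain Python, not RWeka, and it does not claim to reproduce Weka's own fold assignment; the function name and the seed are illustrative assumptions.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal parts
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 150 samples (the size of the iris data) split into 5 folds.
splits = list(k_fold_indices(150, 5))
for train, test in splits:
    print(len(train), len(test))
```

Any model wrapped around this loop, including a meta-algorithm, has to be rebuilt on each `train` part before being scored on the matching `test` part.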

Algorithm for text classification

Hello. I have millions of short (up to 30 words) documents which I need to split into several known categories. It's possible that a document matches several of the categories (seldom, but possible). It's also possible that a document doesn't match any of the categories (also seldom). I also have millions of documents which have already...

artificial-intelligence

machine-learning

text-processing

Please help me choose the right classifier

Hi all, I am facing a problem selecting the correct classifier for my data-mining task. I am labeling webpages using a statistical method on a 1-4 scale, 1 being the poorest and 4 being the best. Previously, I used an SVM to train the system, since I was using a binary (1, 0) label then. But now, since I switched to this 4-class ...

artificial-intelligence

machine-learning

Image classification in python

I'm looking for a method of classifying scanned pages that consist largely of text. Here are the particulars of my problem. I have a large collection of scanned documents and need to detect the presence of certain kinds of pages within these documents. I plan to "burst" the documents into their component pages (each of which is an ind...

image-processing

machine-learning

barcode-scanner

Is this classification result acceptable?

Hi all, I have a very simple linear classification problem: work out a linear classifier for the following three classes of points: Class 1: points (0,1) (1,0) Class 2: points (-1,0) (1,0) Class 3: points (0,-1) (1,-1) I manually used a random initial weight [ 1 0,0 1] (a 2*2 matrix) and a random initial bias...

artificial-intelligence

machine-learning

Untrained Sentiment Analysis

Hi, I've been reading a lot of articles that explain the need for an initial set of texts that are classified as either 'positive' or 'negative' before a sentiment analysis system will really work. My question is: Has anyone attempted just doing a rudimentary check of 'positive' adjectives vs 'negative' adjectives, taking into account a...

machine-learning

natural-language
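The rudimentary check described in the question above might look roughly like the sketch below. The word lists are tiny hypothetical placeholders, and the negation handling is an assumption added for illustration, since the excerpt is cut off before saying what should be taken into account.

```python
# Hypothetical, deliberately tiny lexicons; a real list would be much larger.
POSITIVE = {"good", "great", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "terrible", "sad"}
NEGATIONS = {"not", "never", "no"}

def lexicon_score(text):
    """Count positive minus negative words; a negation flips only the next word."""
    score = 0
    negate = False
    for raw in text.lower().split():
        word = raw.strip(".,!?;:")
        if word in NEGATIONS:
            negate = True
            continue
        polarity = (word in POSITIVE) - (word in NEGATIVE)  # +1, -1 or 0
        if polarity:
            score += -polarity if negate else polarity
        negate = False  # negation does not carry past the following word
    return score

print(lexicon_score("The food was great!"))
print(lexicon_score("The service was not good."))
```

A score above zero would be read as positive and below zero as negative. No training set is needed, but sarcasm, longer negation scopes, and domain-specific vocabulary are not handled.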

Are there any open source Hierarchical Temporal Memory libraries?

I'm potentially interested in using the Hierarchical Temporal Memory heuristic to solve a research problem I am working on. Some more details about it can be found here: http://en.wikipedia.org/wiki/Hierarchical_temporal_memory Are there any open source libraries for this? (I'm fairly open to languages, although C++, Java or Haskell is ...

machine-learning

Is there a Java alternative to the Bayesian Belief Network framework "Infer.NET"?

Is there a Java alternative to the Bayesian Belief Network framework Infer.NET? Preferably it would be scalable (online learning for large datasets), well supported (last updated since 2010), open source, and make it easy to write the network structure. So, all the features of Infer.NET. ...

machine-learning

bayesian-networks

belief-propagation

Can an SVM learn incrementally?

I am using a multi-dimensional SVM classifier (SVM.NET, a wrapper for libSVM) to classify a set of features. Given an SVM model, is it possible to incorporate new training data without having to recalculate on all previous data? I guess another way of putting it would be: is an SVM mutable? ...

machine-learning

Is there some .NET machine learning library that could, for example, suggest tags for a question?

Just to use it as an example, StackOverflow users already associated tags to questions for a lot of questions. Is there a .NET machine learning library that could use this historic data to 'learn' how to associate tags to newly created questions and suggest them to the user? ...

machine-learning

The effect of Decision Tree Pruning

Hi all, suppose I build a decision tree A, such as ID3, from a training and validation set, but A is unpruned. At the same time, I have another decision tree B, also ID3, generated from the same training and validation set, but B is pruned. Now I test both A and B on a future unlabeled test set. Is it always the case that the pruned tree w...

artificial-intelligence

machine-learning

Help me understand linear separability in a binary SVM

I'm cross-posting this from math.stackexchange.com because I'm not getting any feedback and it's a time-sensitive question for me. My question pertains to linear separability with hyperplanes in a support vector machine. According to Wikipedia: ...formally, a support vector machine constructs a hyperplane or set of hyperplane...

machine-learning

Adding Laplacian smoothing to Biopython

Hi, I am trying to add Laplacian smoothing support to Biopython's Naive Bayes code [1] for my bioinformatics project. I have read many documents about the Naive Bayes algorithm and Laplacian smoothing, and I think I got the basic idea, but I just can't integrate this with that code (actually I cannot see which part I will add 1 -laplacian num...

machine-learning
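As background for the smoothing question above, add-one (Laplace) smoothing replaces the raw estimate count/total with (count + 1) / (total + number of possible values), so a value never seen with a class no longer gets probability zero. The sketch below is generic Python, not Biopython's Naive Bayes code, and the names and counts are illustrative assumptions.

```python
def smoothed_probability(count, total, num_values, k=1):
    """Estimate P(value | class) with add-k smoothing (k=1 is Laplace smoothing)."""
    return (count + k) / (total + k * num_values)

# Hypothetical counts of the four nucleotides observed within one class.
counts = {"A": 0, "C": 3, "G": 7, "T": 0}
total = sum(counts.values())
probs = {
    base: smoothed_probability(c, total, num_values=len(counts))
    for base, c in counts.items()
}
print(probs)
```

The smoothed estimates still sum to one, and the unseen values "A" and "T" each get 1/14 instead of 0.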

Medical information extraction using Python

Hello there, I am a nurse and I know Python, but I am not an expert; I have just used it to process DNA sequences. We have hospital records written in human languages, and I am supposed to insert these data into a database or CSV file, but they are more than 5000 lines and this can be very hard. All the data are written in a consistent format let me s...

machine-learning

information-extraction

Mass Point, Dirac Delta in Dirichlet Processes

When dealing with Dirichlet Processes, according to [Teh, 2007], a DP is defined by a base probability distribution H and a scale factor "alpha". According to the stick-breaking construction, the random draws G from a DP, G ~ DP(alpha, H), are given by: G = sum(pi_k * delta_theta_k) over k from 1 to infinity. pi_k are ordered draws from a Beta Distribu...

machine-learning
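For the Dirichlet process question above, the weights pi_k of the standard stick-breaking construction can be simulated directly: draw beta_k ~ Beta(1, alpha) and set pi_k to beta_k times the length of stick left after the earlier breaks. This is a truncated sketch under that standard definition; the excerpt is cut off, so its exact notation may differ, and the function name, truncation level, and seed are assumptions.

```python
import random

def stick_breaking_weights(alpha, num_sticks, seed=0):
    """Return the first num_sticks weights pi_k and the unbroken remainder."""
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of stick not yet assigned to any atom
    for _ in range(num_sticks):
        beta_k = rng.betavariate(1.0, alpha)  # fraction broken off this step
        weights.append(remaining * beta_k)    # pi_k = beta_k * prod_{j<k}(1 - beta_j)
        remaining *= 1.0 - beta_k
    return weights, remaining

weights, rest = stick_breaking_weights(alpha=2.0, num_sticks=50)
print(sum(weights) + rest)
```

Each weight pi_k is the mass that G places on the point theta_k, which is the sense in which G is a sum of Dirac deltas (mass points) at atoms drawn from H.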

MLE for Naive Bayes in R

I am using the naiveBayes function from the e1071 library in R as shown below:

model <- naiveBayes(Species ~ ., data = iris)
pred <- predict(model, iris[,])

My question is: how can I get the maximum likelihood estimates for the conditional probability distributions of this model? ...

machine-learning
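To show what a maximum likelihood estimate of a class-conditional distribution looks like for a numeric feature, the sketch below assumes a Gaussian per class and computes the MLE of its mean and variance. It is plain Python rather than R, it does not inspect the e1071 model object, and the function name and data are made up for illustration.

```python
from collections import defaultdict

def gaussian_mle_by_class(values, labels):
    """Return {label: (mean, variance)} using maximum likelihood estimates."""
    grouped = defaultdict(list)
    for value, label in zip(values, labels):
        grouped[label].append(value)
    estimates = {}
    for label, vs in grouped.items():
        n = len(vs)
        mean = sum(vs) / n
        variance = sum((v - mean) ** 2 for v in vs) / n  # MLE divides by n, not n - 1
        estimates[label] = (mean, variance)
    return estimates

# Hypothetical feature values for two classes.
values = [1.0, 3.0, 10.0, 14.0]
labels = ["a", "a", "b", "b"]
estimates = gaussian_mle_by_class(values, labels)
print(estimates)
```

Note that the MLE variance differs from the usual sample variance, which divides by n - 1.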

C++ decision tree with pruning

Hello. Can you recommend a good C++ decision tree class with support for continuous features and pruning (it's very important)? I'm writing a simple classifier (two classes) using 9 features. I've been using Waffles recently, but it looks like the tree is overfitting, so I get precision around 82% but recall around 51%, which is unacceptable. Wa...

machine-learning
