classifier

Discrete and Continuous Classifier on Sparse Data

I'm trying to classify an example, which contains discrete and continuous features. Also, the example represents sparse data, so even though the system may have been trained on 100 features, the example may only have 12. What would be the best classifier algorithm to use to accomplish this? I've been looking at Bayes, Maxent, Decision T...

Sentiment analysis with NLTK python for sentences using sample data or webservice?

I am embarking upon an NLP project for sentiment analysis. I have successfully installed NLTK for Python (seems like a great piece of software for this). However, I am having trouble understanding how it can be used to accomplish my task. Here is my task: I start with one long piece of data (let's say several hundred tweets on the subje...

Bag of words Classification

I need to find training words and their classifications, for simple classes such as Sports, Entertainment and Politics. Where can I find the words and their classifications? I know many universities have done bag-of-words classification. Is there any repository of training examples? ...

Please help me interpret the Naive Bayes result in Weka

Can anybody please help me interpret the following result generated in Weka for classification using Naive Bayes? Please explain clearly what Normal Distribution, Mean, StandardDev, WeightSum and Precision mean. I am new to Weka. ** Naive Bayes Classifier Class Normal: Prior probability = 0.5 1374195_at: Nor...

Finding the closest match

I have an object with a set of parameters like: var obj = new {Param1 = 100; Param2 = 212; Param3 = 311; param4 = 11; Param5 = 290;} On the other side, I have a list of objects: var obj1 = new {Param1 = 1221; Param2 = 212; Param3 = 311; param4 = 11; Param5 = 290;} var obj3 = new {Param1 = 35; Param2 = 11; Param3 = 319; param4 = 211; Pa...

Rare Event Detection

Is there any good reference on algorithms that people use for rare event detection? Also, how is the time factor taken into account? If I have a case where successive data points tell something (t_1 to t_n), how can one factor this into a normal machine learning scenario? Any pointers will be appreciated. ...

How to use NLP to separate unstructured text content into distinct paragraphs?

The following unstructured text has three distinct themes -- Stallone, Philadelphia and the American Revolution. But which algorithm or technique would you use to separate this content into distinct paragraphs? Classifiers won't work in this situation. I also tried to use Jaccard Similarity analyzer to find distance between successive ...

Help with representing textual data in a format suitable for SVMs, more specifically libsvm

Hi, my problem at hand is that I need to be able to distinguish agricultural web pages from non-agricultural web pages. This is oriented towards building a focused crawler that only crawls and indexes mostly agricultural pages. I need advice from anyone who's experienced in working with SVMs. Would considering the SVM classifier be appr...

Measuring rectangles at odd angles with a low resolution input matrix (Linear regression classification?)

I'm trying to solve the following problem: Given an input of, say,

0000000000000000
0011111111110000
0011111111110000
0011111111110000
0000000000000000
0000000111111110
0000000111111110
0000000000000000

I need to find the width and height of all rectangles in the field. The input is actually a single column at a time (think like a sc...

Incremental Decision Tree C++ Implementation

Does anyone know of an incremental implementation of a decision tree classifier, one that could generate an optimal decision tree classifier when you add a new instance to the training set, with low computation and as quickly as possible, based on the existing decision tree classifier? In other words, I have an optimal decision tree classifier of set A, w...

PHP implementation of Bayes classifier: Assign topics to texts

In my news page project, I have a database table news with the following structure:
- id: [integer] unique number identifying the news entry, e.g.: *1983*
- title: [string] title of the text, e.g.: *New Life in America No Longer Means a New Name*
- topic: [string] category which should be chosen by the classifier, e.g.: *Sports*
...

Using my weka classifier in MOA

Hi, I have created my own classifier in Weka and it works fine with the Weka GUI. I am trying to use it in MOA by choosing the Weka classifier and then my classifier. My classifier appears in the MOA GUI under Weka classifiers, but if I choose it I get a "Problems with option: baseLearner" error. Is it not possible to use my new Weka classifi...

Looking for open source naive Bayesian Classifier in C# for a Twitter sentiment analysis project.

I've found a similar project here: http://stackoverflow.com/questions/573768/sentiment-analysis-for-twitter-in-python . However, I'm working in C# and need to use a naive Bayesian classifier that is open source in the same language. Unless someone can shed light on how I can utilize a Python Bayesian classifier to achieve the same goals....

Artifacts with classifiers not copying to local repository

I am using Maven version 2.0.7 and I am using the javadoc and source plugins to create additional artifacts for deploy. All of the generated artifacts are deploying correctly but it seems that when someone else builds they are only getting the specific artifact they specify. I don't want to have to add the source and javadoc artifacts as...