weka

What is the meaning of jitter in visualize tab of weka

In weka I load an arff file. I can view the relationship between attributes using the visualize tab. However I can't understand the meaning of the jitter slider. What is its purpose? ...

Question About Using Weka, the machine learning tool

I'm using the explorer feature of Weka for classification. So I have my .arff file, with 2 features of NUMERIC value, and my class is a binary 0 or 1 (eg {0,1}). Sample: @RELATION summary @ATTRIBUTE feature1 NUMERIC @ATTRIBUTE feature2 NUMERIC @ATTRIBUTE class {1,0} @DATA 23,11,0 20,100,1 2,36,0 98,8,1 ..... I load this .arff file,...

Resample Filter of WEKA - How to interpret the result

Dear all, I am currently struggling with a machine learning problem in which I have to deal with highly unbalanced data sets. That is, there are six classes ('1','2'...'6'). Unfortunately, there are, e.g., 150 examples/instances for class '1', 90 instances for class '2', and only 20 for class '3'. All other classes can't be "trained" since there ar...

Simulation in WEKA

Can Weka do simulations? I need to transform random subsets of values for a particular attribute and re-do analysis to figure out the variability for a particular coefficient. I guess I don't need Weka per se, just a Java implementation where I can specify the column and transform random subsets of the values repeatedly. ...

Weka normalizing columns

Hi all, I have an ARFF file containing 14 numerical columns. I want to normalize each column separately, that is, replace each value in a column with (actual_value - min(this_column)) / (max(this_column) - min(this_column)). Hence, all values in a column will be in the range [0, 1]. The min and max values from a co...
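The per-column formula in the excerpt above can be sketched in plain Python. This is only an illustration of the arithmetic, not Weka code: the function names `min_max_normalize` and `normalize_columns` are hypothetical, and mapping a constant column to 0.0 is an assumption made here to avoid dividing by zero. Inside Weka itself, the unsupervised attribute filter `Normalize` is the usual way to scale numeric attributes to [0, 1].

```python
def min_max_normalize(column):
    """Scale one column to [0, 1] using (value - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column has no spread, so map every value to 0.0.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


def normalize_columns(rows):
    """Normalize each column of a row-major table independently."""
    columns = list(zip(*rows))                      # transpose rows -> columns
    scaled = [min_max_normalize(col) for col in columns]
    return [list(row) for row in zip(*scaled)]      # transpose back to rows


if __name__ == "__main__":
    # Hypothetical two-column sample; each column gets its own min and max.
    data = [[23, 11], [20, 100], [2, 36], [98, 8]]
    for row in normalize_columns(data):
        print(row)
```

Each column is scaled with its own minimum and maximum, so the smallest value in every column becomes 0 and the largest becomes 1.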

WEKA Tutorials / Examples for a Newbie

In a follow-up to this answer, I want to ask if any of you know any good (and, more importantly, easy to understand) tutorials and/or examples of data mining with the Weka toolkit. I've been very interested in data mining ever since I first heard of it and the things it can do. I also have some experiments I'd like to do with some ...

java program that calls weka functionalities

Please show a Java program that loads an ARFF file into Weka and then ranks its attributes using InfoGainAttributeEval. I need to incorporate the Weka functionality in the Java program. Please help me out with the program. ...

Choose the right classification algorithm. Linear or non-linear?

Hi, I find this question a little tricky. Maybe someone knows an approach to answering it. Imagine that you have a dataset (training data) and you don't know what it is about. Which features of the training data would you look at in order to choose a classification algorithm for it? Can we say anything about whether we should ...

Optimizing SMO with RBFKernel (C and gamma)

There are two parameters while using RBF kernels with Support Vector Machines: C and γ. It is not known beforehand which C and γ are best for a given problem; consequently, some kind of model selection (parameter search) must be done. The goal is to identify a good (C, γ) so that the classifier can accurately predict unknown data (i.e., testin...
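The parameter search mentioned in the excerpt above is often done as a grid search over exponentially spaced values. The following is a minimal sketch under stated assumptions, not Weka's own API: `grid_search` is a hypothetical helper, the exponent ranges are illustrative choices, and `toy_score` is a made-up stand-in for whatever cross-validated accuracy you would actually measure for each (C, γ) pair.

```python
import itertools
import math


def grid_search(score, c_exponents, gamma_exponents):
    """Try every (C, gamma) = (2**i, 2**j) pair and keep the best-scoring one.

    `score` is any callable taking (C, gamma) and returning a number where
    higher is better, e.g. cross-validated accuracy of an SVM.
    """
    best = None
    for c_exp, g_exp in itertools.product(c_exponents, gamma_exponents):
        c, gamma = 2.0 ** c_exp, 2.0 ** g_exp
        current = score(c, gamma)
        if best is None or current > best[0]:
            best = (current, c, gamma)
    return best


def toy_score(c, gamma):
    # Hypothetical stand-in for a real evaluation; it peaks at C=8, gamma=0.125.
    return -((math.log2(c) - 3) ** 2 + (math.log2(gamma) + 3) ** 2)


# Assumed exponent ranges, stepping by 2 to keep the grid coarse.
best_score, best_c, best_gamma = grid_search(
    toy_score, range(-5, 16, 2), range(-15, 4, 2)
)
print(best_c, best_gamma)
```

In practice `toy_score` would be replaced by a function that trains the classifier with the given parameters and returns its cross-validation accuracy; a coarse grid like this one is commonly followed by a finer search around the best cell.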

Where can I find FuzzyGK or other fuzzy clustering algorithms to use in Weka?

I'm learning Weka and I'm trying to figure out how to do fuzzy clustering. I found an old site that listed 2 fuzzy clusterers (by Frank Weber & Robin Senge), but I could not add their .jar file to the existing .jar file to use the algorithms. Does anyone know where I can find fuzzy clustering algorithms for Weka? If not, is there an...

How to Debug Weka?

Hi, I am trying to implement a new filter for Weka. I would like to know what I should do to be able to debug Weka, so that I can see what's wrong with my code, since I get exceptions when I try to run the filter in Weka. Currently I am using JOptionPane.showMessageDialog(null, ...); to print the values of variables, to try ...

What is the best way to generate fake data for a classification problem?

I'm working on a project and I have a subset of a user's keystroke timing data. This means that the user makes n attempts, and I will use the recorded timing data from these attempts in various kinds of classification algorithms to verify whether future login attempts are made by the user or by another person. (Simply, I can say that th...

Beginner's resources/introductions to classification algorithms.

Hi, everybody. I am entirely new to the topic of classification algorithms, and need a few good pointers about where to start some "serious reading". I am right now in the process of finding out, whether machine learning and automated classification algorithms could be a worthwhile thing to add to some application of mine. I already sca...

Text mining with PHP

Hi, I'm doing a project for a college class I'm taking. I'm using PHP to build a simple web app that classifies tweets as "positive" (or happy) and "negative" (or sad) based on a set of dictionaries. The algorithm I'm thinking of right now is a Naive Bayes classifier or a decision tree. However, I can't find any PHP library that helps me do...

How to import XML files in WEKA

I want to import a bunch of XML data into Weka. Is there a straightforward solution or a tutorial, or do I have to manually convert it to CSV or ARFF format? ...

Sentiment analysis with NLTK python for sentences using sample data or webservice?

I am embarking upon an NLP project for sentiment analysis. I have successfully installed NLTK for Python (seems like a great piece of software for this). However, I am having trouble understanding how it can be used to accomplish my task. Here is my task: I start with one long piece of data (let's say several hundred tweets on the subje...

How to interpret weka classification?

How can we interpret the classification result in weka using naive bayes? How is mean, std deviation, weight sum and precision calculated? How is kappa statistic, mean absolute error, root mean squared error etc calculated? What is the interpretation of the confusion matrix? ...

Please help me interpret the Naive Bayes result in Weka

Could anybody please help me interpret the following result generated in Weka for classification using Naive Bayes? Please explain clearly what Normal Distribution, Mean, StandardDev, WeightSum and Precision are. I am new to Weka. ** Naive Bayes Classifier Class Normal: Prior probability = 0.5 1374195_at: Nor...

Filtering Attributes with Weka

Hi everyone! I have a simple question about filtering attributes in WEKA. Let's say I have 500 attributes, 30 classes, and 100 samples for each class, which equals 3000 rows and 500 columns. This causes time and memory problems, as you can guess. How do I filter attributes that occur only once or twice (or n times) in 3000 rows? And is it a...

Using MOA to classify new examples?

I'm trying to use the java machine learning library MOA to train on a training data stream, then predict classes for a test data stream. The first part works fine, using (for example) java -cp .:moa.jar:weka.jar -javaagent:sizeofag.jar moa.DoTask "LearnModel -l MajorityClass -s (ArffFileStream -f atrain.arff -c -1) -O amodel.moa" But ...