weka

How to purposely overfit Weka tree classifiers?

I have a binary class dataset (0 / 1) with a large skew towards the "0" class (about 30000 vs 1500). There are 7 features for each instance, no missing values. When I use the J48 or any other tree classifier, I get almost all of the "1" instances misclassified as "0". Setting the classifier to "unpruned", setting minimum number of inst...

Eclipse - Setting .classpath file for existing project

I have a Java project: the working folder from someone else's Eclipse project (it was a Repast Simphony project, I think). In my Eclipse I created a new Java project and told it to use the existing code, so it seems to have brought in all the code. However, after loading the project I get this error: Project 'My Project' is missing re...

Reusing Weka Code to Parse ARFF Files

Has anyone done this? Is there any documentation on how to use this parser module? I've looked through the code, but it's not clear to me how to actually use the data after it's been parsed. The file src\main\java\weka\core\converters\ArffLoader.java (which I assume is where the ARFF parsing happens) has these instructions: Typica...

Strange weka instance results

Hi, strange results come up while using a J48 tree. I need to classify a vector of 48 features, which works very well, but when I tried to "optimize", I ran into strange results. I have a method classify: public boolean classify(double feature1, double feature2, double[] featureVec ) { Instance toBeClassified = new Instanc...

NetBeans - How to import a class from an external library

I have a Java project in NetBeans and I want to use some classes from Weka within my project. I added the file C:\Program Files\Weka-3-7\weka-src.jar to my Libraries following the instructions here (project, properties, libraries ..). So how do I now import the classes I want? I tried importing like this: import weka.core.converters...

Why does WEKA Random Forest's accuracy end up at 98% despite changes in the number of trees?

I've built a web page classification system with Random Forest from WEKA. I don't know why WEKA Random Forest's accuracy ends up at 98% despite changes in the number of trees. I am still a novice in this field, and I would also like to know how accuracy is calculated in it. ...

Porter Stemmer and Weka

I am using Weka with the Porter stemmer provided in the Snowball package. Everything works fine if I run my application within Eclipse, but as soon as I export it as a runnable jar (with all the libraries included) Weka says: Stemmer 'porter' unknown! How could I fix that? ...

How is the bandwidth calculated in the Weka KernelEstimator class?

I am using Weka to calculate the probability for a given dataset. More specifically, I am using the KernelEstimator class. For good density estimation results the choice of the bandwidth parameter is crucial, but I have not been able to find out how the bandwidth parameter is calculated. The kernel function being used is a simple Gaussian...

Using Weka Java Code - How to Convert CSV (without header row) to ARFF Format?

I'm using the Weka Java library to read in a CSV file and convert it to an ARFF file. The problem is that the CSV file doesn't have a header row, only data. How do I assign attribute names after I bring in the CSV file? (all the columns would be string data types) Here is the code I have so far: CSVLoader loader = new CSVLoader(...

How to export a clustering result from Weka

Hi, I'm new to Weka and am using it to analyse user attributes based on user ID. The raw data may look like this, [userid->game coin] 10001-> 100 10002-> 501 ... I am trying to do K-Means clustering on [game coin] and sort the data into some groups. Is it possible to save the sorted [userid] results, just as some non-overlapped c...

Using my Weka classifier in MOA

Hi, I have created my own classifier in Weka and it works fine with the Weka GUI. I am trying to use it in MOA by choosing the Weka classifier option and then my classifier. My classifier appears in the MOA GUI under Weka classifiers, but if I choose it I get a "Problems with option: baseLearner" error. Is it not possible to use my new weka classifi...

Interpreting Naive Bayes results

I started using the NaiveBayes/Simple classifier for classification (Weka), but I have some problems understanding what happens while training on the data. The data set I'm using is weather.nominal.arff. When I use the training set as the test option, the classifier result is : Correctly Classified Instances 13 - 92.8571 % Incorrectly Classif...

Classifier performance on subset of data

I'm using Weka to perform classification on a set of labelled web pages, and measuring classifier performance with AUC. I have a separate six-level factor that is not used in classification, and I'd like to know how well classifiers perform on each level of the factor. What techniques or measures should I use to test classifier performa...

Beginner question on investigating samples in Weka

Hello there, I've just used Weka to train my SVM classifier under the "Classify" tab. Now I want to investigate further which data samples are misclassified. I need to study their pattern, but I don't know where to find this in Weka. Could anyone give me some help please? Thanks in advance. ...

How to determine the most informative feature in a tree learned by Weka

Hi there. I used Weka to train a J48 classifier, and it returned a textual representation of the tree. Now, if I want to determine which feature is the most informative, how could I proceed? Any idea is welcome. Thanks in advance. ...

Using R and Weka: how can I use meta-algorithms along with the n-fold evaluation method?

Here is an example of my problem library(RWeka) iris <- read.arff("iris.arff") Perform nfolds to obtain the proper accuracy of the classifier. m<-J48(class~., data=iris) e<-evaluate_Weka_classifier(m,numFolds = 5) summary(e) The results provided here are obtained by building the model with a part of the dataset and testing it with ...

Attribute Selection in Weka

Hi, everyone! I am a new student in the data mining world. I need some assistance with my data mining project. Scenario: I have 64 attributes in my data set. I have been asked to use Weka in my project. My question is, which attribute selection algorithm should I use in Weka in order to extract the features? There are several subset evaluat...