machine-learning

Has anyone used PedestrianDetectionHoG_NET2005? I need to know the configuration process

This object detection software is good at detecting and recognizing objects. I want to learn it, so I downloaded it along with the libraries it requires (Blitz 0.9, Boost 1.41, and OpenCV 2.0), but when I run the software it has too many problems. I think the configuration may be wrong and is causing the errors. I want to know who has run the softwa...

Understanding the Neural Network Backpropagation

Update: a better formulation of the issue. I'm trying to understand the backpropagation algorithm with an XOR neural network as an example. For this case there are 2 input neurons + 1 bias, 2 neurons in the hidden layer + 1 bias, and 1 output neuron.

  A    B    A XOR B
  1    1    -1
  1   -1     1
 -1    1     1
 -1   -1    -1

I'm using the...
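For readers following along, here is a minimal sketch of one backpropagation pass for the network shape described above (2 inputs + bias, 2 hidden neurons + bias, 1 output). The question's own activation function is cut off in the excerpt, so tanh, the squared-error loss, the learning rate, and all function names here are assumptions chosen for illustration, not the asker's setup.

```python
import math
import random

# Truth table from the question, using the -1/1 encoding.
XOR_DATA = [((1, 1), -1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]


def init_weights(seed=0):
    """Small random weights; the seed and the range are arbitrary choices."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # 2 neurons x (2 inputs + bias)
    output = [rng.uniform(-1, 1) for _ in range(3)]                      # 2 hidden activations + bias
    return hidden, output


def forward(hidden, output, x):
    """Return the biased input, the biased hidden activations, and the output."""
    xb = list(x) + [1.0]                                    # append the bias input
    h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in hidden]
    hb = h + [1.0]                                          # append the hidden-layer bias
    y = math.tanh(sum(w * v for w, v in zip(output, hb)))
    return xb, hb, y


def gradients(hidden, output, x, target):
    """Backpropagate the loss 0.5 * (y - target)^2 for a single sample."""
    xb, hb, y = forward(hidden, output, x)
    delta_out = (y - target) * (1 - y * y)                  # tanh'(z) = 1 - tanh(z)^2
    grad_output = [delta_out * v for v in hb]
    grad_hidden = []
    for j in range(2):
        # Error signal reaching hidden neuron j through its outgoing weight.
        delta_h = delta_out * output[j] * (1 - hb[j] ** 2)
        grad_hidden.append([delta_h * v for v in xb])
    return grad_hidden, grad_output


def train(epochs=5000, lr=0.1, seed=0):
    """Plain per-sample gradient descent; convergence is not guaranteed."""
    hidden, output = init_weights(seed)
    for _ in range(epochs):
        for x, t in XOR_DATA:
            gh, go = gradients(hidden, output, x, t)
            for j in range(2):
                for i in range(3):
                    hidden[j][i] -= lr * gh[j][i]
            for i in range(3):
                output[i] -= lr * go[i]
    return hidden, output
```

With only two hidden neurons, training can stall in a poor local minimum for some initial weights, so trying a few seeds is a reasonable first debugging step.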

Detecting patterns in waves

Hello all! I'm trying to read an image of an electrocardiogram and detect each of the main waves in it (P wave, QRS complex and T wave). Now I can read the image and get a vector like (4.2; 4.4; 4.9; 4.7; ...) representative of the values in the electrocardiogram, which is half of the problem. I need an algorithm that can walk thr...
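As a possible starting point, the sketch below walks through such a vector and reports local maxima above a height threshold. This is only a naive peak finder: the function name and the threshold are hypothetical, and real ECG signals usually need filtering and baseline correction first. The tallest peaks would typically correspond to the R wave of the QRS complex; P and T waves could then be searched for in windows before and after each R peak.

```python
def find_peaks(signal, min_height):
    """Return indices of local maxima whose value is at least min_height."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= min_height:
            peaks.append(i)
    return peaks


# Made-up samples in the style of the vector from the question.
samples = [4.2, 4.4, 4.9, 4.7, 4.1, 9.5, 4.0]
print(find_peaks(samples, min_height=8.0))
```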

Machine Learning and Natural Language Processing

Assume you know a student who wants to study Machine Learning and Natural Language Processing. What introductory subjects would you recommend? Example: I'm guessing that knowing Prolog and Matlab might help him. He also might want to study Discrete Structures*, Calculus, and Statistics. *Graphs and trees. Functions: properties, recur...

Unit Testing Machine Learning Code

I am writing a fairly complicated machine learning program for my thesis in computer vision. It's working fairly well, but I need to keep trying out new things and adding new functionality. This is problematic because I sometimes introduce bugs when I am extending the code or trying to simplify an algorithm. Clearly the correct thin...

How to approach machine learning problems with high dimensional input space?

How should I approach a situation when I try to apply some ML algorithm (classification, to be more specific, SVM in particular) over some high dimensional input, and the results I get are not quite satisfactory? 1, 2 or 3 dimensional data can be visualized, along with the algorithm's results, so you can get the hang of what's going on...

Classifying type samples from image files

Which approach would you suggest for automatically classifying type found in images? The samples are likely large, with black text on a white background. The categories are defined here, with some examples on each (Google Books link): http://bit.ly/9Mnu7P This is an extended version of the VOX-ATypI classification system. My initial t...

Machine learning in OCaml or Haskell?

I'm hoping to use either Haskell or OCaml on a new project because R is too slow. I need to be able to use support vector machines, ideally separating out each execution to run in parallel. I want to use a functional language and I have the feeling that these two are the best so far as performance and elegance are concerned (I like Cl...

Training time and overfitting with gamma and C in libsvm

Hi, I am now using libsvm for a support vector machine classifier with a Gaussian kernel. On its website, it provides a Python script, grid.py, to select the best C and gamma. I just wonder how training time and overfitting/underfitting change with gamma and C? Is it correct that: suppose C changes from 0 to +infinity, the trained model wi...
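To make the role of gamma concrete: the Gaussian (RBF) kernel in libsvm has the form exp(-gamma * |u - v|^2). The sketch below evaluates it for one fixed pair of points over a few gamma values; the helper name, the points, and the exponents are chosen purely for illustration. As gamma grows, the kernel value between distinct points shrinks toward 0, so each training point influences only its immediate neighborhood, which tends to push the model toward overfitting; very small gamma makes all points look alike and tends toward underfitting.

```python
import math


def rbf_kernel(u, v, gamma):
    """Gaussian kernel value exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


u, v = (0.0, 0.0), (1.0, 1.0)
for log2_gamma in (-15, -5, 0, 3):          # illustrative exponents only
    gamma = 2.0 ** log2_gamma
    print(f"gamma = 2^{log2_gamma}: K(u, v) = {rbf_kernel(u, v, gamma):.6f}")
```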

SVM Classification - minimum number of input sets for each class

I'm trying to build an app that detects which images on web pages are advertisements. Once I detect them, I won't allow them to be displayed on the client side. From the help I got here on Stack Overflow, I concluded that an SVM is the best approach for my goal, so I have coded an SVM and SMO myself. The dataset which I have got from ...

What is this Python code trying to do?

Hi, The following Python code traverses a 2D grid of (c, g) in some special order, which is stored in "jobs" and "job_queue". But I am not sure what kind of order it is after trying to understand the code. Is someone able to describe the order and explain the purpose of each function? Thanks and regards! impo...

Inter-rater agreement (Fleiss' Kappa, Krippendorff's Alpha etc) Java API?

I am working on building a Question Classification/Answering corpus as a part of my master's thesis. I'm looking at evaluating my expected answer type taxonomy with respect to inter-rater agreement/reliability, and I was wondering: Does anybody know of any decent (preferably free) Java API(s) that can do this? I'm reasonably certain all ...
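This is not a Java API, but for reference the Fleiss' kappa statistic itself is short to compute by hand. The sketch below is written in Python and assumes the ratings are already tabulated as a count matrix (subjects by categories) with the same number of raters per subject; the function name and input layout are choices made for this example.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for counts[i][j] = raters assigning subject i to category j.

    Every subject must have the same number of raters (at least 2).
    Raises ZeroDivisionError if all ratings fall into a single category.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Overall proportion of assignments made to each category.
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(n_categories)]
    # Observed agreement for each subject.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]

    p_bar = sum(p_i) / n_subjects            # mean observed agreement
    p_e = sum(p * p for p in p_j)            # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)
```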

WEKA Tutorials / Examples for a Newbie

In a follow-up to this answer I want to ask if any of you know any good (and more importantly easy to understand) tutorials and/or examples of data mining with the Weka toolkit. I've been very interested in data mining ever since I first heard of it and the things it can do, and I also have some experiments I'd like to do with some ...

Algorithm for deviations

Hi! Given a week's worth of integer data (40, 30, 25, 55, 5, 40, etc.), I have to raise an alert when a deviation from the norm happens (the '5' in the above case). An extra nice thing to have would be to actually learn if 5 is a normal event for that day of the week. Do you know an implementation in Ruby that is meant for th...
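One simple way to cover the "is 5 normal for that day of the week" part is to keep a separate history per weekday and compare each new value with that weekday's mean and standard deviation. The sketch below is in Python rather than Ruby; the class name, the threshold of 3 standard deviations, and the minimum history length are assumptions for illustration and would need tuning on real data.

```python
from collections import defaultdict
from statistics import mean, stdev


class WeekdayDeviationDetector:
    """Flags values far from the historical mean for the same weekday."""

    def __init__(self, threshold=3.0, min_history=4):
        self.threshold = threshold          # allowed distance in standard deviations
        self.min_history = min_history      # observations needed before alerting
        self.history = defaultdict(list)    # weekday -> past values

    def is_deviation(self, weekday, value):
        """Check value against this weekday's history, then record it."""
        past = self.history[weekday]
        alert = False
        if len(past) >= self.min_history:
            mu, sigma = mean(past), stdev(past)
            if sigma == 0:
                alert = value != mu
            else:
                alert = abs(value - mu) > self.threshold * sigma
        past.append(value)                  # note: flagged values are recorded too
        return alert
```

Recording flagged values lets the detector eventually accept a new normal, at the cost of inflating the spread for a while; skipping them is the alternative design.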

Workflow for developing number crunching applications on amazon ec2/S3

Much has been written about deploying data crunching applications on EC2/S3, but I would like to know, what is the typical workflow for developing such applications? Let's say I have 1 TB of time series data to begin with and I have managed to store this on S3. How would I write applications and do interactive data analysis to build m...

Best Java Open Source Text Mining Framework

Hello everyone, I want to know what the best open-source Java-based framework for text mining is, one that supports both machine learning and dictionary methods. I'm using Mallet, but there is not much documentation and I do not know if it will fit all my requirements. Thanks in advance. Best regards, ukrania ...

Recommended anomaly detection technique for simple, one-dimensional scenario?

I have a scenario where I have several thousand instances of data. The data itself is represented as a single integer value. I want to be able to detect when an instance is an extreme outlier. For example, with the following example data: a = 10, b = 14, c = 25, d = 467, e = 12. d is clearly an anomaly, and I would want to perform a spec...
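For a single integer feature, one robust option is the modified z-score based on the median and the median absolute deviation (MAD), which is not distorted by the outlier itself the way a mean and standard deviation would be. The sketch below applies it to the example values; the function name is made up here, and the cutoff of 3.5 is a commonly used default, not something taken from the question.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values equal the median; treat any other value as an outlier.
        return [v for v in values if v != med]
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if abs(0.6745 * (v - med) / mad) > threshold]


data = [10, 14, 25, 467, 12]    # a, b, c, d, e from the question
print(mad_outliers(data))
```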

First Order Logic Engine

I'd like to create an application that can do simple reasoning using first order logic. Can anyone recommend an "engine" that can accept an arbitrary number of FOL expressions, and allow querying of those expressions (preferably accessible via Python)? ...

Help Understanding Cross Validation and Decision Trees

I've been reading up on Decision Trees and Cross Validation, and I understand both concepts. However, I'm having trouble understanding Cross Validation as it pertains to Decision Trees. Essentially Cross Validation allows you to alternate between training and testing when your dataset is relatively small to maximize your error estimation...
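A sketch of how the two ideas usually fit together: for each fold a fresh model is built on the training part and scored on the held-out part, and the per-fold errors are averaged. The trees built during cross-validation are thrown away; the average estimates how well the tree-building procedure generalizes, and a final tree is then typically trained on all the data. The code below uses a trivial majority-class classifier as a stand-in for a decision tree learner, and every function name is hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds (needs k <= n_samples)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def cross_val_error(X, y, fit, predict, k=10):
    """Average misclassification rate over k folds."""
    fold_errors = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])   # new model per fold
        wrong = sum(predict(model, X[i]) != y[i] for i in test)
        fold_errors.append(wrong / len(test))
    return sum(fold_errors) / k


# Stand-in learner: always predicts the most frequent training label.
def fit_majority(X, y):
    return max(set(y), key=y.count)


def predict_majority(model, x):
    return model
```

Swapping `fit_majority` and `predict_majority` for a real decision tree learner leaves the cross-validation loop unchanged.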

How to use a cross validation test with MATLAB?

I would like to use 10-fold Cross-validation to evaluate a discretization in MATLAB. I should first consider the attributes and the class column. ...