machine-learning

How to calculate the slope of noisy time series data

I have a process that consumes multiple sources of live price data from the forex market and produces 2 streams of time series data as its output. The output is noisy (i.e. not smooth like sin or cos), and both streams are bounded between the values of 0 and 100. Is there an approach in machine learning or AI that can help me identify wh...

Order-issuing neural network?

Hello. I'm interested in writing certain software that uses machine learning, and performs certain actions based on external data. However I've run into a problem (that was always interesting to me) - how is it possible to write machine learning software that issues orders or sequences of orders? The problem is that as I understand it...

Entry level posts for machine-learning

Hi, I am looking for some entry level posts about machine learning. Can anyone suggest anything for somebody new to this subject? Cheers Paul ...

Ordinal classification packages and algorithms

I'm attempting to make a classifier that chooses a rating (1-5) for an item i. For each item i, I have a vector x containing about 40 different quantities pertaining to i. I also have a gold standard rating for each item. Based on some function of x, I want to train a classifier to give me a rating 1-5 that closely matches the gold sta...

Neural Network size for Animation system

I decided to go with a Neural Network in order to create behaviors for an animation engine that I have. The neural network takes in 3 vector3s and 1 Euler angle for every body part that I have. The first vector3 is the position, the second is its velocity, and the third is its angular velocity. The Euler angle is what rotation the body p...

Ontological databases for professions and skills

Where can I find a semantic ontology database of professions and skills? I need a database that would properly depict relations such as marketing->online marketing->search engine marketing ...

application of AI/neural networks/machine learning in stock market trading: looking for a book(s)

Hello. I'm looking for a book(s) about practical application of machine learning, artificial intelligence and neural networks to stock market trading (automated trading or as assistance to a human, mostly automatic trading). I'm not afraid of "heavy reading". What I'm interested in: 0. How can the problem (how to achieve goal dependin...

LGPL Machine Learning with Random Forest - C++

I am looking for a library with the following features: minimalistic, with Random Forest learning and classification; LGPL licensed; in C++; CMake build system (not compulsory). So far Waffles looks good; any other contenders? ...

Open Alternatives to Google Prediction API

A recent announcement by Google about the Google Prediction API sounded very interesting. It could be useful for a project that is coming up, and would probably do a better job than some custom code I was considering. However, there is some vendor lock-in. Google retain the trained model, and could later choose to overcharge me for it. ...

In OpenCV, what is the svm.predict parameter returnDFVal?

I am using OpenCV but I can't find anything in the documentation about what the parameter returnDFVal means in the predict method for support vector machines. Does anybody else know? ...

Is a KD-Tree a unique ordering of a given data set?

Given a set of data points, a kdtree is created over them, but is this kdtree a unique one? ...

how to cluster evolving data streams

Hi Guys, I want to incrementally cluster text documents reading them as data streams but there seems to be a problem. Most of the term weighting options are based on vector space model using TF-IDF as the weight of a feature. However, in our case IDF of an existing attribute changes with every new data point and hence previous clusterin...

How to test a Machine Learning or statistical NLP algorithm implementation pack?

Hi guys, I am working on testing several Machine Learning algorithm implementations, checking whether they work as efficiently as described in the papers and making sure they can add real power to our statistical NLP (Natural Language Processing) platform. Could you guys show me some methods for testing an algorithm implementation...

How to detect vulnerable/personal information in CVs programmatically (by means of syntax analysis/parsing etc...)

To make matters more specific: How to detect people's names (seems like a simple case of named entity extraction?) How to detect addresses: my best guess - find postcode (regexes); country and town names and take some text around them. As for phones, emails - they could probably be caught by various regexes + preprocessing. Don't care about...

Automatic text translation

What tools or web services are available for machine text translation? For example ENGLISH TEXT > SERVER or LIB > GERMAN TEXT. Libraries are also acceptable. Is the Google language API the only one? ...

interpreting Naive Bayes results

I started using the NaiveBayes/Simple classifier for classification (Weka); however, I have some trouble understanding the results while training on the data. The data set I'm using is weather.nominal.arff. When I use the "Use training set" test from the options, the classifier result is: Correctly Classified Instances 13 - 92.8571 % Incorrectly Classif...

MATLAB kMeans does not always converge to global minima

I wrote a k-Means clustering algorithm in MATLAB, and I thought I'd try it against MATLAB's built-in kmeans(X,k). However, for the very easy four-cluster setup (see picture), MATLAB kMeans does not always converge to the optimum solution (left) but to (right). The one I wrote does not always do that either, but should not the built-in f...

Classifier performance on subset of data

I'm using Weka to perform classification on a set of labelled web pages, and measuring classifier performance with AUC. I have a separate six-level factor that is not used in classification, and I'd like to know how well classifiers perform on each level of the factor. What techniques or measures should I use to test classifier performa...

Calculating pageranks for a sparse directed graph with high percentage of deadlinks

Hi, I am a graduate student in computer science at Indiana University, Bloomington. For one of my research projects, I am working on calculating pageranks for a directed graph which is very sparse and has a high percentage of deadlinks. By deadlinks I mean nodes that have outdegree zero. Sometimes, in a graph with a lot of deadlinks, ...

Conditional Random Fields -- How Do They Work?

I've read the papers linked to in this question. I half get it. Can someone help me figure out how to implement it? I assume that features are generated by some heuristic. Using a POS-tagger as an example: maybe looking at training data shows that 'bird' is tagged with NOUN in all cases, so feature f1(z_(n-1),z_n,X,n) is generated as (...