machine-learning

Algorithm to classify a list of products? Take 2.

Hello all, I asked a question similar to this one a couple of weeks ago, but I did not ask it correctly. So I am re-asking the question here with more detail, and I would like to get a more AI-oriented answer. I have a list representing products which are more or less the same. For instance, in the list below, they are all S...

TDD and the Bayesian Spam Filter problem

It's well known that Bayesian classifiers are an effective way to filter spam. These can be fairly concise (our one is only a few hundred LoC) but all core code needs to be written up-front before you get any results at all. However, the TDD approach mandates that only the minimum amount of code to pass a test can be written, so given t...

Neural networks for email spam detection

Let's say you have access to an email account with the history of received emails from the last few years (~10k emails), classified into 2 groups: genuine email and spam. How would you approach the task of creating a neural network solution that could be used for spam detection, basically classifying any email as either spam or not spam? Let'...

Retrieve a list of the most popular GET param variations for a given URL?

I'm working on building intelligence around link propagation, and because I need to deal with many short URL services where a reverse-lookup from an exact URL address is required, I need to be able to resolve multiple approximate versions of the same URL. An example would be a URL like http://www.example.com?ref=affil&hl=en&ct=0...

C/C++ Machine Learning Libraries for Clustering

What are some C/C++ machine learning libraries that support clustering of multi-dimensional data (for example, K-Means)? So far I have come across SGI MLC++ (http://www.sgi.com/tech/mlc/) and OpenCV MLL. I am tempted to roll my own, but I am sure pre-existing ones are far better optimized for performance, with more eyes on the code. ...

Music analysis software

Greetings, I may have imagined this, but does anyone know if Last.fm previously used some form of open source project to analyze music and determine similar music? As it's now moved to a pay version, I'd like to make something which can add known music to my playlist. (I hate scanning my computer for similar music manually.) F...

Best books, blogs, links, and reading about AI and machine learning.

I think that AI might be a precious tool in the developer's toolbox, and I'd like to know more about this field, hoping that it will make my life easier as a developer. Can you recommend some classics, books, links, essays, authors, reading, and bloggers? What is the Code Complete of the AI field? ...

Neural network XOR backpropagation info needed.

Does anyone know where I can find some sample code for NN backpropagation on XOR that also lets me test the system after it has been trained? Preferably in C++ or MATLAB. ...

Database of surveillance camera locations

To get more into Django programming, I'm planning to create a Google Maps mashup which finds routes from A to B but avoids streets/junctions that cross public surveillance cameras' perspectives. Therefore, I will create a database (probably Postgres based, because of its GIS capabilities) containing surveillance type (surveillance camer...

Extracting surveillance camera positions from Street View images

Related to my previous question, is there a realistic chance of extracting surveillance camera positions from Google Street View pictures by means of computer vision algorithms? I'm no expert in that area, but it should be easier than face detection and the like. ...

What are some popular OCR algorithms?

I've been interested in machine learning and computer vision for a while, so I've decided to attempt to build a simple Optical Character Recognition demo in C#. I'm looking for a description of some common OCR algorithms and how I would go about implementing them in C#. It's a learning exercise so I'm not looking for an OCR library. ...

How does Wolfram Alpha work?

Behind the tables and tables of raw data, how does Wolfram Alpha work? I imagine there are various artificial intelligence mechanisms driving the site but I can't fathom how anyone would put something like this together. Are there any explanations that would help a programmer understand how something like this is created? Does the knowl...

What is the difference between a Generative and Discriminative Algorithm?

Please help me understand the difference between a Generative and Discriminative Algorithm keeping in mind that I am just a beginner. ...

What's a good language for learning machine learning?

I've been thrust into a situation where I need to know something about Machine Learning. Is there a language or perhaps a reasonable tutorial that breaks this subject matter in gently? I'm not a math guy, so it's got to start from a pretty basic level. ...

Machine learning for typos.

Google shows a suggestion when you make a typo in your entry. How do they do it? ...

C# AI Library

Can anyone suggest a good AI library written in C#? I specifically want to use it for ILP, so first-order logic support is a must. ...

Need good way to choose and adjust a "learning rate"

In the picture below you can see a learning algorithm trying to learn to produce a desired output (the red line). The learning algorithm is similar to a backward error propagation neural network. The "learning rate" is a value that controls the size of the adjustments made during the training process. If the learning rate is too high,...
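To make the role of the learning rate concrete, here is a minimal sketch. It is not the asker's algorithm: it assumes plain gradient descent on the toy function f(w) = w**2, and the function name `gradient_descent` and the three rates are chosen only for illustration.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Run plain gradient descent on f(w) = w**2 and return the final w.

    Hypothetical toy problem: the minimum is at w = 0, and the gradient
    of f at w is 2 * w.
    """
    w = start
    for _ in range(steps):
        grad = 2.0 * w               # derivative of w**2
        w -= learning_rate * grad    # step size is scaled by the learning rate
    return w


# Three illustrative rates (assumed values, not taken from the question).
small = gradient_descent(0.01)     # moves toward 0, but slowly
moderate = gradient_descent(0.1)   # moves toward 0 much faster
large = gradient_descent(1.1)      # each step overshoots, so |w| grows

for name, value in [("small", small), ("moderate", moderate), ("large", large)]:
    print(f"{name:>8}: final w = {value:.4f}")
```

On this toy problem each update multiplies w by (1 - 2 * learning_rate), so any rate above 1.0 makes the iterate move away from the minimum instead of toward it.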

Machine Learning in Game AI

In the old days of gaming, I'm sure simple switch/case statements (in a sense) would have done just fine for most of the game "AI." However, as games have become increasingly complex, especially at the 3D leap, more complex algorithms are needed. My question is, are actual machine learning algorithms (like reinforcement learning) used in g...

How many device drivers are available for Windows?

I'm trying to estimate, for back-of-the-napkin calculation purposes, how many different device drivers are available for Windows. I'm trying to understand what it might take, in terms of the size of the collected data and the processing power required, to do some statistical analysis of drivers. Anybody have any references? Ideas? At...

Kernel methods for large-scale datasets

Kernel-based classifiers usually require O(n^3) training time because of the inner-product computation between two instances. To speed up the training, inner-product values can be pre-computed and stored in a two-dimensional array. However, when the number of instances is very large, say over 100,000, there will not be sufficient memory to d...
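The pre-computation described above, and why it runs out of memory, can be sketched roughly as follows. This is an illustration only: the helper names `gram_matrix`, `linear_kernel`, and `gram_matrix_bytes` are hypothetical, a linear kernel stands in for whatever kernel is actually used, and 8 bytes per stored value (a double-precision float) is an assumption.

```python
def linear_kernel(x, y):
    """Plain inner product; a stand-in for whatever kernel is really used."""
    return sum(a * b for a, b in zip(x, y))


def gram_matrix(points, kernel):
    """Pre-compute kernel(x_i, x_j) for every pair into an n-by-n array."""
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # The kernel is symmetric, so each pair is computed only once.
            K[i][j] = K[j][i] = kernel(points[i], points[j])
    return K


def gram_matrix_bytes(n, bytes_per_value=8):
    """Storage for a dense n-by-n matrix, assuming 8-byte floats."""
    return n * n * bytes_per_value


K = gram_matrix([(1.0, 2.0), (3.0, 4.0)], linear_kernel)
print(K)

n = 100_000
print(f"dense matrix for n={n}: {gram_matrix_bytes(n) / 1e9:.0f} GB")
```

The quadratic growth is the issue: under these assumptions a dense matrix for 100,000 instances needs about 80 GB, and storing only the upper triangle would still need roughly half of that.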