I was having a look at this awesome tutorial on the single-layer perceptron. I tried the implementation out and it works like a charm, but I was wondering whether there's any practical use for it as is (at such a low degree of complexity).

Any example?

A:

I think they are used in some spam filters, but darn if I can remember the details at this hour...

Treb
2008-10-27 18:59:34

You're probably thinking of Bayesian networks.

Steven A. Lowe
2008-10-27 20:05:12
No, I'm not. I remember reading about a combination of several filtering methods, one of which was Bayesian; another used a single perceptron. I just can't remember the source anymore.

Treb
2008-10-28 09:09:26
it would be great to find some more info on this

JohnIdol
2008-10-30 13:03:41
+1 to encourage you to find the info ;-)

Steven A. Lowe
2008-10-31 04:52:45
Thanks Steven, I have already tried and I just cannot find it. Will devote a few hours of the weekend to it (because of the +1 morale boost ;-)

Treb
2008-10-31 07:24:05
One such article is here: http://www.paulgraham.com/spam.html These early techniques used the so-called Naive Bayes algorithm, which is comparable in power to a perceptron and has almost nothing to do with Bayesian networks, beyond both making reference to Bayes's law.

John the Statistician
2008-11-11 22:57:49
+1
A:

you might want to wait for the multi-layer perceptron tutorial; single-layer perceptrons are incredibly limited - see *Perceptrons* by Minsky and Papert for an authoritative (and highly mathematical) study of what they can and cannot do.
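To see that limitation concretely, here's a toy sketch (my own illustrative code, not from the book): a single-layer perceptron learns AND without trouble, but on XOR, which is not linearly separable, it can never get all four cases right no matter how long it trains.

```python
# Toy demonstration of the Minsky/Papert limitation (my own sketch).
# A single-layer perceptron learns AND (linearly separable) easily, but
# can never classify all four XOR cases correctly.

def train_and_score(samples, labels, epochs=1000):
    n_features = len(samples[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            output = 1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0 else 0
            error = target - output  # nonzero only on a misclassification
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
    # Count how many training points the final weights classify correctly.
    return sum(
        (1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0 else 0) == t
        for x, t in zip(samples, labels))

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
assert train_and_score(inputs, [0, 0, 0, 1]) == 4  # AND: separable, learned
assert train_and_score(inputs, [0, 1, 1, 0]) < 4   # XOR: never all correct
```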

Steven A. Lowe
2008-10-27 20:04:50

I know they're limited - just wondering if there's any immediate practical application

JohnIdol
2008-10-27 20:46:58

+6
A:

You can actually do an incredible amount with just a perceptron. For example, many of the theoretical weaknesses of perceptrons can be overcome by moving to a richer feature representation of the data. The most standard way to do this is through kernels. Once you do this, you can then solve many different learning problems through reductions that transform these other problems into binary classification.

One major algorithm is SNoW (Sparse Network of Winnows), which Dan Roth uses heavily in natural language processing.

Lecture notes on the use of kernels in the perceptron algorithm can be found here: http://l2r.cs.uiuc.edu/~danr/Teaching/CS446-08/Lectures/04-LecOnline-P3.pdf

The rest of the notes on perceptron and winnow are handy as well: http://l2r.cs.uiuc.edu/~danr/Teaching/CS446-08/lectures.html

A further discussion of kernel perceptrons can be found in this paper: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.8200

John Langford (not me, by the way) has done a lot of work in the reductions I mentioned: http://hunch.net/~jl/projects/reductions/reductions.html
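To make the kernel idea concrete, here's a toy sketch (my own illustration, not code from the linked notes): a kernel perceptron keeps a mistake count alpha_i for each training example and classifies with a kernel expansion, f(x) = sign(sum_i alpha_i * y_i * K(x_i, x)). With an RBF kernel the implicit feature space is rich enough to handle problems like XOR that defeat the plain perceptron.

```python
import math

# Toy kernel perceptron (my own sketch). Instead of a weight vector, it
# keeps a coefficient alpha_i per training example and classifies with
# the kernel expansion  f(x) = sign( sum_i alpha_i * y_i * K(x_i, x) ).

def rbf(a, b, gamma=2.0):
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def train_kernel_perceptron(samples, labels, kernel=rbf, epochs=20):
    """labels in {-1, +1}; returns per-example mistake counts (alphas)."""
    alphas = [0] * len(samples)
    for _ in range(epochs):
        for j, (x, y) in enumerate(zip(samples, labels)):
            score = sum(a * yi * kernel(xi, x)
                        for a, xi, yi in zip(alphas, samples, labels))
            if y * score <= 0:  # mistake (or tie): remember this example
                alphas[j] += 1
    return alphas

def kp_predict(alphas, samples, labels, x, kernel=rbf):
    score = sum(a * yi * kernel(xi, x)
                for a, xi, yi in zip(alphas, samples, labels))
    return 1 if score > 0 else -1

# XOR, which a plain perceptron cannot learn:
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, 1, 1, -1]
alphas = train_kernel_perceptron(X, y)
```

The training loop never touches an explicit feature vector; all the extra representational power comes in through the kernel function, which is the point of the trick.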

John the Statistician
2008-11-11 22:54:58

sounds great - can you provide examples or link to some good resources in this direction?

JohnIdol
2008-11-14 11:07:32
+1
A:

The paper Neural Methods for Dynamic Branch Prediction describes how perceptrons can be used in hardware to predict, with over 95% accuracy, whether an instruction branch will be taken.
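The core idea is simple enough to sketch in a few lines (a simplified toy model of my own, not the paper's hardware design, which uses a table of perceptrons indexed by a hash of the branch address): keep one small integer weight per bit of recent branch history, predict taken when the weighted sum is non-negative, and nudge the weights toward each actual outcome.

```python
import itertools

# Simplified sketch of perceptron branch prediction (my own toy model).
# History bits are +1 (taken) / -1 (not taken); predict taken iff the
# weighted sum is non-negative.

HISTORY_LEN = 8
THRESHOLD = 16  # keep training until the output margin exceeds this

def output(weights, bias, history):
    return bias + sum(w * h for w, h in zip(weights, history))

def update(weights, bias, history, outcome):
    """outcome: +1 taken, -1 not taken. Returns updated (weights, bias)."""
    y = output(weights, bias, history)
    mispredicted = (y >= 0) != (outcome == 1)
    if mispredicted or abs(y) <= THRESHOLD:
        weights = [w + outcome * h for w, h in zip(weights, history)]
        bias += outcome
    return weights, bias

# Simulate a branch following a repeating taken, taken, not-taken pattern:
weights, bias = [0] * HISTORY_LEN, 0
history = [-1] * HISTORY_LEN
correct = total = 0
for outcome in itertools.islice(itertools.cycle([1, 1, -1]), 300):
    total += 1
    if (output(weights, bias, history) >= 0) == (outcome == 1):
        correct += 1
    weights, bias = update(weights, bias, history, outcome)
    history = [outcome] + history[:-1]

accuracy = correct / total  # high after a brief warm-up
```

The appeal for hardware is that the predictor is just an integer dot product plus increments, yet it can exploit much longer histories than a saturating-counter table of the same size.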

namin
2008-11-13 04:19:19

+1
A:

A single-layer perceptron is really just a fairly inefficient and inaccurate way of finding a least-squares solution to a linear system. More efficient methods use Singular Value Decomposition (SVD) to find the pseudoinverse, which amounts (I think) to doing the same thing the single-layer perceptron does while learning. That said, finding least-squares solutions is a generally useful thing, so in that sense the single-layer perceptron is doing something practical!
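A quick numerical sketch of that claim (my own toy code, with NumPy as the algebra package; I'm using the unthresholded delta/LMS rule to stand in for "what the single-layer unit does while learning"): the SVD-based pseudoinverse recovers the least-squares weights in one shot, while iterative training only creeps toward the same answer.

```python
import numpy as np

# Toy check (my own sketch): the SVD-based pseudoinverse gives the
# least-squares weights directly; iterative delta-rule (LMS) training of
# a single linear unit crawls toward the same solution.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))      # 100 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w                     # noise-free targets

w_pinv = np.linalg.pinv(X) @ y     # np.linalg.pinv computes the SVD

w_lms = np.zeros(3)                # same answer, the slow way
for _ in range(500):
    for xi, yi in zip(X, y):
        w_lms += 0.01 * (yi - xi @ w_lms) * xi
```

Both `w_pinv` and `w_lms` end up at `true_w` here; with a noisy, overdetermined system they would both settle on the least-squares compromise instead.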

Tom Grove
2008-11-14 15:44:45

So with a single layer perceptron I can do all the stuff in the article you link under 'typical uses and applications'?

JohnIdol
2008-11-21 08:57:29
Almost correct. There are much easier ways to calculate the pseudoinverse than the SVD; one is simply computing (X^t X)^-1 X^t with your favourite algebra package, another is a cleverer iterative method. Simple perceptron learning, in general, sucks, and diverges if the step size is too high.
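That formula is easy to check numerically (a NumPy sketch of my own; it holds whenever X has full column rank):

```python
import numpy as np

# For full-column-rank X, the pseudoinverse equals (X^t X)^-1 X^t,
# matching NumPy's SVD-based np.linalg.pinv.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))       # tall matrix, full column rank

normal_eq = np.linalg.inv(X.T @ X) @ X.T
assert np.allclose(normal_eq, np.linalg.pinv(X))
```

In real code you'd call a solver such as np.linalg.lstsq rather than forming the inverse explicitly, since computing X^t X squares the condition number.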

dwf
2009-01-18 11:49:55
A:

Perceptrons are simply a way of computing a thresholded linear combination (a weighted sum of the inputs). In practice, almost nobody uses the perceptron procedure for learning, because it's ridiculously inefficient and not guaranteed to ever find a solution if you choose your learning rate badly.

If you're predicting a real-valued quantity, then perceptrons are equivalent to linear regression. If you're interested in binary classification, logistic regression is the tool to use: it provides a posterior probability for each classification, given some (fairly reasonable) assumptions (that your predictive variables are Gaussian-distributed, with different means but the same covariance).
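As a toy sketch of that contrast (my own illustrative code): logistic regression trained by gradient descent on the log-loss yields graded probabilities where a perceptron only gives a hard 0/1 answer.

```python
import math

# Toy logistic regression by gradient descent on the log-loss (my own
# sketch). Unlike a perceptron's hard threshold, it outputs a posterior
# probability for the positive class.

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(samples, labels, lr=0.1, epochs=5000):
    n = len(samples[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, t in zip(samples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The log-loss gradient for one sample is (p - t) * x.
            w = [wi - lr * (p - t) * xi for wi, xi in zip(w, x)]
            b -= lr * (p - t)
    return w, b

# One-dimensional toy data: class 1 tends to have larger x.
X = [(0.0,), (1.0,), (2.0,), (3.0,)]
t = [0, 0, 1, 1]
w, b = train_logistic(X, t)
p_low = sigmoid(w[0] * 0.0 + b)   # P(class 1 | x = 0): close to 0
p_high = sigmoid(w[0] * 3.0 + b)  # P(class 1 | x = 3): close to 1
```

The graded outputs are exactly what makes it more useful than a raw perceptron when you need to rank or threshold predictions later.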

dwf
2009-01-18 12:01:28