bayesian

Calculating spam probability

I am building a website in Python/Django and want to predict whether a user submission is valid or whether it is spam. Users have an accept rate on their submissions, like this website has. Users can moderate other users' submissions, and these moderations are later metamoderated by an admin. Given this: the registered user A with an ...

Classifying Documents into Categories

I've got about 300k documents stored in a Postgres database that are tagged with topic categories (there are about 150 categories in total). I have another 150k documents that don't yet have categories. I'm trying to find the best way to programmatically categorize them. I've been exploring NLTK and its Naive Bayes Classifier. Seems li...

Clojure or Scheme Bayesian classification libraries?

Any pointers to Scheme/Racket or Clojure Bayesian classification libraries? I need one for a toy/learning project that I'm going to do. ...

Summarizing Bayesian rating formula

Guys, based on this URL I found Bayesian Rating, which explains the rating model very well. I wanted to summarize the formula to make it much easier for anyone implementing it as an SQL statement. Would it be correct to summarize the formula like this? avg_num_votes = Sum(votes)/Count(votes) * Count(votes) avg_rating = sum(votes)/...

How do I implement a Bayesian algorithm for my five star rating system?

I want to implement a 5 star rating system on my site, and I have been trying to use the Bayesian rating algorithm explained here and here with no success. This is my scenario: I have three items (A, B and C) that need to be rated by a vote of 1 for an UP vote and 0 for a DOWN vote. In the database I have the following: Sum(A) = 500 UP ou...

Classifier4J problem

I'm using the BayesianClassifier class to classify spam. The problem is that compound words aren't being recognized. For instance, if I add led zeppelin as a match, a sentence containing it won't be recognized as a match even though it should. For adding a match I'm using addMatch() of SimpleWordsDataSource, and for asking for a match I...

What is the difference between a Decision Tree and a Bayesian Network?

If I understand it right, both use Bayes Theorem to generate an acyclic graph and calculate percentages based on functions applied at every node. What is the difference? ...

AI / Statistical methods for determining the name of a colour

I'm thinking about writing a little library to make a guess at the name of an (RGB value) colour, from a predetermined list of candidates. My first attempt was based purely on Pythagorean distance within the three-dimensional RGB colour space - this wasn't massively successful as most of the named colour points were at the edges of the s...
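
The first attempt described above (nearest named colour by straight-line distance in RGB space) could be sketched roughly as follows. This is only an illustration of that approach, not the poster's code: the `CANDIDATES` table and the function name `nearest_colour_name` are made up for the example.

```python
import math

# Hypothetical candidate list; names and RGB values are illustrative only.
CANDIDATES = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "grey": (128, 128, 128),
}


def nearest_colour_name(rgb, candidates=CANDIDATES):
    """Return the candidate name whose RGB point is closest to `rgb`.

    Closeness is plain Euclidean (Pythagorean) distance in RGB space.
    """
    return min(candidates, key=lambda name: math.dist(rgb, candidates[name]))


print(nearest_colour_name((250, 10, 10)))
print(nearest_colour_name((120, 130, 125)))
```

Because many named colours sit near the edges of the RGB cube, a plain nearest-point lookup like this tends to give the kind of unsatisfying results the question mentions.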

Implementing a question analyzer for auto tagging

What are good resources to go to for implementing a question analyzer? I am trying to figure out how to auto-tag questions to make it easier for non-technical users to ask questions. I've found that using Bayes Theorem I can achieve this, but I have no idea how to implement it. Any open source libraries or research papers on this? ...

Surface Reconstruction from Contours with Quick Rescaling

I'm looking to construct a 3-D surface of a part of the brain based on 2-D contours from cross-sectional slices from multiple angles. Once I get this shape, I want to "fit" it to another set of contours via rescaling. I'm aspiring to do this in the context of an MCMC analysis, so it would be very nice if I could easily compute the volum...

Bayesian rating system with multiple categories for each rating

I'm implementing a rating system to be used on my website, and I think the Bayesian average is the best way to go about it. Every item will be rated in six different categories by the users. I don't want items with only one high rating to shoot to the top though, which is why I want to implement a Bayesian system. Here is the formula: ...
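
The formula is cut off in the excerpt, so the sketch below does not reproduce it. It shows one commonly used form of the Bayesian average, applied separately to each category, and it may differ from what the poster had in mind. The names `bayesian_average`, `prior_mean` and `prior_weight` are assumptions made for the example.

```python
def bayesian_average(ratings, prior_mean, prior_weight):
    """Shrink an item's mean rating towards a prior mean.

    Assumed form: (prior_weight * prior_mean + sum(ratings)) / (prior_weight + n).
    `prior_weight` acts like a number of "phantom" votes at `prior_mean`.
    """
    n = len(ratings)
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + n)


def category_scores(ratings_by_category, prior_mean, prior_weight):
    """Apply the same shrinkage independently to every rating category."""
    return {
        category: bayesian_average(ratings, prior_mean, prior_weight)
        for category, ratings in ratings_by_category.items()
    }


# Hypothetical data: one item with a single 5-star vote, one with fifty 4-star votes.
single_vote = bayesian_average([5], prior_mean=3.0, prior_weight=5)
many_votes = bayesian_average([4] * 50, prior_mean=3.0, prior_weight=5)
print(single_vote, many_votes)
```

With these made-up numbers the item with one 5-star vote ends up below the item with fifty 4-star votes, which is the behaviour the question asks for.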

PHP implementation of Bayes classifier: Assign topics to texts

In my news page project, I have a database table news with the following structure: - id: [integer] unique number identifying the news entry, e.g.: *1983* - title: [string] title of the text, e.g.: *New Life in America No Longer Means a New Name* - topic: [string] category which should be chosen by the classifier, e.g: *Sports* ...

What's the best open source Bayesian software for troubleshooting?

Hi, I have seen www.dezide.com described as top-of-the-line troubleshooting software based on Bayesian networks. But I need an open source solution that I can develop further, as this is not for a commercial project. What would you recommend? BR Morten ...

Looking for open source naive Bayesian Classifier in C# for a Twitter sentiment analysis project.

I've found a similar project here: http://stackoverflow.com/questions/573768/sentiment-analysis-for-twitter-in-python . However, I'm working in C# and need to use a naive Bayesian Classifier that is open source in the same language. Unless someone can shed light on how I can utilize a Python Bayesian Classifier to achieve the same goals....

Pythonic implementation of Bayesian networks for a specific application

This is why I'm asking this question: Last year I made some C++ code to compute posterior probabilities for a particular type of model (described by a Bayesian network). The model worked pretty well and some other people started to use my software. Now I want to improve my model. Since I'm already coding slightly different inference algo...

Translating Ruby code to Java

Hi all, I have never used Ruby but need to translate this code to Java. Can anyone help me? This is the code in Ruby. DEFAULT_PRIOR = [2, 2, 2, 2, 2] ## input is a five-element array of integers ## output is a score between 1.0 and 5.0 def score votes, prior=DEFAULT_PRIOR posterior = votes.zip(prior).map { |a, b| a + b } sum = posteri...

Laplacian smoothing to Biopython

Hi, I am trying to add Laplacian smoothing support to Biopython's Naive Bayes code [1] for my bioinformatics project. I have read many documents about the Naive Bayes algorithm and Laplacian smoothing and I think I got the basic idea, but I just can't integrate this with that code (actually I cannot see which part I will add 1 -laplacian num...
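
For reference, the general idea of Laplacian (add-one) smoothing can be sketched in a few lines, independent of Biopython. The function below is hypothetical and is not part of Biopython's Naive Bayes module; it only shows where the "+1" usually goes when estimating P(value | class) from counts.

```python
from collections import Counter


def smoothed_likelihoods(observed_values, vocabulary, alpha=1.0):
    """Estimate P(value | class) with add-alpha (Laplace) smoothing.

    observed_values: feature values seen in training examples of one class.
    vocabulary: every value the feature can take (assumed to cover all
                observed values, otherwise the result will not sum to 1).
    alpha: pseudo-count added to each value; alpha=1 is add-one smoothing.
    """
    counts = Counter(observed_values)
    total = len(observed_values) + alpha * len(vocabulary)
    return {value: (counts[value] + alpha) / total for value in vocabulary}


# Made-up example: nucleotides observed at one position for one class.
probs = smoothed_likelihoods(["A", "A", "C"], ["A", "C", "G", "T"])
print(probs)
```

Values never seen in training ("G" and "T" here) get a small non-zero probability instead of zero, so a single unseen value no longer wipes out the whole product in the Naive Bayes score.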

Bayesian Classification for Text Author Identification

I am interested in building my own text author identification system using C#. I am assuming that I will probably have to use some type of Bayesian Classification algorithm to accomplish this. Does anyone know of any resources or existing libraries out there that do something similar to this? ...