fuzzy

How to print "fuzzy" time/date delta in python?

Possible Duplicate: Natural/Relative days in Python. Does anyone know where to find a Python module that can print a time tuple in a format like "5 seconds ago", "2 hours ago", "Yesterday", "3 weeks ago", etc.? ...
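A minimal sketch of the kind of output asked for above, using only the standard library. The function name `fuzzy_delta`, the thresholds, and the choice to treat "Yesterday" as 24 to 48 hours of elapsed time (rather than the previous calendar day) are all assumptions made for illustration, not the behavior of any particular module.

```python
from datetime import datetime, timedelta


def _ago(count, unit):
    """Format a count with a singular or plural unit, e.g. '1 hour ago'."""
    return f"{count} {unit}{'' if count == 1 else 's'} ago"


def fuzzy_delta(then, now=None):
    """Return a rough description of how long before `now` the moment `then` was.

    Hypothetical helper: thresholds are illustrative assumptions.
    """
    now = now or datetime.now()
    seconds = int((now - then).total_seconds())
    if seconds < 0:
        return "in the future"
    if seconds < 60:
        return _ago(seconds, "second")
    if seconds < 3600:
        return _ago(seconds // 60, "minute")
    if seconds < 86400:
        return _ago(seconds // 3600, "hour")
    days = seconds // 86400
    if days == 1:
        return "Yesterday"  # assumption: based on elapsed time, not calendar date
    if days < 7:
        return _ago(days, "day")
    return _ago(days // 7, "week")
```

For example, calling `fuzzy_delta(now - timedelta(hours=2), now)` with a fixed `now` gives a string in the "2 hours ago" style.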

Is There an Algorithm or Library that can Detect Motion Blur in Images?

Does anyone know of an algorithm that can return a fuzzy true/false for whether an image has motion blur or camera shake? Ideally it would be specific to motion blur, as many of the images in the set might have blurred (bokeh) backgrounds. My language preference would be C, Perl, a shell utility, or Python, but I'm open to anything really. ...

q-gram approximate matching optimisations

Hi, I have a table containing 3 million person records on which I want to perform fuzzy matching using q-grams (on surname, for instance). I have created a table of 2-grams linking to this, but search performance is not great at this data volume (around 5 minutes). I basically have two questions: (1) Can you suggest any ways to improve p...
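To make the 2-gram setup above concrete, here is a small in-memory sketch of q-gram extraction plus a count filter over an inverted index. It stands in for the database tables described in the question; the function names, the `#` padding character, and the `min_shared` threshold are hypothetical choices for illustration only.

```python
from collections import Counter, defaultdict


def qgrams(text, q=2):
    """Split text into overlapping q-grams, padded so word edges count too."""
    padded = "#" * (q - 1) + text.lower() + "#" * (q - 1)
    return [padded[i:i + q] for i in range(len(padded) - q + 1)]


def build_index(names, q=2):
    """Map each q-gram to the set of row ids whose name contains it."""
    index = defaultdict(set)
    for row_id, name in enumerate(names):
        for gram in qgrams(name, q):
            index[gram].add(row_id)
    return index


def candidates(query, index, q=2, min_shared=2):
    """Return row ids sharing at least `min_shared` distinct q-grams with query."""
    counts = Counter()
    for gram in set(qgrams(query, q)):
        for row_id in index.get(gram, ()):
            counts[row_id] += 1
    return sorted(row_id for row_id, n in counts.items() if n >= min_shared)
```

Rows that survive the count filter would normally still be checked with a real edit-distance comparison; the filter only narrows the candidate set.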

Lucene Query WITHOUT Operators

I am trying to use Lucene to search for names in a database. However, some of the names contain words like "NOT" and "OR" and even "-" minus symbols. I still want the different tokens inside the names to be broken up using an Analyzer and searched upon as a boolean combination of terms, but I do not want Lucene to interpret any of the "N...

Practical uses of fuzzy theory?

Hi all, from your experience or from what you have encountered, what are some practical applications of fuzzy systems? I know about system control and fuzzy controllers, and about intrinsic linguistic support, but here I mean concrete applications of fuzzy theory where this approach has proven successful or fit very well. ...

Good fuzzy book to start with

Hi, from what you have read or heard about, which is a good book on fuzzy logic/sets/systems? I'm interested in the basics of fuzzy systems, fuzzification/defuzzification, etc. There are plenty of such books, but my belief is that only a few of them are worth reading. Thanks ...

Merging two Data Frames using Fuzzy/Approximate String Matching in R

DESCRIPTION I have two datasets with information that I need to merge. The only common fields that I have are strings that do not perfectly match and a numerical field that can be substantially different. The only way to explain the problem is to show you the data. Here is a.csv and b.csv. I am trying to merge B to A. There are three...

Levenshtein distance on non-English strings

Will the Levenshtein distance algorithm work well for non-English language strings too? Update: Would this work automatically in a language like Java when comparing Asian characters? ...
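As a point of reference for the question above, here is a plain dynamic-programming sketch of Levenshtein distance in Python. Python 3 strings are sequences of Unicode code points, so this version compares non-English text one code point at a time; it does not normalize the input, so a precomposed character and its decomposed form (base letter plus combining mark) would count as different. The function name is an illustrative choice.

```python
def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) between two strings.

    Compares Unicode code points; no normalization is applied.
    """
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]
```

If combining characters matter for the data at hand, applying `unicodedata.normalize` to both strings before comparing is one option to consider.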

Fuzzy Regular Expressions

In my work I have used approximate string matching algorithms such as Damerau–Levenshtein distance with great results to make my code less vulnerable to spelling mistakes. Now I need to match strings against simple regular expressions such as TV Schedule for \d\d (Jan|Feb|Mar|...). This means that the string TV Schedule for 10 Jan s...

fuzzy logic and neural networks

What is the use of an activation function, with any two activation functions as examples? ...

Where can I find FuzzyGK or other fuzzy clustering algorithms to use in Weka?

I'm learning Weka and I'm trying to figure out how to do fuzzy clustering. I found an old site that listed 2 fuzzy clusterers (by Frank Weber & Robin Senge), but I could not add their .jar file to the existing .jar file to use the algorithms. Does anyone know where I can find fuzzy clustering algorithms for Weka? If not, is there an...

Comparing (similar) images with Python/PIL

I'm trying to calculate the similarity (read: Levenshtein distance) of two images, using Python 2.6 and PIL. I plan to use the python-levenshtein library for fast comparison. Main question: What is a good strategy for comparing images? My idea is something like: Convert to RGB (transparent -> white) (or maybe convert to monochrome?...

Lucene query: bla~* (match words that start with something fuzzy), how?

In the Lucene query syntax I'd like to combine * and ~ in a valid query similar to: bla~* //invalid query Meaning: Please match words that begin with "bla" or something similar to "bla". ...

Django's makemessages creates a lot of fuzzy entries

Each time I add some strings to a Django project, I run "django-admin.py makemessages --all" to generate .PO files for all locales. The problem is that even if I only added 5 new strings, the makemessages command marks 50 strings as fuzzy in the .PO files, which creates a lot of extra work for our locale maintainers. This also makes the entire...

Is this a variation of the traveling salesman problem?

I'm interested in a function of two word lists that would return an order-agnostic edit distance between them. That is, the arguments would be two lists of (let's say space-delimited) words, and the return value would be the minimum sum of the edit (or Levenshtein) distances of the words in the lists. Distance between "cat rat bat" and ...
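One way to read the function described above is as a minimum-cost pairing of words between the two lists. The sketch below brute-forces every pairing with `itertools.permutations`, so it is only practical for short lists, and it assumes both lists have the same number of words (the excerpt does not say how unequal lists should be handled). The names `levenshtein` and `word_list_distance` are hypothetical.

```python
from itertools import permutations


def levenshtein(a, b):
    """Plain dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def word_list_distance(left, right):
    """Minimum total edit distance over all one-to-one word pairings.

    Assumption: both space-delimited lists contain the same number of words.
    """
    left_words, right_words = left.split(), right.split()
    if len(left_words) != len(right_words):
        raise ValueError("this sketch assumes lists of equal length")
    return min(
        sum(levenshtein(x, y) for x, y in zip(left_words, perm))
        for perm in permutations(right_words)
    )
```

Framed this way, the task looks closer to an assignment (minimum-cost matching) problem than to a traveling salesman tour, which suggests polynomial-time matching algorithms could replace the brute-force loop for longer lists.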

"Did you mean" feature on a dictionary database

I have a table of ~300,000 rows containing technical terms, queried using PHP and MySQL with FULLTEXT indexes. But when I search for a mistyped term, for example "hyperpext", it naturally gives no results. I need to "compensate" for small typing errors and get the nearest record from the database. How can I accomplish such a feature? I know (actua...

Algorithm to detect repeating/similar strings in a corpus of data -- say email subjects, in Python

I'm downloading a long list of my email subject lines, with the intent of finding email lists that I was a member of years ago, so that I can purge them from my Gmail account (which is getting pretty slow). I'm specifically thinking of newsletters that often come from the same address, and repeat the product/service/group's name in...

Alternatives to Lucene Default Fuzzy Matching Implementation

Lucene uses a basic editDistance algorithm to implement fuzzy matching. Are there other implementations of fuzzy matching for Lucene that use other similarity metrics? They should also identify homophones. Also, please compare the various fuzzy matching approaches for Lucene. ...

Simplifying a four-dimensional rule table in Matlab: addressing rows and columns of each dimension

Hi all. I'm currently trying to automatically generate a set of fuzzy rules for a set of observations which contain four values for each observation, where each observation will correspond to a state (a good example is with Fisher's Iris Data). In Matlab I am creating a four dimensional rule table where a single cell (a,b,c,d) will con...

Splitting a set of objects into several subsets according to certain evaluation

Suppose I have a set of objects, S. There is an algorithm f that, given a set S builds certain data structure D on it: f(S) = D. If S is large and/or contains vastly different objects, D becomes large, to the point of being unusable (i.e. not fitting in allotted memory). To overcome this, I split S into several non-intersecting subset...