algorithm

Natural Language CFG builder Algorithm

Hello, I am working on a natural language processing project. It aims to build libraries for the Arabic language. We are working on a POS tagger, and now I am thinking about the grammar phase. Since Arabic and many other languages have complicated grammars, it is very hard to build a context-free grammar (CFG) for them. For this reason I had an idea for an...

Optimal allocation of products to maximize time before restocking

Stock allocation problem. I have a problem where each of a known set of products with various rates of sale needs to be allocated into one or more of a fixed number of buckets. Each product must be in at least one bucket, and buckets cannot share products. All buckets must be filled, and products will usually be in more than one bucket. My ...

Python recursive program to prime factorize a number

I wrote the following program to prime factorize a number: import math def prime_factorize(x,li=[]): until = int(math.sqrt(x))+1 for i in xrange(2,until): if not x%i: li.append(i) break else: #This else belongs to for li.append(x) print li #First print state...
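One thing worth noting about the excerpt: the default argument `li=[]` is created once and shared across calls in Python, so results can leak from one call into the next. A minimal recursive sketch in Python 3 that avoids this is given below; it is not the poster's program, and the parameter name `factors` and the use of `math.isqrt` (Python 3.8+) are choices made here for illustration.

```python
import math


def prime_factorize(x, factors=None):
    """Return the prime factors of x in ascending order (empty list for x < 2)."""
    if factors is None:
        # Create a fresh list per top-level call instead of a shared default.
        factors = []
    for i in range(2, math.isqrt(x) + 1):
        if x % i == 0:
            # i is the smallest divisor, hence prime; recurse on the quotient.
            factors.append(i)
            return prime_factorize(x // i, factors)
    # No divisor up to sqrt(x): x itself is prime (or x < 2).
    if x > 1:
        factors.append(x)
    return factors


print(prime_factorize(360))
```

The recursion depth is at most about log2(x), since every recursive call divides x by at least 2.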

How are applications like Twitter implemented?

Suppose A follows 100 people; that would then need 100 join statements, which I think is horrible for the database. Or are there other ways? ...

What is the fastest possible way to sort an array of 7 integers?

This is a part of a program that analyzes the odds of poker, specifically Texas Hold'em. I have a program I'm happy with, but it needs some small optimizations to be perfect. I use this type (among others, of course): type T7Cards = array[0..6] of integer; There are two things about this array that may be important when decidin...
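For arrays this small, a plain insertion sort is one commonly suggested baseline; whether it is actually the fastest for this workload would have to be measured. The sketch below is in Python for illustration only (the excerpt's program is Pascal), and the name `sort7` is hypothetical.

```python
def sort7(cards):
    """Sort a short list of integers in place with insertion sort and return it."""
    for i in range(1, len(cards)):
        value = cards[i]
        j = i - 1
        # Shift larger elements one slot right until value's position is found.
        while j >= 0 and cards[j] > value:
            cards[j + 1] = cards[j]
            j -= 1
        cards[j + 1] = value
    return cards


print(sort7([5, 3, 6, 0, 2, 4, 1]))
```

Insertion sort needs no extra memory and does very little work on input that is already nearly sorted.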

String table encoding vs. gzip compression

In my application, I need to store and transmit data that contains many repeating string values (think entity names in an XML document). I have two proposed solutions: A) create a string table to be stored along the document, and then use index references (using multi-byte encoding) in the document body, or B) simply compress the do...

GPU vs CPU performance for common algorithms

I'm interested to know if any common algorithms (sorting, searching, graphs, etc.) have been ported to OpenCL (or any GPU language), and how the performance compares to the same algorithm executed by the CPU. I'm specifically interested in the results (numbers). Thanks! ...

HSL Interpolation

If the hue component of my HSL color is in degrees, how can I correctly interpolate between two hues? Using (h1 + h2) / 2 does not seem to produce desirable results in all cases. Here are two examples that illustrate why: Let: red = 0° yellow = 60° blue = 240° (red + yellow) / 2 = 30° (orange) (yellow + blue) / 2 = 150° (blue gree...
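One common approach is to treat hue as an angle and interpolate along the shorter arc of the color wheel. The sketch below assumes hues in degrees and a blend factor `t` in [0, 1]; the function name `lerp_hue` is hypothetical.

```python
def lerp_hue(h1, h2, t):
    """Interpolate from hue h1 to hue h2 (degrees) along the shorter arc."""
    # Signed angular difference, mapped into the range [-180, 180).
    delta = (h2 - h1 + 180.0) % 360.0 - 180.0
    # Step along that arc and wrap the result back into [0, 360).
    return (h1 + t * delta) % 360.0


print(lerp_hue(0, 60, 0.5))    # red to yellow
print(lerp_hue(350, 10, 0.5))  # wraps through 0 instead of passing 180
```

When the two hues are exactly 180° apart, as with yellow (60°) and blue (240°), both arcs are equally short, so the direction this sketch picks is arbitrary.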

Optimization problem

I am too dense to solve the following optimization problem: There is a 2D array, let's say symbols vs time, for example A 1114334221111 B 9952111111111 C 1113439111131 D 1255432245662 There is also a list of symbols, for example: CABDC You must choose values from the array in order of the symbols, but you can repeat a symbol as m...

Relative asymptotic behavior of these functions

I have 3 functions: f(n)=2^n, g(n)=n! and h(n)=n^log(n) (log(n) is base 2). Comparing f(n) and g(n): The factorial function, g(n), can be approximated as O(n^n) (poor upper bound). Considering this, is g(n)=Ω(f(n))? How would I compare g(n) and h(n), and f(n) and h(n)? ...

Indexing algorithms to develop an app like Google Desktop Search?

Hi friends, I want to develop an application like Google Desktop Search. I want to know which indexing techniques/algorithms I should use so I can get very fast data retrieval. Thanks, Sunny. ...

Efficiency of Sort Algorithms

I am studying up for a pretty important interview tomorrow and there is one thing that I have a great deal of trouble with: sorting algorithms and Big-O efficiencies. Which number is important to know? The best, worst, or average efficiency? ...

Generating Luhn Checksums

There are lots of implementations for validating Luhn checksums but very few for generating them. I've come across this one; however, in my tests it has turned out to be buggy, and I don't understand the logic behind the delta variable. I've made this function that supposedly should generate Luhn checksums but for some reason that I haven't...
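For reference, generating the check digit uses the same arithmetic as validation, except that doubling starts at the rightmost payload digit, because the check digit will be appended to its right. A minimal sketch follows; it is not the poster's function, and the name `luhn_check_digit` is hypothetical. It assumes the input is a string of decimal digits without the check digit.

```python
def luhn_check_digit(payload):
    """Return the Luhn check digit (0-9) for a string of decimal digits."""
    total = 0
    # Walk right to left; positions 0, 2, 4, ... (from the right) are doubled.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                # Same as summing the two digits of the doubled value.
                digit -= 9
        total += digit
    # The check digit brings the overall sum up to a multiple of 10.
    return (10 - total % 10) % 10


print(luhn_check_digit("7992739871"))
```

Appending the returned digit to the payload gives a number that a standard Luhn validator should accept.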

Algorithm for nice graph labels for time/date axis?

Hello, I'm looking for a "nice numbers" algorithm for determining the labels on a date/time value axis. I'm familiar with Paul Heckbert's Nice Numbers algorithm (http://tinyurl.com/5gmk2c). I have a plot that displays time/date on the X axis and the user can zoom in and look at a smaller time frame. I'm looking for an algorithm that p...

Enabling soundex/metaphone for non-English characters

Hi all, I've been studying soundex, metaphone and other string search techniques for the past few days, and in my understanding both algorithms work well in handling non-English words transliterated to English. However, my requirement is for such a search to work in the original, untransliterated languages, accommodating alp...

How to find distance from the latitude and longitude of two locations?

I have a set of latitudes and longitudes of locations. How do I find the distance from one location in the set to another? Is there a formula? ...
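One widely used formula for this is the haversine formula, which gives the great-circle distance on a sphere. The sketch below assumes coordinates in decimal degrees and a spherical Earth with a mean radius of 6371 km, so results are approximate; the names `haversine_km` and `EARTH_RADIUS_KM` are chosen here for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean radius of a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error before taking asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# A quarter of the way around the equator.
print(haversine_km(0.0, 0.0, 0.0, 90.0))
```

For a whole set of locations, the function can simply be called for each pair of interest.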

Visiting all free slots in a bitfield

I have an array of uint64 and for all unset bits (0s), I do some evaluations. The evaluations are not terribly expensive, but very few bits are unset. Profiling says that I spend a lot of time in the finding-the-next-unset-bit logic. Is there a faster way (on a Core2duo)? My current code can skip lots of high 1s: for(int y=0; y<hei...
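A common bit trick for this is to invert each word so that the unset bits become set bits, then repeatedly isolate and clear the lowest set bit, which skips every run of uninteresting bits in one step. The sketch below shows the idea in Python for illustration; in C the same loop would typically rely on a count-trailing-zeros instruction. The names `MASK64` and `unset_bit_positions` are hypothetical.

```python
MASK64 = (1 << 64) - 1  # keeps Python's unbounded ints within 64 bits


def unset_bit_positions(word):
    """Yield the positions of the 0 bits in a 64-bit word, lowest first."""
    free = ~word & MASK64  # every unset bit of word is now a set bit
    while free:
        lowest = free & -free          # isolate the lowest set bit
        yield lowest.bit_length() - 1  # its index within the word
        free &= free - 1               # clear that bit and continue


word = MASK64 ^ (1 << 5) ^ (1 << 63)  # all ones except bits 5 and 63
print(list(unset_bit_positions(word)))
```

The loop body runs once per unset bit, so a word with no unset bits costs a single comparison.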

Get dominant colors from image discarding the background

What is the best (result, not performance) algorithm to fetch dominant colors from an image? The algorithm should discard the background of the image. I know I can build an array of colors and how many times they appear in the image, but I need a way to determine what is the background and what is the foreground, and keep only the second (for...

How would you design a "state/management" that would do this?

I have a list of nullable integers, and it looks like a one-to-many relation: id1 can have an id2 and id3; id2 can have id4. If the value of id1 changes, id2 and id3 must be set to null, and that means id4 must be set to null. In that example I only use 4 variables, so that is easy to manage. For now I have at least 15 variables to manage. The w...

How does Facebook do it?

Have you ever noticed how Facebook says “3 friends and 33 others liked this”? I was wondering what the best approach to do this is. I don’t think going through the friends list and the list of users who “liked this” and comparing them is efficient at all! Do they keep track of this in the database? That will make the database size ver...
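For what it is worth, comparing the two lists does not have to be a nested loop: with hash sets, the overlap can be computed in time roughly proportional to the smaller set. The sketch below only illustrates that idea and makes no claim about how Facebook actually does it; the name `like_summary` and the use of integer user IDs are assumptions.

```python
def like_summary(liker_ids, friend_ids):
    """Return (friends who liked, other likers) as a pair of counts."""
    likers = set(liker_ids)
    # Set intersection uses hash lookups instead of pairwise comparison.
    friends_who_liked = likers & set(friend_ids)
    others = len(likers) - len(friends_who_liked)
    return len(friends_who_liked), others


friends, others = like_summary([1, 2, 3, 4, 5], [2, 4, 9])
print(f"{friends} friends and {others} others liked this")
```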