algorithm

Optimally assigning tasks to workers

I've been working on a comprehensive build system that performs distributed builds on multiple machines for quite some time now. It correctly handles dependencies and seemed to scale reasonably well, so we've added more projects and more machines, but it looks like it could perform better. The problem I have is one of resource allocatio...

Getting the n smallest elements of a list in Python

I need to get the n smallest numbers of a list in Python. This needs to be really fast because it's in a performance-critical part of the code and is repeated many times. n is usually no greater than 10 and the list usually has around 20000 elements. The list is different each time I call the function. Sorting can't be mad...
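
One possible starting point for the question above, offered as a sketch rather than a benchmarked answer: the standard library's `heapq.nsmallest` selects the n smallest items without fully sorting the list. The function name `smallest_n` and the random test data are assumptions made for illustration.

```python
import heapq
import random


def smallest_n(values, n):
    """Return the n smallest items of `values` in ascending order."""
    # heapq.nsmallest avoids sorting the whole list, which helps
    # when n is much smaller than len(values).
    return heapq.nsmallest(n, values)


# Illustrative data matching the sizes mentioned in the question.
data = [random.random() for _ in range(20000)]
lowest = smallest_n(data, 10)
```

Whether this is fast enough for a hot loop would need to be measured against the real data.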

unique path in a directed graph

Hi, I'm designing an algorithm for class that will determine if a graph is unique with respect to a vertex v such that for any u <> v there is at most one path from v to u. I've started by using BFS to find the shortest path from v to another vertex u, and then running BFS again to see if an alternate path can be found from v to u. I th...
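
A hedged sketch of one alternative to the repeated-BFS idea described above, assuming "path" means a simple path (no repeated vertices) and assuming the graph is given as a dict of adjacency lists. A single depth-first search from v is enough under those assumptions: an edge into an already finished vertex reveals a second path, while an edge back to a vertex still on the DFS stack only closes a cycle. The name `has_unique_paths_from` is made up for this example.

```python
def has_unique_paths_from(graph, v):
    """Return True if every vertex reachable from v has exactly one simple path from v.

    `graph` maps each vertex to an iterable of its successors (an assumed format).
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: GRAY}
    stack = [(v, iter(graph.get(v, ())))]
    while stack:
        node, neighbors = stack[-1]
        for nxt in neighbors:
            state = color.get(nxt, WHITE)
            if state == BLACK:
                # Forward or cross edge: nxt was already reached another way.
                return False
            if state == WHITE:
                color[nxt] = GRAY
                stack.append((nxt, iter(graph.get(nxt, ()))))
                break
            # GRAY means a back edge to an ancestor; it cannot be part
            # of a simple path from v, so it is ignored.
        else:
            # All successors of node have been explored.
            color[node] = BLACK
            stack.pop()
    return True
```

For example, `has_unique_paths_from({1: [2, 3], 2: [4], 3: [4]}, 1)` is False because vertex 4 can be reached through 2 or through 3.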

Log combing algorithm

We get these ~50GB data files consisting of 16 byte codes, and I want to find any code that occurs 1/2% of the time or more. Is there any way I can do that in a single pass over the data? Edit: There are tons of codes - it's possible that every code is different. EPILOGUE: I've selected Darius Bacon as best answer, because I think t...
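
For illustration only, and not necessarily the approach of the accepted answer mentioned above: the Misra-Gries frequent-items summary makes one pass with a bounded number of counters and returns a candidate set guaranteed to contain every code that occurs at least 1 in `one_in` times. The set may also contain false positives, so exact counts would need a second pass over the candidates. The names `frequent_candidates` and `one_in` are assumptions of this sketch.

```python
def frequent_candidates(codes, one_in=200):
    """One-pass Misra-Gries summary.

    Returns a set containing every code whose frequency is at least
    1/one_in of the stream (one_in=200 corresponds to 0.5%). The set may
    also include less frequent codes.
    """
    counters = {}
    for code in codes:
        if code in counters:
            counters[code] += 1
        elif len(counters) < one_in:
            counters[code] = 1
        else:
            # No free counter: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return set(counters)
```

Memory use is bounded by `one_in` counters regardless of how many distinct codes the file contains.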

An algorithm to get the next weekday set in a bitmask

Hello all, I've got this small question - given a bitmask of weekdays (e.g., Sunday = 0x01, Monday = 0x02, Tuesday = 0x04, etc...) and today's day (in a form of Sunday = 1, Monday = 2, Tuesday = 3, etc...) - what's the most elegant way to find out the next day from today, that's set in the bitmask? By elegant I mean, is there a way to d...
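
A loop-free sketch of the idea, under two assumptions the excerpt does not settle: "next" means strictly after today, and today itself is returned only when it is the sole day set (one week later). The mask is rotated so that bit 0 corresponds to tomorrow, then the lowest set bit is isolated. The function name `next_set_weekday` is hypothetical.

```python
def next_set_weekday(mask, today):
    """Return the next weekday (1 = Sunday ... 7 = Saturday) set in `mask`.

    `mask` uses Sunday = 0x01, Monday = 0x02, ..., Saturday = 0x40.
    `today` is 1..7. Returns None if no day is set.
    """
    mask &= 0x7F
    if mask == 0:
        return None
    # Rotate the 7-bit mask right by `today` so bit 0 represents tomorrow.
    rotated = ((mask >> today) | (mask << (7 - today))) & 0x7F
    # rotated & -rotated isolates the lowest set bit; bit_length gives its index + 1.
    offset = (rotated & -rotated).bit_length() - 1
    return (today + offset) % 7 + 1
```

With Monday and Friday set (`0x22`) and today being Tuesday (3), the function returns 6, i.e. Friday.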

How to best match two strings?

Hi, do you know any good algorithms that compare two strings and return how closely they match as a percentage? And are there any that work with databases too? ...
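
As one hedged illustration of a percentage match, Python's standard `difflib.SequenceMatcher` exposes a `ratio()` between 0 and 1 based on matching subsequences. It is only one of several possible similarity measures (edit distance is another), and the wrapper name `match_percent` is an assumption of this sketch.

```python
from difflib import SequenceMatcher


def match_percent(a, b):
    """Return the similarity of two strings as a percentage (0.0 to 100.0)."""
    # ratio() is 2 * M / T, where M is the number of matched characters
    # and T is the total length of both strings.
    return SequenceMatcher(None, a, b).ratio() * 100
```

For instance, `match_percent("abcd", "bcde")` gives 75.0, because three of the four characters in each string line up.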

Generating permutations lazily

I'm looking for an algorithm to generate permutations of a set in such a way that I could make a lazy list of them in Clojure. i.e. I'd like to iterate over a list of permutations where each permutation is not calculated until I request it, and all of the permutations don't have to be stored in memory at once. Alternatively I'm looking...
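
A sketch of the lazy approach in Python rather than Clojure, using the classic "next lexicographic permutation" step inside a generator so that each permutation is computed only when requested and only the current one is held in memory. It assumes the items can be ordered; the name `lazy_permutations` is made up for this example. The same step function should carry over to a Clojure lazy sequence.

```python
def lazy_permutations(items):
    """Yield permutations of `items` one at a time, in lexicographic order."""
    current = sorted(items)
    n = len(current)
    while True:
        yield tuple(current)
        # Find the rightmost position i with current[i] < current[i + 1].
        i = n - 2
        while i >= 0 and current[i] >= current[i + 1]:
            i -= 1
        if i < 0:
            return  # current is the last permutation
        # Find the rightmost element larger than current[i] and swap.
        j = n - 1
        while current[j] <= current[i]:
            j -= 1
        current[i], current[j] = current[j], current[i]
        # Reverse the tail to get the smallest arrangement after position i.
        current[i + 1:] = reversed(current[i + 1:])
```

Calling `next()` on the generator produces one permutation per call; duplicate items yield each distinct arrangement once.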

Weighted random selection with and without replacement

Recently I needed to do weighted random selection of elements from a list, both with and without replacement. While there are well-known and good algorithms for unweighted selection, and some for weighted selection without replacement (such as modifications of the reservoir algorithm), I couldn't find any good algorithms for weighted sele...

Facial recognition/merging software

Can anyone point me in the right direction of some facial recognition libraries & algorithms? I've tried searching/googling but I mostly find theses and very little real software. ...

Story telling/building algorithms?

I'm working on a simple story generator and am looking for story building algorithms and patterns to use in my design. Does anyone have some good recommendations? ...

Most efficient sorting algorithm for many identical keys?

What is the most efficient algorithm for grouping identical items together in an array, given the following: Almost all items are duplicated several times. The items are not necessarily integers or anything else that's similarly simple. The range of the keys is not even well-defined, let alone small. In fact, the keys can be arbitrar...

Number of arrangements

Suppose we have n elements, a1, a2, ..., an, arranged in a circle. That is, a2 is between a1 and a3, a3 is between a2 and a4, an is between an-1 and a1, and so forth. Each element can take the value of either 1 or 0. Two arrangements are different if there are corresponding ai's whose values differ. For instance, when n=3, (1, 0, 0) and...

Turn a N-Ary B-Spline into a sequence of Quadratic or Cubic B-Splines

Hi, I am doing some TTF work for MOSA (the correlating body between all the C# operating systems). Colin Burn and I are currently working on getting some TTF code working (less me these days :) - he made a lot of progress). In any case, the TTF spec allows for an arbitrary number of control points between the 'handles' and gasp NO han...

How do I efficiently segment 2D images into regions/blobs of similar values?

How do I segment a 2D image into blobs of similar values efficiently? The given input is an array of integers, which includes hue for non-gray pixels and brightness of gray pixels. I am writing a virtual mobile robot using Java, and I am using segmentation to analyze the map and also the image from the camera. This is a well-known probl...

Algorithm for detecting "clusters" of dots

I have a 2D area with "dots" distributed on this area. I now am trying to detect "clusters" of dots, that is, areas with a certain high density of dots. Any thoughts on (or links to articles with thoughts on) how to elegantly detect these areas? ...

Good algorithm for drawing solid 2-dimensional polygons?

What is the simplest (and easiest, although that's subjective) algorithm for drawing solid (as in a single, solid color--no texture mapping) 2D polygons in memory? What is the most efficient method? I am not interested in using the GPU or any rendering method, as the output of my program will not be to the screen. ...

Remove duplicate items with minimal auxiliary memory?

What is the most efficient way to remove duplicate items from an array under the constraint that auxiliary memory usage must be kept to a minimum, preferably small enough to not even require any heap allocations? Sorting seems like the obvious choice, but this is clearly not asymptotically efficient. Is there a better algorithm that can be d...

Shuffle a list (with duplicates) to avoid identical elements being next to each other

Hi, I am wondering if there is a "best" way to shuffle a list of elements that contains duplicates such that the case where array[i] == array[i+1] is avoided as much as possible. I am working on a weighted advertising display (I can adjust the number of displays per rotation for any given advertiser) and would like to avoid the same ad...

Load Balance in Distributed Project

Does anyone know a simple load balance algorithm (formula) that relates users connected, cpu load, network load and memory usage? This will be used to compare various servers and assign to a new user the best at the moment. Thank You. ...

Random but Regular Polygon Generator

I'm looking for a way to generate a set of regular polygons, each with a random number of sides, inside a given rectangle or sector of a circle. To better explain, my given 2D space should have a random arrangement of regular polygons with various numbers of sides, so, e.g., if two hexagons are separated by a rectangle equal in length to their sides, the...