algorithm

Asking for help: What is a decent beginner graph problem?

I'm trying to get more acquainted with problems that require graphs to be solved (or are best solved by graphs). If someone has an old ACM Programming Competition problem that utilized graphs, or has another problem that they found particularly enlightening as they worked it out, I would appreciate it. I want to familiarize myself w...

How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N?

The question gives all necessary data: what is an efficient algorithm to generate a sequence of K non-repeating integers within a given interval. The trivial algorithm (generating random numbers and, before adding them to the sequence, looking them up to see if they were already there) is very expensive if K is large and near enough to N...
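
For reference, the trivial approach described in the excerpt can be sketched in Python roughly as follows. This is only an illustration of the baseline, not a recommended answer; the function name `sample_by_rejection` is hypothetical, and the interval is assumed to be half-open, `[0, n)`, since the excerpt does not say whether N itself is included.

```python
import random

def sample_by_rejection(k, n, rng=random):
    """Return k distinct integers drawn uniformly from [0, n) (assumed half-open)."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    seen = set()      # values already placed in the sequence
    result = []
    while len(result) < k:
        candidate = rng.randrange(n)
        if candidate not in seen:   # the lookup step the question mentions
            seen.add(candidate)
            result.append(candidate)
    return result
```

As K approaches N, most candidates are rejected, which is the cost the question is worried about. In Python specifically, the standard library's `random.sample(range(n), k)` returns K distinct values without this retry loop.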

Mahjong - Arrange tiles to ensure at least one path to victory, regardless of layout.

Regardless of the layout being used for the tiles, is there any good way to divvy out the tiles so that you can guarantee the user that, at the beginning of the game, there exists at least one path to completing the puzzle and winning the game? Obviously, depending on the user's moves, they can cut themselves off from winning. I just wa...

Django/Python - Grouping objects by common set from a many-to-many relationship

This is a part algorithm-logic question (how to do it), part implementation question (how to do it best!). I'm working with Django, so I thought I'd share with that. In Python, it's worth mentioning that the problem is somewhat related to how-do-i-use-pythons-itertoolsgroupby. Suppose you're given two Django Model-derived classes: fro...

Visiting the points in a triangle in a random order

For a right triangle specified by an equation aX + bY <= c on integers, I want to plot each pixel(*) in the triangle once and only once, in a pseudo-random order, and without storing a list of previously hit points. I know how to do this with a line segment between 0 and x: pick a random point 'o' along the line, pick 'p' that ...

Best algorithm for synchronizing two IList in C# 2.0

Imagine the following type: public struct Account { public int Id; public double Amount; } What is the best algorithm to synchronize two IList<Account> in C# 2.0 (no LINQ)? The first list (L1) is the reference list, the second (L2) is the one to synchronize according to the first: All accounts in L2 that are no longer pr...

Iterating shuffled [0..n) without arrays

I know of a couple of routines that work as follows: Xn+1 = Routine(Xn, max) For example, something like a LCG generator: Xn+1 = (a*Xn + c) mod m There isn't enough parameterization in this generator to generate every sequence. Dream Function: Xn+1 = Routine(Xn, max, permutation number) This routine, para...
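
To make the LCG recurrence in the excerpt concrete, here is a minimal Python sketch that walks `[0, n)` in a scrambled order without storing an array. It is one possible illustration, not the "dream function" the question asks for: the name `lcg_permutation` and the default constants `a=5`, `c=3` are assumptions, chosen so that the generator has full period when the modulus is a power of two (`c` odd, `a - 1` divisible by 4). Values that fall outside `[0, n)` are simply skipped.

```python
def lcg_permutation(n, seed=0, a=5, c=3):
    """Yield every integer in [0, n) exactly once, in a scrambled order.

    Uses X_{k+1} = (a * X_k + c) mod m with m the smallest power of two >= n.
    With c odd and a % 4 == 1 the recurrence visits all m states before repeating.
    """
    if n <= 0:
        return
    m = 1
    while m < n:
        m *= 2
    x = seed % m
    for _ in range(m):          # one full period of the generator
        if x < n:               # skip states outside the target range
            yield x
        x = (a * x + c) % m
```

As the question points out, varying `seed`, `a` and `c` reaches only a small fraction of the n! possible orderings.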

Algorithm to calculate next set in sequence

I am looking for an algorithm to calculate the next set of operations in a sequence. Here is the simple definition of the sequence: Task 1A will be done every 500 hours; Task 2A will be done every 1000 hours; Task 3A will be done every 1500 hours. So at t=500, do 1A. At t=1000, do both 1A and 2A; at t=1500, do 1A and 3A, but not 2A as 15...
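
The rule visible in the excerpt (a task is due whenever the elapsed hours are a multiple of its interval) can be sketched in Python as below. The excerpt is cut off, so the full question may impose further rules that this does not cover; the names `INTERVALS`, `tasks_due` and `next_set` are hypothetical.

```python
# Task intervals in hours, as listed in the question.
INTERVALS = {"1A": 500, "2A": 1000, "3A": 1500}

def tasks_due(t, intervals=INTERVALS):
    """Return the tasks whose interval divides t evenly."""
    return [name for name, every in intervals.items() if t % every == 0]

def next_set(t, intervals=INTERVALS):
    """Return (time, tasks) for the first time after t at which any task is due."""
    # For each task, the next multiple of its interval strictly greater than t.
    next_time = min((t // every + 1) * every for every in intervals.values())
    return next_time, tasks_due(next_time, intervals)
```

For example, `next_set(1000)` returns `(1500, ['1A', '3A'])`, matching the excerpt's statement that 2A is not done at t=1500.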

How do I test if a given BSP tree is optimal?

I have a polygon soup of triangles that I would like to construct a BSP tree for. My current program simply constructs a BSP tree by inserting a random triangle from the model one at a time until all the triangles are consumed, then it checks the depth and breadth of the tree and remembers the best score it achieved (lowest depth, lowes...

Fast Text Search Over Logs

Here's the problem I'm having: I've got a set of logs that can grow fairly quickly. They're split into individual files every day, and the files can easily grow up to a gig in size. To help keep the size down, entries older than 30 days or so are cleared out. The problem is when I want to search these files for a certain string. Right n...

Methods for Geotagging or Geolabelling Text Content

What are some good algorithms for automatically labeling text with the city or region of origin? That is, if a blog is about New York, how can I tell programmatically? Are there packages / papers that claim to do this with any degree of certainty? I have looked at some TF-IDF based approaches, proper noun intersections, but so far, no...

Choosing a pivot for Quicksort?

When implementing Quicksort, one of the things you have to do is choose a pivot. But when I look at pseudocode like the one below, it is not clear how I should choose the pivot. First element of list? Something else? function quicksort(array) var list less, greater if length(array) ≤ 1 return array select an...
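
A rough Python rendering of the list-based pseudocode above, with the pivot choice pulled out into a parameter so that different strategies can be compared. The `choose_pivot` parameter and its random default are assumptions made for illustration; the truncated pseudocode does not say how the pivot is selected.

```python
import random

def quicksort(items, choose_pivot=None):
    """Return a sorted copy of items, mirroring the less/greater pseudocode.

    choose_pivot (hypothetical parameter) maps a non-empty list to a pivot index.
    """
    if len(items) <= 1:
        return list(items)
    if choose_pivot is None:
        # Assumed default: a uniformly random index.
        choose_pivot = lambda seq: random.randrange(len(seq))
    rest = list(items)
    pivot = rest.pop(choose_pivot(rest))       # select and remove the pivot
    less = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    return quicksort(less, choose_pivot) + [pivot] + quicksort(greater, choose_pivot)
```

Passing `lambda seq: 0` gives the "first element" strategy from the question; on input that is already sorted, that choice makes every partition maximally lopsided and the running time quadratic, which is the usual argument for a random or median-based pivot.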

How to rank a million images with a crowdsourced sort

I'd like to rank a collection of landscape images by making a game whereby site visitors can rate them, in order to find out which images people find the most appealing. What would be a good method of doing that? Hot-or-Not style? I.e. show a single image, ask the user to rank it from 1-10. As I see it, this allows me to average the ...

.NET Date Compare: Count the number of working days since a date?

What's the easiest way to compute the number of working days since a date? VB.NET preferred, but C# is okay. And by "working days", I mean all days excluding Saturday and Sunday. If the algorithm can also take into account a list of specific 'exclusion' dates that shouldn't count as working days, that would be gravy. Thanks in advance...
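
The counting rule itself is language-independent, so here is a minimal sketch in Python rather than VB.NET. The function name `working_days_since` is hypothetical, and the sketch assumes the start date is counted while the end date is not, since the question does not say which endpoints are included.

```python
from datetime import date, timedelta

def working_days_since(start, end, excluded=()):
    """Count Monday-to-Friday days in [start, end), skipping any excluded dates."""
    excluded = set(excluded)
    count = 0
    day = start
    while day < end:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        if day.weekday() < 5 and day not in excluded:
            count += 1
        day += timedelta(days=1)
    return count
```

For example, `working_days_since(date(2024, 1, 1), date(2024, 1, 8))` covers one Monday-to-Sunday week and returns 5; adding `excluded=[date(2024, 1, 3)]` drops that to 4. The day-by-day loop favours clarity; for long ranges, whole weeks can be counted arithmetically instead.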

Creating a random ordered list from an ordered list.

I have an application that takes the quality results for a manufacturing process and creates graphs both to show Pareto charts of the bad, and also to show production throughput. To automate the task of testing these statistical procedures I would like to deterministically be able to add records into the database and have the quality t...

How to identify the minimal set of parameters describing a data set

I have a bunch of regression test data. Each test is just a list of messages (associative arrays), mapping message field names to values. There's a lot of repetition within this data. For example test1 = [ { sender => 'client', msg => '123', arg => '900', foo => 'bar', ... }, { sender => 'server', msg => '456', arg...

Is it faster to sort a list after inserting items or to add them to a sorted list?

If I have a sorted list (sorted with, say, quicksort) and a lot of values to add, is it better to suspend sorting, add them to the end, and then sort, or to use binary chop to place each item correctly as it is added? Does it make a difference if the items are random, or already more or less in order? ...
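
The two strategies being compared can be written side by side in Python as a small sketch; the function names are hypothetical, and the sketch only shows that both produce the same result, not which one is faster for a given workload.

```python
import bisect

def add_then_sort(sorted_items, new_items):
    """Append everything first, then sort once."""
    result = list(sorted_items)
    result.extend(new_items)
    result.sort()
    return result

def insert_each(sorted_items, new_items):
    """Place each new item with a binary search ("binary chop") as it arrives."""
    result = list(sorted_items)
    for item in new_items:
        bisect.insort(result, item)   # keeps result sorted after every insertion
    return result
```

With a Python list, `bisect.insort` finds the position in O(log n) comparisons, but the insertion still shifts elements, so each call is O(n) overall. Python's built-in sort is also designed to exploit runs that are already in order, which bears on the "more or less in order" part of the question.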

Graph Problem: Help find the distance between the two most widely separated nodes.

I'm working through previous years' ACM Programming Competition problems, trying to get better at solving graph problems. In the one I'm working on now, I'm given an arbitrary number of undirected graph nodes, their neighbors, and the distances for the edges connecting the nodes. What I NEED is the distance between the two farthest nodes...
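
One straightforward way to approach this, sketched in Python under stated assumptions: compute all-pairs shortest paths with Floyd-Warshall and take the largest finite distance. This assumes "farthest" means the largest shortest-path distance between any two connected nodes, that edge weights are non-negative, and that nodes are numbered from 0; the excerpt is truncated, so the actual problem may define things differently. The name `graph_diameter` is hypothetical.

```python
def graph_diameter(num_nodes, edges):
    """Largest shortest-path distance between any two connected nodes.

    edges is a list of (u, v, weight) tuples for an undirected graph.
    """
    INF = float("inf")
    dist = [[0 if i == j else INF for j in range(num_nodes)]
            for i in range(num_nodes)]
    for u, v, w in edges:
        if w < dist[u][v]:                 # keep the shortest of parallel edges
            dist[u][v] = dist[v][u] = w
    # Floyd-Warshall: allow each node k in turn as an intermediate stop.
    for k in range(num_nodes):
        for i in range(num_nodes):
            for j in range(num_nodes):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return max((d for row in dist for d in row if d != INF), default=0)
```

This runs in O(V^3) time, which is fine for small contest inputs; for large sparse graphs, running Dijkstra's algorithm from every node is the usual alternative.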

Close point of approach detection

I have a large set of 3rd order polynomials in 3D, in matrix form: [Pn](t) = [1,t,t^2,t^3]*[An] // [Pn] and [An] are matrices. Each function has a weight Wn. I want to, for some n, m, T and t0, find the first t where t>t0 such that Wn*Wm / |[Pn](t)-[Pm](t)|^2 > T. Aside from the O(n^2) "try everything" approach I'm not even sure w...

Best way to track page views

I'm working on an e-commerce system and I would like to track the product page views, but I'm not sure of the best way to do that... My boss suggested that I should count one view for the first time a user opens a product page and then store a cookie saying it was already viewed, but what if the user doesn't accept cookies? Another option...