views:

821

answers:

6

My best shot so far:

A delivery vehicle needs to make a series of deliveries (d1,d2,...dn), and can do so in any order--in other words, all the possible permutations of the set D = {d1,d2,...dn} are valid solutions--but the particular solution needs to be determined before it leaves the base station at one end of the route (imagine that the packages need to be loaded in the vehicle LIFO, for example).

Further, the cost of the various permutations is not the same. It can be computed as the sum of the squares of distance traveled between di -1 and di, where d0 is taken to be the base station, with the caveat that any segment that involves a change of direction costs 3 times as much (imagine this is going on on a railroad or a pneumatic tube, and backing up disrupts other traffic).

Given the set of deliveries D represented as their distance from the base station (so abs(di-dj) is the distance between two deliveries) and an iterator permutations(D) which will produce each permutation in succession, find a permutation which has a cost less than or equal to that of any other permutation.

Now, a direct implementation from this description might lead to code like this:

function Cost(D) ...

function Best_order(D)
    for D1 in permutations(D)
        Found = true
        for D2 in permutations(D)
            Found = false if cost(D2) > cost(D1)
        return D1 if Found

Which is O(n*n!^2), e.g. pretty awful--especially compared to the O(n log(n)) someone with insight would find, by simply sorting D.

My question: can you come up with a plausible problem description which would naturally lead the unwary into a worse (or differently awful) implementation of a sorting algorithm?

+2  A: 

This isn't a direct answer, but I think more clarification is needed.

Is di allowed to be negative? If so, sorting alone is not enough, as far as I can see.

For example:

d0 = 0

deliveries = (-1,1,1,2)

It seems the optimal path in this case would be 1 > 2 > 1 > -1.

Edit: This might not actually be the optimal path, but it illustrates the point.

recursive
Good catch! I've amended the problem (by stating that the base station is at one end of the route) to eliminate the loophole, and voted your answer up. Thanks!
MarkusQ
A: 

Isn't this just the (NP-Hard) Travelling Salesman Problem? It doesn't seem likely that you're going to make it much harder.

Maybe phrasing the problem so that the actual algorithm is unclear - e.g. by describing the paths as single-rail railway lines so the person would have to infer from domain knowledge that backtracking is more costly.

What about describing the question in such a way that someone is tempted to do recursive comparisions - e.g. "can you speed up the algorithm by using the optimum max subset of your best (so far) results"?

BTW, what's the purpose of this - it sounds like the intent is to torture interviewees.

Steve B.
I think it is intended that it is a 1-dimensional space version of the TSP. But that's not exactly clear from the question. If that's true, it is significantly simpler than the full TSP.
recursive
It's not to torture it's to enlighten. The idea is to instill a health dose of critical thinking in people who have had too much experience with doing exactly what they are told as the royal road to success.
MarkusQ
Also, it may become a test case for a compiler optimization strategy debate on down the road.
MarkusQ
Last I checked, wasn't Traveling Salesman NP-Complete, not NP-Hard?
pfunk
@pfunk - The Hamiltonian Cycle problem (the decision version of TSP) is NP-Complete and that is a subset of TSP therefore TPS is NP-Hard.
Gavin Miller
+6  A: 

I assume you're using this question for an interview to see if the applicant can notice a simple solution in a seemingly complex question.

[This assumption is incorrect -- MarkusQ]

You give too much information.

The key to solving this is realizing that the points are in one dimension and that a sort is all that is required. To make this question more difficult hide this fact as much as possible.

The biggest clue is the distance formula. It introduces a penalty for changing directions. The first thing an that comes to my mind is minimizing this penalty. To remove the penalty I have to order them in a certain direction, this ordering is the natural sort order.

I would remove the penalty for changing directions, it's too much of a give away.

Another major clue is the input values to the algorithm: a list of integers. Give them a list of permutations, or even all permutations. That sets them up to thinking that a O(n!) algorithm might actually be expected.

I would phrase it as:

Given a list of all possible permutations of n delivery locations, where each permutation of deliveries (d1, d2, ..., dn) has a cost defined by:

Return permutation P such that the cost of P is less than or equal to any other permutation.

All that really needs to be done is read in the first permutation and sort it.

If they construct a single loop to compare the costs ask them what the big-o runtime of their algorithm is where n is the number of delivery locations (Another trap).

Ben S
Ummm... why doesn't my latex image show up? I shows up in the preview.
Ben S
It's not an interview question. I'm not hiring any more than the next guy. It's an example to prod people's thinking.
MarkusQ
The bog-O cost isn't any greater than my version (none were), but the teachable points count is higher, so I'm going to selected this one.
MarkusQ
+1  A: 

YOu could rephrase it, having first found the optimal solution, as

"Give me a proof that the following convination is the most optimal for the following set of rules, where optimal means the smallest number results from the sum of all stage costs, taking into account that all stages (A..Z) need to be present once and once only.

Convination:

A->C->D->Y->P->...->N

Stage costs:

A->B = 5,
B->A = 3,
A->C = 2,
C->A = 4,
...
...
...
Y->Z = 7,
Z->Y = 24."

That ought to keep someone busy for a while.

dsm
But this leaves them with the non-trivial problem of analyzing the table. Probably more realistic (e.g. the client who gives you the costs as two spread sheets, a manila folder of marked up printouts of earlier version, and some post-it notes) but I think it would slog down a group exercise.
MarkusQ
+1  A: 

This reminds me of the Knapsack problem, more than the Traveling Salesman. But the Knapsack is also an NP-Hard problem, so you might be able to fool people to think up an over complex solution using dynamic programming if they correlate your problem with the Knapsack. Where the basic problem is:

can a value of at least V be achieved without exceeding the weight W?

Now the problem is a fairly good solution can be found when V is unique, your distances, as such:

The knapsack problem with each type of item j having a distinct value per unit of weight (vj = pj/wj) is considered one of the easiest NP-complete problems. Indeed empirical complexity is of the order of O((log n)2) and very large problems can be solved very quickly, e.g. in 2003 the average time required to solve instances with n = 10,000 was below 14 milliseconds using commodity personal computers[1].

So you might want to state that several stops/packages might share the same vj, inviting people to think about the really hard solution to:

However in the degenerate case of multiple items sharing the same value vj it becomes much more difficult with the extreme case where vj = constant being the subset sum problem with a complexity of O(2N/2N).

So if you replace the weight per value to distance per value, and state that several distances might actually share the same values, degenerate, some folk might fall in this trap.

Robert Gould
That's a very interesting idea. My presentation assumes all v(j) = 1 (all packages must be delivered, none are "more important" than the others). It may be possible to to fold this into my cost function,...
MarkusQ
A: 

You need to be clearer on whether the delivery truck has to return to base (making it a round trip), or not. If the truck does return, then a simple sort does not produce the shortest route, because the square of the return from the furthest point to base costs so much. Missing some hops on the way 'out' and using them on the way back turns out to be cheaper.

If you trick someone into a bad answer (for example, by not giving them all the information) then is it their foolishness or your deception that has caused it?

How great is the wisdom of the wise, if they heed not their ego's lies?

Phil H
The goal here isn't to trick people but to enlighten them. I'm wanting an example (or examples) of problems that tempt you into bad implementations by the way they are worded for a group exercise, not a ambush.
MarkusQ