views:

621

answers:

7

This is kind of more generic question, isn't language-specific. More about idea and algorithm to use.

The system is as follows:

It registers small loans between groups of friends. Alice and Bill are going to lunch, Bill's card isn't working, so Alice pays for his meal, $10.
The next day Bill and Charles meet each other on a railway station, Chales has no money for ticket, so Bill buys him one, for $5. Later that day Alice borrows $5 from Charles and $1 from Bill to buy her friend a gift.

Now, assuming they all registered that transactions in the system, it looks like this:

Alice -> Bill $10
Bill -> Alice $1
Bill -> Charles $5
Charles -> Alice $5

So, now, only thing that needs to be done is Bill giving Alice $4 (he gave her $1 and Charlie transferred his $5 to Alice alredy) and they're at the initial state.

If we scale that to many diffrent people, having multiple transaction, what would be the best algorithm to get as little transactions as possible?

A: 

If you take states as nodes of graph then you will be able to use shortest path algorithm to know the answer.

TheVillageIdiot
That's technically true, but unless you have a pruning algorithm, it will take a worst case of exponential time to the number of states in order to complete, assuming that testing each state is O(1).
Brian
A: 

You should be able to solve this in O(n) by first determining how much each person owes and is owed. Transfer the debts of anyone who owes less than he is owed to his debtors (thus turning that person into an end point). Repeat until you can't transfer any more debts.

Elie
This is O(1), but may result in as much as 2*OPT transfers, since in the best case a transfer eliminates 2 people from consideration but your algorithm may sometimes only eliminate 1 person from consideration.
Brian
I'm not sure where you got O(1) from. You have to go through all the people (hence the O(n) factor). However, the reduction will eliminate only 1 person each time (and exactly one person each time).
Elie
+2  A: 

Well, the first step is to totally ignore the transactions. Just add them up. All you actually need to know is the net amount of debt a person owes/owns.

You could very easily find transactions by then creating a crazy flow graph and finding max flow. Then a min cut...

Some partial elaboration: There is a source node, a sink node, and a node for each person. There will be an edge between every pair of nodes except no edge between source node and sink node. Edges between people have infinite capacity in both directions. Edges coming from source node to person have capacity equal to the net debt of the person (0 if they have no net debt). Edges going from person node to sink node have capacity equal to the net amount of money that person is owed (0 if they have no net owed).

Apply max flow and/or min cut will give you a set of transfers. The actual flow amount will be how much money will be transfered.

Brian
Could you elaborate on how max flow could be used here? What are the nodes? What are the edges? I'm -1ing just to get your attention, a good explanation will flip the sign :)
j_random_hacker
@j_random_hacker: I added some elaboration. That said, I admit I am unsure how good the answers this gives will be. I invite you to analyze further :)
Brian
Well, I've offered a graph-based algorithm, which probably doesn't give optimal results. Anyone care to offer a linear programming solution and a dynamic programming solution?
Brian
Thanks Brian. But the fact that capacities on the person-to-person edges is infinite means that many suboptimal solutions could be produced. E.g. if there are 4 people, a owes b $20 and c owes d $20, then max flow *might* find the optimal solution (20 on the edge a->b and 20 on the edge c->d), but putting $17.50 on edges a->b and c->d, and $2.50 on a->d and c->b, is also a maximum flow.
j_random_hacker
We somehow need to reduce the *number* of edges with nonzero flow -- I looked at Minimum Cost Flow, but that seems to have the same problem since the per-edge cost is multiplied by the flow through that edge (we want a fixed per-edge cost if there is *any* flow in that edge).
j_random_hacker
+1  A: 

Only if someone owes more than 2 people, whom also owe to the same set, can you reduce the number of transactions from the simple set.

That is, the simple set is just find each balance and repay it. That's no more than N! transactions.

If A owes B and C, and some subset of B C owe each other, so B owes C, then instead of: A -> B, A -> C (3 transactions). You'd use: A -> B, B -> C (2 transactions).

So in other words you are building a directed graph and you want to trim vertices on order to maximize path length and minimize total edges.

Sorry, I don't have an algorithm for you.

Adam Luter
+5  A: 

This actually looks like a job for Superman, oops, I mean double entry accounting.

Your transactions should be structured as bookkeeping entries thus:

                          Alice  Bill  Charles  Balance
Alice   -> Bill    $10      10    10-       0        0
Bill    -> Alice    $1       9     9-       0        0
Bill    -> Charles  $5       9     4-       5-       0
Charles -> Alice    $5       4     4-       0        0

And there you have it. At each transaction, you credit one ledger account and debit another so that the balance is always zero. At at the end, you simply work out the minimal number transactions to be applied to each account to return it to zero.

For this simple case, it's a simple $4 transfer from Bill to Alice. What you need to do is to reduce at least one account (but preferably two) to zero for every transaction added. Let's say you had the more complicated:

                          Alice  Bill  Charles  Balance
Alice   -> Bill    $10      10    10-       0        0
Bill    -> Alice    $1       9     9-       0        0
Bill    -> Charles  $5       9     4-       5-       0
Charles -> Alice    $5       4     4-       0        0
Charles -> Bill     $1       4     5-       1        0

Then the transactions needed would be:

Bill     -> Alice   $4       0     1-       1        0
Bill     -> Charles $1       0     0        0        0

Update:

There are some states where a simple greedy strategy does not generate the best solution (kudos to j_random_hacker for pointing this out). One example is:

                 Alan  Bill  Chas  Doug  Edie  Fred  Bal
Bill->Alan   $5    5-    5     0     0     0     0    0
Bill->Chas  $20    5-   25    20-    0     0     0    0
Doug->Edie   $2    5-   25    20-    2     2-    0    0
Doug->Fred   $1    5-   25    20-    3     2-    1-   0

Clearly, this could be reversed in four moves (since four moves is all it took to get there) but, if you choose your first move unwisely (Edie->Bill $2), five is the minimum you'll get away with.

You can solve this particular problem with the following rules:

  • (1) if you can wipe out two balances, do it.
  • (2) otherwise if you can wipe out one balance and set yourself up to wipe out two in the next move, do it.
  • (3) otherwise, wipe out any one balance.

That would result in the following sequence:

  • (a) [1] not applicable, [2] can be achieved with Alan->Bill $5.
  • (b) [1] can be done with Chas->Bill $20.
  • (c) and (d), similar reasoning with Doug, Edie and Fred, for four total moves.

However, that works simply because of the small number of possibilities. As the number of people rises and the group inter-relations becomes more complex, you'll most likely need an exhaustive search to find the minimum number of moves required (basically the rules 1, 2 and 3 above but expanded to handle more depth).

I think that is what will be required to give you the smallest number of transactions in all circumstances. However, it may be that that's not required for the best answer (in terms of "bang per buck"). It may be that even the basic 1/2/3 rule set will give you a good-enough answer for your purposes.

I'm reminded of people complaining that calculating PI with the formula 355/113 isn't accurate. In fact, it has an error of about one part in 125 million so that the calculation of the circumference of a circle as big as the whole Earth would only be out by about the length of your average car.

Back-of-the-envelope proof follows for those of a more disbelieving nature:

  2 x PI x R (for R = 6300km)       2 x (355/113) x R
= 2 x PI x 6,300,000m             = 2 x (355/113) x 6,300,000m
= 39,584,067.44m                  = 39,584,070.8m
paxdiablo
Sure, if m is the number of nonzero balances then at best m/2 transactions will suffice and at worst m will be needed, but the devil's in that last part -- how do you minimise the number of steps? I'm not convinced that a greedy approach will always work.
j_random_hacker
@j_random_hacker. There is no single transaction that can reduce more than two people to zero balance (as you've said) so, as long as you use the greedy method (zero two people) where possible or zero one person if greedy is not possible, that's the minimum number of moves. Because addition is commutative, it makes no difference what order thing are done in (within those rules), just make sure no-one leaves zero balance once they've been put there.
paxdiablo
@Pax: No, your simple greedy approach isn't optimal for all cases. Consider these balances: a=-5 b=25 c=-20 d=3 e=-2 f=1. If you cancel a with b first, then you can finish in a total of 4 moves because the pair (b=20, c=-20) appears; but some choices that your greedy algo could make will result in 5 moves (e.g. f->b, e->b, d->a, a->b, c->b) because no such "matched pair" appears.
j_random_hacker
Basically, whenever the balances can be broken into k "independent subproblems", you can solve each one in one step less than the number of balances in that subproblem, reducing the total number of steps needed by k -- but *finding* those subproblems is very hard. -1 for now (to get your attention :).
j_random_hacker
I don't think your argument makes sense (but that may be because I didn't clearly explain double-entry accounting). You would never transfer money from f to b (or b to f) because they're both on the positive side of the ledger. In addition, your initial values are not balanced nor does your sequence make sense - the initial transfer f->b should have set b's value to 0 so it would not be followed by a e->b transfer. If you can come up with a valid sequence indicating the problem, I'll rework the answer.
paxdiablo
@j_random_hacker, after further investigation (finding a sequence that breaks the greedy algorithm), I've updated the answer to (hopefully) incorporate your concerns.
paxdiablo
@Pax: Whoops, yes "f=1" in my example should have been "f=-1". (Sorry that must have been quite confusing!) Your update's good, and yes, based on the NP-hardness of the Subset Sum Problem I would guess that (something amounting to) an exhaustive search is actually needed to guarantee optimality. For practical cases your heuristic will do just fine of course.
j_random_hacker
+3  A: 

Intuitively, this sounds like an NP-complete problem (it reduces to a problem very like bin packing), however the following algorithm (a modified form of bin packing) should be pretty good (no time for a proof, sorry).

  1. Net out everyone's positions, i.e. from your example above:

    Alice = $4 Bill = $-4 Charles = $0

  2. Sort all net creditors from highest to lowest, and all debtors from lowest to highest, then match by iterating over the lists.

  3. At some point you might need to split a person's debts to net everything out - here it is probably best to split into the biggest chunks possible (i.e. into the bins with the most remaining space first).

This will take something like O(n log n) (again, proper proof needed).

See the Partition Problem and Bin Packing for more information (the former is a very similar problem, and if you limit yourself to fixed precision transactions, then it is equivalent - proof needed of course).

jwoolard
+1. Amazing that this answer, which is actually the closest to being correct, gets voted down. If an unbounded number of transactions are allowed to take place, then yes, this problem is equivalent to solving multiple subset-sum problems -- every subset of m nonzero balances that sums to 0 can be resolved in m-1 steps. See my (upcoming) answer for more details.
j_random_hacker
Agree 100%. n-log-n due to the sort requirement. Just want to note explicitly that this only becomes a bin packing problem when there is some probability that payments may be *exactly* matched, thereby reducing the total number of required transactions. To the extent that payment granularity departs from some small set of integers and approaches something more like the real number set, simple sorting and netting is as good as it gets.
John Pirie
+2  A: 

I found a practical solution which I implemented in Excel:

  • find out who ows the most

  • let that person pay the complete amount he ows to the one who should get the most

  • that makes the first person zero

  • repeat this proces taken into account that one of (n-1) persons has a changed amount

It should result in a maximum of (n-1) transfers and the nice thing about it is that no one is doing more than one payment. Take the (modified) example of jrandomhacker:

a=-5 b=25 c=-20 d=3 e=-2 f=-1 (sum should be zero!)

  1. c -> b 20. result: a=-5 b=5 c=0 d=3 e=-2 f=-1

  2. a -> b 5 result: a=0 b=0 c=0 d=3 e=-2 f=-1

  3. e -> d 2 result a=0 b=0 c=0 d=1 e=0 f=-1

  4. f -> d 1

Now, everyone is satisfied and no one is bothered by making two or more payments. As you can see, it is possible that one person recieves more than one payment (person d in this example).

jvreeburg