views:

207

answers:

7

This can be broken down into a simple trio of equations:

a + b = 3
b + c = 5
a + c = 4

How can I best approximate the values? Note, I'll have many more such totals and variables in real applications. Particularly, I want to find if its possible to usefully approximate the cost of food by item lists and totals from grocery receipts. I assume if I can figure out costs, there will be varying ranges of accuracy I can expect, so an extra would be knowing how likely the approximation is to be correct and within what range the price must be.

EDIT: I just don't see this getting an answer I'm comfortable with, because I failed to properly frame the question in the first place.

+3  A: 

You are dealing with a system of linear equations, to solve it:

http://www.codeproject.com/KB/cs/LinearEquationsSystemSoln.aspx

The basic idea is to build a matrix of coefficient in the equations and use LU decomposition or Gaussian Elimination to deduce the values of variables. LAPACK can help.

Follow up: LAPACK is a full featured library for dealing with all kind of linear algebra stuff (and it's damn efficient at this, there are also GPU based libraries, such as CUBLAS which runs on NVIDIA GPUs using CUDA).

If you are dealing with same number of equations than unknowns, you'll be dealing with simple equation solving solution.

Basically, if you have more unknowns than equations, you'll be dealing with something called underdetermined system. Similarly, you might be dealing with systems with more equations than unknowns (overdetermined systems). If you have an underdetermined system, you probably want to look for a minimum norm solution (there can be infinite number of solutions to a single underdetermined system). For overdetermined systems, we might be looking for so called least squares solution. For more info about these solutions, look at this: http://www.netlib.org/lapack/lug/node27.html. These concepts are from linear algebra.

Mehrdad Afshari
I am not sure this applies, but I'm still reading over the article (Thanks!). Note that I have many more variables and I don't have as many equations are combinations, and often the items appear multiple times. I am hoping this article can give me the start to accomplish that.
ironfroggy
Yes. I linked to that article to give an idea about the whole thing. The second paragraph of my answer is more relevant. Particularly, LAPACK has all kind of stuff for dealing with linear equations.
Mehrdad Afshari
*actaully he would want LINPACK, not lapack.
nlucaroni
I think LAPACK provides LINPACK functionality. To quote its home page: "The original goal of the LAPACK project was to make the widely used EISPACK and LINPACK libraries run efficiently on shared-memory vector and parallel processors."
Mehrdad Afshari
I learned about this stuff in college. I'd be very surprised if basic linear algebra concepts (especially Gaussian Elimination) were not helpful in reducing such equations.
Brian
A: 

If you're using .NET, check out Microsoft Solver Foundation. This post provides a sample of how to use it to solve linear equations.

dommer
+3  A: 

A more general solution to this problem where you could have more equations than unknowns would be to find the Linear Regression/Least Squares solution. The regression statistics will give you the information you need about the accuracy of the results.

Mark Lavin
+1  A: 

For Fun:

a+b=3 | a=3-b | a=3-(5-c) | a=c-2 | a=(4-a)-2 | 2a=2  | a=1   |       |
b+c=5 | b=5-c | b=5-c     | b=5-c | b=5-c     | b=5-c | b=5-c | b=5-3 | b=2
a+c=4 | c=4-a | c=4-a     | c=4-a | c=4-a     | c=4-a | c=4-1 | c=3   |
Sergio
Also see: http://www23.wolframalpha.com/input/?i=a+%2B+b+%3D+3%2C+b+%2B+c+%3D+5%2C+a+%2B+c+%3D+4
Jeff Moser
+3  A: 

If you have more variables than equations there will be an infinite number of exact solutions, and the possible values will be all over the place, not a useful approximation.

You need at least as many equations as variables, and substantially different ones (just copying one equation a few times won't help). So if you have fewer equations than variables it won't work, otherwise look for a statistics or linear algebra library.

sth
Do these rules still hold if I can make assumptions like all variables are >0, as they are positive they all must be less than the sum, and they are more likely closer to the average than the min or max? Over time, there will be more equations than variables, certainly.
ironfroggy
Basically you end up with something like a + b + c + d + e = 1234, which leaves a lot of possibilities even if all variables are >0...
sth
+1  A: 

For a square system like this, you can solve the matrix equation

| 1 1 0 ||a| |3|
| 0 1 1 ||b|=|5|
| 1 0 1 ||c| |4|

providing the matrix on the left is invertible, as this on is.

Generally, though, you will either have too many variables, or two few variables. Two many equations is better, since the Least-Squares approximation is available. To find the least squares solution, solve the normal equations ATA x = ATb for x, where x = (a, b,c), b = (3,5,4), and A is the coefficient matrix. Note that the superscript T refers to the transpose of the matrix.

I'm sure there is some code out there that accomplishes this.

When the system is underdetermined, though, you will have infinitely many solutions, even assuming a, b, and c are positive.

FarmBoy
I think you mean to say "Too many equations is better", right? Too many variables == underdetermined.
j_random_hacker
thanks, fixed that.
FarmBoy
A: 

For the range in which a price must be, use a Linear Programming solver in your favorite numerical algebra system. Your equality constraints are the individual equations from the receipts; in addition you will want to add inequality constraints to guarantee that each price is nonnegative. To find the possible price range of item X, set your objective function to the price of X, and solve for the minimum and maximum separately.

This approach may fail if an item can have different prices on different receipts; the linear program may become infeasible, or there might be more subtle effects.

Jouni K. Seppänen