Minimizing a function is like trying to find lowest point on a surface. Think of yourself walking on a hilly surface and that you are trying to get to the lowest point. You would find the direction that goes downhill and walk until it doesn't go downhill anymore. Then you would chose a new direction that goes downhill and walk in that direction until it doesn't go downhill anymore, and so on. Eventually (hopefully) you would reach a point where no direction goes downhill anymore. You would then be at a (local) minimum.
The LM algorithm, and many other minimization algorithms, use this scheme.
Suppose that the function being minimized is F and we are at the point x(n) in our iteration. We wish to find the next iterate x(n+1) such that F(x(n+1)) < F(x(n)), i.e. the function value is smaller. In order to chose x(n+1) we need two things, a direction from x(n) and a step size (how far to go in that direction). The LM algorithm determines these values as follows -
First, compute a linear approximation to F at the point x(n). It is easy to find out the downhill direction of a linear function, so we use the linear approximating function to determine the downhill direction.
Next, we need to know how far we can go in this chosen direction. If our approximating linear function is a good approximation for F for a large area around x(n), then we can take a fairly large step. If it's a good approximation only very close to x(n), then we can take only a very small step.
This is what LM does - calculates a linear approximation to F at x(n), thus giving the downhill direction, then it figures out how big a step to take based on how well the linear function approximates F at x(n). LM figures out how good the approximating function is by basically taking a step in the direction thus determined and comparing how much the linear approximation to F decreased to the how much the the actual function F decreased. If they are close, the approximating function is good and we can take a little larger step. If they are not close then the approximation function is not good and we should back off and take a smaller step.