views:

252

answers:

3

http://stackoverflow.com/questions/2293762/problem-with-arithmetic-using-logarithms-to-avoid-numerical-underflow-take-2

Having seen the above and having seen softmax normalization I was trying to normalize a vector while avoiding overflow -

that is if I have an array x[1], x[2] x[3], x[4], ... , x[n]

the normalized form for me has the sum of squares of elements as 1.0 and is obtained by dividing each element by sqrt(x[1]*x[1]+x[2]*x[2]+...+x[n]*x[n])

now the sum of squares can overflow even if the square root is small enough to fit into a floating point variable, so I imagined one could do something like s=(2*log(fabs(x[1]))+2*log(fabs(x[2]))+...+2*log(fabs(x[n])))/2

and calculating the elements as

exp(log(fabs(x[1]))-s), ..., exp(log(fabs(x[n]))-s

BUT

The above is incorrect as log(A+B) is not log(A)+log(B) - now is there a way to do vector normalization that avoids overflow better?

+2  A: 

You seem to be making the assumption that:

log(x^2 + y^2)

is the same as:

log(x^2) + log(y^2)

However, this isn't correct, as you can't simplify the logarithm of a sum like that.

interjay
+2  A: 

KennyTM is correct - your ideas about logarithms are wrong.

You can't use an L2 norm, because it requires that you calculate the magnitude of the vector, which is exactly what you're having overflow issues with.

Perhaps the L-infinity norm, where you divide each component in the vector by the absolute value of the maximum component first will be better. Be sure to hang onto that max absolute value so you can get the right magnitude back.

I understand completely that you need the L2 norm, but if overflow is indeed an issue you'll need to take intermediate steps to get it:

  1. Find the max absolute value of the vector.
  2. Divide each component by the max absolute value to normalize; max value is now +/- 1.
  3. Calculate the square root of the sum of squares of normalized components. I'd recommend sorting the values and adding them in ascending order to make sure that small components aren't lost.
  4. Multiply by the max absolute value to get the L2 norm of the original vector.
duffymo
I am afraid I need the L2 norm in this case.
muscicapa
+4  A: 

Instead of

norm  = sqrt(x[1] * x[1] + ... + x[n] * x[n])

you might want to divide the elements of the vector by the maximum possible value before squaring

max_x = max(x[1], ..., x[n])
y[1] = x[1] / max_x / n
...
y[n] = x[n] / max_x / n
norm = n * sqrt(y[1] * y[1] + ... + y[n] * y[n]) * max_x

The norm of the y vector should then be equal or smaller than zero. The value of n * max_x could still overflow, so you need to be careful there, too, that the operations are executed in a non-overflowing order.

Debilski
Dividing by n is unnecessary. The max value alone is sufficient to bring all values into the range [-1, 1]
duffymo
But for a certain `N` even a sum of ones will overflow.
Debilski