views:

1470

answers:

9

Suppose you have a function 'normalize' which takes a list of numbers (representing a vector) as input and returns the normalized vector. What should the result be when the vector is all zeros or the sum of its components is zero?

A: 

(0,0,0) should be (0,0,0) normalized plus a warning (or exception) maybe.
mathematically it's not defined I guess.

qwerty
+13  A: 

Mathematically speaking, the zero vector cannot be normalized. Its length will always remain 0.

For given vector v = (v1, v2, ..., vn) we have: ||v|| = sqrt(v1^2 + v2^2 + ... + vn^2). Let us remember that a normalized vector is one that has ||v||=1.

So for v = 0 we have: ||0|| = sqrt(0^2 + 0^2 + ... + 0^2) = 0. You can never normalize that.

Also important to note that to ensure consistency, you should not return NaN or any other null value. The normalized form of v=0 is indeed v=0.

Yuval A
Why couldn't you? A normalised vector's size is in the range 0..1. A zero vector can very much be normalised.
Peter Perháč
Incorrect. For normalized vector v we have ||v||=1 exactly.
Yuval A
@mmyers - thanks for the edit, I miss that all the time. Did you notice we have amazingly similar stats? :)
Yuval A
Yes, I noticed it when you got the Enlightened badge earlier today. (Actually, I guess you didn't get an Enlightened badge. I wonder why not?)
Michael Myers
A: 

Well, you'd have to divide by zero, which you can't do, so I think most languages would have some sort of NaN value.

References:

  • XNA
  • Apple (you also have to pick an arbitrary direction for the vector)
  • Blender (using Python)
Chris Doggett
+2  A: 

Pretty much like 0/0. Should throw exception or return NaN.

vartec
+1  A: 

The zero vector is already normalized, under any definition of the norm of a vector that I've ever come across, so that's one case dealt with.

As for a vector with components which sum to zero - well it depends on the definition of norm that you use. With the plain old L2-norm (Euclidean distance between origin and vector) the standard formula for calculating the normalized vector should work fine since it first squares the individual components.

High Performance Mark
+4  A: 

It's even worse than Yuval suggests.

Mathematically, given a vector x you are looking for a new vector x/||x||

where ||.|| is the norm, which you are probably thinking of as a Euclidean norm with

||.|| = sqrt(dot(v,v)) = sqrt(sum_i x_i**2)

These are floating point numbers, so it's not enough to just guard against dividing by zero, you also have a floating point issue if the x_i's are all small (they may underflow and you lose the magnitude).

Basically what it all boils down to is that if you really need to be able to handle small vectors properly, you'll have to do some more work.

If small and zero vectors don't make sense in your application, you can test against the magnitude of the vector and do something appropriate.

(note that as soon as you start dealing with floating point, rather than real, numbers, doing things like squaring and then square rooting numbers (or sums of them) is problematic at both the large an small ends of the representable range)

bottom line: doing numerical work correctly over all cases is trickier than it first looks.

For example, off the top of my head potential problems with this (normalization) operation done in a naive way

  • all components (the x_i's) too small
  • any single component too large (above square root of max representable) will return infinity. This cuts the available magnitudes componentwise by sqrt .
  • if the ratio of a large component to a small component is too large, you can effectively lose the small components direction if you aren't careful
  • etc.
simon
A: 

As has already been mentioned several times, you can't normalize a zero vector. So, your options are:

  1. Return the zero vector
  2. Return NaN
  3. Return a bit indicating if the vector was successfully normalized, in addition to the result if successful
  4. Throw an exception

Option 4 is not very good because some languages (such as C) don't have exceptions, and normalizing a vector is typically found in very low-level code. Throwing an exception is rather expensive, and any code that may want to handle the zero/small vector case is going to be given an unnecessary performance hit when that happens.

Option 1 has the problem that the return value won't have a unit length, and so it could silently introduce bugs in the calling code that assumes the resulting vector has unit length.

Option 2 has a similar problem to option 1, but because NaNs are usually much more noticeable than zeroes, it will likely manifest more easily.

I think Option 3 is the best solution, although it does make the interface more complicated. Instead of saying

vec3 = myVec.normalize();

You now have to say something like

vec3 result;
bool success = myVec.normalize(&result);
if(success)
    // vector was normalized
else
    // vector was zero (or small)
Adam Rosenfield
A: 

Given a vector v, to normalize it means keeping its direction and making it unit-length by multiplying it by a well-chosen factor.

This is clearly impossible for the zero vector, because it does not really have a direction, or because its length cannot be changed by mutltiplying it by some factor (it will always be equal to zero).

I would suggest that whatever procedure you would like to use your vector for, and that requires this vector to be normalized, is not well-defined for zero vectors.

Fanfan
+2  A: 

Hello.

Mathematically speaking, the zero vector cannot be normalized. This is an example of what we call in computational geometry a "degenerate case", and this is a huge topic, making much headache for geometry algorithm designers. I can imagine the following approaches to the problem.

  1. You do not make anything special about the zero vector case. If your vector type has floating point typed coordinates, then you will get zero or infinite coordinates in the result (due to dividing by zero).
  2. You throw a degenerate_case_exception.
  3. You introduce a boolean is_degenerate_case output parameter to your procedure.

Personally I in my code use the 3 approach everywhere. One of its advantages is that it does not let the programmer to forget to deal with degenerate cases.

Note, that due to the limited range of floating point numbers, even if the input vector is not equal to the zero vector, You may still get infinite coordinates in the output vector. Because of this, I do not consider the 1. approach to be a bad design decision.

What I can recommend You is to avoid the exception throwing solution. If the degenerate cases are rare among the others, then the exception throwing will not slow down the program. But the problem is that in most cases You can not know that degenerate cases will be rare.

Zoli