I read the article mentioned above. I have the math background to implement it, but not the patience. If you are going to use Newton's method to approximate something, why go to the bother of doing the derivatives, the algebra, and all that as well?
Here is my idea:
1) Assume ellipse is (x/a)^2 + (y/b)^2 = 1
That is, the origin is at zero and the rotation angle is zero.
(Transform your ellipse and point of interest P1 by rotation and translation if necessary before beginning.)
2) Convert the ellipse into parametric form.
x = a cos t
y = b sin t
3) Divide the parametric range of "t" into N parts, from 0 to 2*PI.
N = 16 seems like a good number (22.5 degrees).
4) Iterate through the N points along the ellipse, stepping t by 2*PI/N each time.
5) Compute the point P2 on the ellipse.
6) Compute the distance from P1 to P2.
7) Keep track of the closest distance dmin found so far, and the value of the parameter as tmin.
8) Start a second loop, but shrink the range for t to being (tmin - 2*PI/N, tmin + 2*PI/N).
9) Repeat the search, dividing this smaller range by N again.
10) Add new loops as necessary until the distance between successive points is less than the tolerance that matters to you.
Approximating the ellipse as a circle with the major radius, since C = 2*PI*R, successive points being tested will be one pixel apart when N = 2*PI*R.
Using 16 steps per loop, the number of points to be compared is 16*log8(PI*R).
As a further refinement, I would reflect the point into the first coordinate (positive x & y). That saves computations against 3/4 of the ellipse.
The above algorithm leads to much simpler code: easier to write and debug. Efficiency is another matter. For many applications, it should be good enough.