If my goal is speed, and the number of points weren't huge (millions,) I'd focus on memory footprint as much as algorithmic complexity.
An unbalanced k-d tree is best on paper, but it requires pointers, which can expand memory footprint by 3x+, so it is out.
A balanced k-d tree requires no storage, other than for an array with one scalar for each point. But it too has a flaw: the scalars can not be quantized - they must be the same 32 bit floats as in the original points. If they are quantized, it is no longer possible to guarantee that a point which appears earlier in the array is either on the splitting plane, or to its left AND that a point which appears later in the array is either on the splitting plane, or to its right.
There is a data structure I developed that addresses this problem. The Synergetics folks tell us that volume is experientially four-directional. Let's say that a plane is likewise experientially three-directional.
We're accustomed to traversing a plane by the four directions -x, +x, -y, and +y, but it's simpler to use the three directions a, b, and c, which point at the vertices of an equilateral triangle.
When building the balanced k-d tree, project each point onto the a, b, and c axes. Sort the points by increasing a. For the median point, round down, quantize and store a. Then, for the sub-arrays to the left and right of the median, sort by increasing b, and for the median points, round down, quantize, and store b. Recurse and repeat until each point has stored a value.
Then, when testing a circle (or whatever) against the structure, first calculate the maximum a, b, and c coordinates of the circle. This describes a triangle. In the data structure we made in the last paragraph, compare the median point's a coordinate to the circle's maximum a coordinate. If the point's a is larger than the circle's a, we can disqualify all points after the median. Then, for the sub-arrays to the left and right (if not disqualified) of the median, compare the circle's b to the median point's b coordinate. Recurse and repeat until there are no more points to visit.
This is similar in theme to the BIH data structure, but requires no intervals of -x and +x and -y and +y, because a, b, and c are just as good at traversing the plane, and require one fewer direction to do it.