I've written some basic graphing software in Clojure/Java using drawLine() on the graphics context of a modified JPanel. The plotting itself is working nicely, but I've come to an impasse while trying to convert a clicked pixel to the nearest data point.

I have a simple bijection between the list of all pixels that mark end points of my lines and my actual raw data. What I need is a surjection from all the pixels (say, 1200x600 px^2) of my graph window to the pixels in my pixel list, giving me a trivial mapping from that to my actual data points.

e.g.

<x,y>(px) ----> <~x,~y>(pixel points) ----> <x,y>(data)

This is the situation as I'm imagining it now:

  • A pixel is clicked in the main graph window, and the MouseListener catches that event and gives me the <x,y> coordinates of the action.

  • That information is passed to a function that returns a predicate, which determines whether a value passed to it is "good enough". I then filter through the list with that predicate and take the first value it accepts.

    • Possibly, instead of a predicate, it returns a function which is passed the list of pixel points and returns a list of tuples (x index), where the magnitude of x indicates how good the point is and index indicates where that point is in the list. I'd do this with both the x points and the y points. I then filter through that list, find the tuple with the max x, and take that to be the point the user most likely meant.
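
A minimal sketch of the predicate idea, to make the accuracy concern concrete. The names `within-radius?` and `first-good-point` are hypothetical, and points are assumed to be `[x y]` vectors of pixel coordinates rather than whatever the real pixel list holds:

```clojure
;; Hypothetical helpers; points are assumed to be [x y] pixel vectors.
(defn within-radius?
  "Returns a predicate that accepts points within `radius` pixels of [cx cy]."
  [cx cy radius]
  (let [r2 (* radius radius)]
    (fn [[px py]]
      (let [dx (- px cx)
            dy (- py cy)]
        ;; compare squared distances, so no square root is needed
        (<= (+ (* dx dx) (* dy dy)) r2)))))

(defn first-good-point
  "First point the predicate accepts, or nil if none is close enough."
  [cx cy radius points]
  (first (filter (within-radius? cx cy radius) points)))

(first-good-point 100 100 5 [[10 10] [98 103] [100 100]])
;; => [98 103]
```

The call returns `[98 103]` even though `[100 100]` is an exact hit later in the list, which is the "not always accurate" behaviour of taking the first acceptable value.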

Are these reasonable solutions to this problem? It seems that the solution which involves confidence ratings (distance from pix-pt, perhaps) may be too processor heavy, and a bit memory heavy if I'm holding all the points in memory again. The other solution, using just the predicate, doesn't seem like it'd always be accurate.

This is a solved problem, as other graphing libraries have shown, but it's hard to find information about it other than in the source of some of these programs, and there's got to be a better way than digging through thousands of lines of Java to find this out.

I'm looking for better solutions, or just general pointers and advice on the ones I've offered, if possible.

Thank you so much!

Isaac

+1  A: 

So I'm guessing something like JFreeChart just wasn't cutting it for your app? If you haven't gone down that road yet, I'd suggest checking it out before attempting to roll your own.

Anyway, if you're looking for the nearest point to a mouse event, getting the point with the minimum Euclidean distance (if it's below some threshold) and presenting that will give the most predictable behavior for the user. The downside is that Euclidean distance is relatively slow for large data sets. You can use tricks like skipping the square root (comparing squared distances gives the same ordering) or BSP trees to speed it up a bit. Whether those optimizations are even necessary depends on how many data points you're working with. Profile a somewhat naive solution in a typical case before going into optimization mode.
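
As a rough sketch of that naive version: one pass over the points, squared distances only, and a pixel threshold. The names `squared-dist` and `nearest-within` are made up for illustration, and points are assumed to be `[x y]` vectors:

```clojure
;; Hypothetical helpers; points and the click are assumed to be [x y] vectors.
(defn squared-dist
  "Squared Euclidean distance between two [x y] points."
  [[x1 y1] [x2 y2]]
  (let [dx (- x1 x2)
        dy (- y1 y2)]
    (+ (* dx dx) (* dy dy))))

(defn nearest-within
  "Closest point to `click`, or nil if even the closest one is farther
  than `max-dist` pixels away (or if `points` is empty)."
  [click max-dist points]
  (when (seq points)
    (let [best (apply min-key #(squared-dist click %) points)]
      ;; the threshold is squared too, so the comparison stays sqrt-free
      (when (<= (squared-dist click best) (* max-dist max-dist))
        best))))

(nearest-within [100 100] 10 [[90 90] [103 104] [300 20]])
;; => [103 104]
(nearest-within [100 100] 3 [[90 90] [103 104] [300 20]])
;; => nil
```

Returning nil when nothing is within the threshold lets the caller ignore clicks that land on empty parts of the graph.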

j flemm
JFreeChart would've added a layer of complexity that I didn't want to deal with, but mostly it was too slow. I'm redrawing thousands of points multiple times a second. Euclidean distance makes sense; it's essentially what I'm toying with right now, but reifying it into a method, sqrt of (sq xdiff) plus (sq ydiff), is helpful. Thank you!
Isaac Hodes
@Isaac: Wow. So you're actually redrawing all those points multiple times a second? It might be worth looking into using an accelerated surface if you aren't already. Also: if you're searching thousands of points and want an interactive response, some sort of spatial optimization like BSP trees is going to be a necessity. It's also going to be a major bummer to write. BSP does parallelize relatively well though. Please keep us updated as you drill down to a solution. I'm interested in what you end up with.
j flemm
Thanks for the tip re: accelerated surfaces; when I work on speeding up the drawing, I might give that a look. Right now, it isn't too much of a problem. I'm generally not redrawing more than a few thousand points a second, and that isn't too much to deal with. As for BSPs, from the Wikipedia article, it doesn't sound like they're particularly applicable: perhaps I'm missing something?
Isaac Hodes
Finally, Euclidean distance isn't working as well as I thought, as the distance between x-pts is always a fixed interval, and it's a small interval, but the difference between two y points can vary wildly (it's a graph of voltages over time, so that makes sense.) The issue is then that the y-vals are getting far more credence than the x-vals, though in reality they should be getting the same relative credence. Normalizing the ys doesn't seem to help much either… I'll report back with more results! Thanks for the help :)
Isaac Hodes
@Isaac: Re: BSP: All a BSP tree would do is allow you to partition your space so you aren't checking every point for every click. Since each child region is nested within a larger parent region, you just find the child leaf and go up the tree however many steps are required to be sure you've found the closest point. Re: Euclidean distance: Since it's a fixed x interval and the y fluctuates wildly, Euclidean probably isn't the way to go. A 1-D Manhattan distance along the x-axis is probably better. Maybe check the two left and right neighboring points and do some y-axis arbitration for obvious cases.
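
One way the 1-D idea could look, as a sketch only. It assumes the pixel points sit in a vector of `[x y]` pairs, evenly spaced along x starting at pixel `x0` with spacing `dx`; `nearest-index` and `nearest-by-x` are hypothetical names, and the "arbitration" here is just one possible heuristic (pick whichever of the three neighbouring points has the closest y):

```clojure
;; Assumes evenly spaced x pixels: point i sits at x = x0 + i*dx.
(defn nearest-index
  "Index of the point whose x is closest to click-x, clamped to [0, n-1].
  No search is needed because the x spacing is fixed."
  [x0 dx n click-x]
  (let [i (Math/round (double (/ (- click-x x0) dx)))]
    (max 0 (min (dec n) i))))

(defn nearest-by-x
  "Index of the chosen point: start from the nearest x column, then let
  the immediate left/right neighbours compete on y distance."
  [points x0 dx [cx cy]]
  (let [n (count points)]
    (when (pos? n)
      (let [i    (nearest-index x0 dx n cx)
            idxs (filter #(< -1 % n) [(dec i) i (inc i)])]
        (apply min-key
               (fn [j] (Math/abs (double (- cy (second (points j))))))
               idxs)))))

(def pts [[50 300] [54 120] [58 310] [62 305]])

(nearest-index 50 4 (count pts) 55)  ;; => 1
(nearest-by-x pts 50 4 [55 308])     ;; => 2
```

In the example, x alone selects the spike at index 1, while the y check moves the choice to index 2, whose y (310) is much closer to the click (308).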
j flemm
Ah, interesting. I might consider it, though it may not be necessary: when a space has too many points to process, its resolution is so low (more pts than pxs) that the whole method becomes inaccurate and unlikely to be used anyway. I might do it anyway, though; it can't hurt. Re: 1-D Manhattan et al, I'm doing that now, and it's working decently. I'm going to add some heuristics to catch the cases where the y's are obvious, as well. Thanks!
Isaac Hodes
A: 

I think your approach is decent. This basically only requires one iteration through your data array, a little simple maths and no allocations at each step so should be very fast.

It's probably as good as you are going to get unless you start using some form of spatial partitioning scheme like a quadtree, which would only really make sense if your data array is very large.

Some Clojure code which may help:

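;; Assumes each point exposes public x and y fields (e.g. java.awt.Point).
(defn squared-distance
  "Squared Euclidean distance from (x, y) to point; no sqrt needed for comparisons."
  [x y point]
  (let [dx (- x (.x point))
        dy (- y (.y point))]
    (+ (* dx dx) (* dy dy))))

(defn closest
  "Returns the point in points nearest to (x, y), or nil if points is empty."
  ([x y points]
    ;; seed the search with the first point; when-let guards the empty case
    (when-let [v (first points)]
      (closest x y (rest points) (squared-distance x y v) v)))
  ([x y points bestdist best]
    (if (empty? points)
      best
      (let [v (first points)
            dist (squared-distance x y v)]
        (if (< dist bestdist)
          (recur x y (rest points) dist v)
          (recur x y (rest points) bestdist best))))))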
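
If the data array does get large, a uniform grid of buckets is a simpler stand-in for the quadtree mentioned above (it is not a quadtree). This is only a sketch: `cell`, `build-grid`, `candidates` and `nearest-in-grid` are hypothetical names, points are assumed to be `[x y]` vectors with non-negative integer pixel coordinates, and only the clicked cell and its eight neighbours are searched, so a point farther away than one cell will not be found:

```clojure
(defn cell
  "Grid cell containing an [x y] point, for square cells of `size` pixels."
  [size [x y]]
  [(quot x size) (quot y size)])

(defn build-grid
  "Map from cell coordinates to the points that fall in that cell."
  [size points]
  (group-by #(cell size %) points))

(defn candidates
  "Points in the clicked cell and its eight neighbouring cells."
  [grid size point]
  (let [[cx cy] (cell size point)]
    (for [dx [-1 0 1]
          dy [-1 0 1]
          p  (get grid [(+ cx dx) (+ cy dy)])]
      p)))

(defn nearest-in-grid
  "Nearest candidate to the click by squared distance, or nil if none."
  [grid size [x y :as click]]
  (let [cands (candidates grid size click)
        d2    (fn [[px py]]
                (+ (* (- px x) (- px x)) (* (- py y) (- py y))))]
    (when (seq cands)
      (apply min-key d2 cands))))

(def grid (build-grid 50 [[10 10] [120 80] [130 95] [400 300]]))

(nearest-in-grid grid 50 [125 90])  ;; => [130 95]
(nearest-in-grid grid 50 [250 250]) ;; => nil
```

The grid is built once per redraw; each click then only measures distances to the handful of points in nearby cells instead of the whole list.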
mikera
Thanks! Distance isn't working too well for me in this case, but spatial partitioning may; I'll look into it.
Isaac Hodes