I am currently interning at a software company, and one of my tasks is to implement mouse gesture recognition. One of the senior developers helped me get started and provided code/projects that use the $1 Unistroke Recognizer http://depts.washington.edu/aimgroup/proj/dollar/. I understand in broad terms what the $1 Unistroke Recognizer does and how it works, but I am a bit overwhelmed trying to understand all of its internals and finer details.

My problem is that I am trying to recognize the gesture of moving the mouse downwards, then upwards. The $1 Unistroke Recognizer determines that the gesture I created was a downwards gesture, which is in fact what it ought to do. What I would really like it to do is say "I recognize a downwards gesture AND THEN an upwards gesture."

I do not know whether my incomplete understanding of the $1 Unistroke Recognizer is what has me scratching my head, but does anyone have any ideas on how to recognize two different gestures when the mouse moves downwards and then upwards?

Here is an idea that I thought might help. I would love for someone who is an expert, or who knows even a bit more than I do, to tell me what they think. Any help or resources would be greatly appreciated.

How My Application Currently Works:

My current application captures points from the mouse cursor's position while the user holds down the left mouse button. The list of points then gets fed to the gesture recognizer, which spits out what it thinks is the best shape/gesture corresponding to the captured points.

My Idea:

Before I feed the points to the gesture recognizer, I want to go through all of them and break them down into separate lines or curves. That way I could feed each line/curve in one at a time and, from the basic movements of down, up, left, right, diagonals, and curves, determine the final shape/gesture.

One way I thought of to determine whether there are separate lines in my list of points is to sample groups of points and look at their slope. If the slope of one sampled group differed by X% from that of another sampled group, it would be safe to assume that a separate line is present.

What I Think Are Possible Problems In My Thinking:

  • Where do I determine the end of one line and the start of the next? If I check the slope of a group of points and determine that a separate line is present, that doesn't necessarily mean I have found the slope of a separate line. For example, if you drew a straight-edged "L" with a right angle and sampled the slope of the points around the corner, the slope would give a reasonable indication that a separate line is present, but those points don't correspond to the start of a separate line.

  • How do I deal with the ever-changing slope of a curved line? The gesture recognizer I use already handles curves the way I want it to. But I don't want my line-splitting method to keep finding so-called separate lines in a curve just because its slope changes every time I sample a group of points. Would I just stop sampling points once the slope had changed by more than X% so many times in a row?

  • I may not be using the correct "type" of math for determining separate lines. Math isn't my strongest subject, but I did do some research. I looked into dot products to see if they would point me in some direction, but I don't know if they will. Has anyone used dot products, or some other method, for something like this?
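On the dot product question: one common use is to measure the angle between two consecutive segments directly, which avoids the undefined slope of a vertical line (exactly the down/up case). The following is only a minimal sketch; the function name `turning_angle` and the tuple-based point format are assumptions for illustration, not part of the $1 Unistroke Recognizer.

```python
import math

def turning_angle(p0, p1, p2):
    """Angle in degrees between segment p0->p1 and segment p1->p2.

    0 means the stroke continues straight on, 180 means it doubles back.
    Points are (x, y) tuples (an assumed format).
    """
    ax, ay = p1[0] - p0[0], p1[1] - p0[1]
    bx, by = p2[0] - p1[0], p2[1] - p1[1]
    len_a = math.hypot(ax, ay)
    len_b = math.hypot(bx, by)
    if len_a == 0 or len_b == 0:
        return 0.0  # repeated point: there is no direction to compare
    # Dot product identity: a . b = |a| |b| cos(theta)
    cos_theta = (ax * bx + ay * by) / (len_a * len_b)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))
```

For a right-angled "L" corner this gives 90, and for a straight down-then-up reversal it gives 180, so a threshold on this angle could stand in for the "slope differs by X%" test.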

Final Thoughts, Remarks, And Thanks:

Part of my problem, I feel, is that I don't know how to ask my question completely. I wouldn't be surprised if this has already been asked (in one way or another) and a solution exists that can be Googled. But my Google searches didn't turn up any solutions, as I just don't know exactly how to ask my question yet. If you find it confusing, please let me know where and why and I will clarify. In doing so, maybe my searches will become more precise and I will be able to find a solution.

I just want to say thanks again for reading my post. I know it's long, but I didn't really know where else to ask. I'm going to talk with some other people around the office, but the best solutions I have used throughout school have all come from the StackOverflow community, so I owe you many thanks.

Edits To This Post:

(7/6 4:00 PM) Another idea I had was to compare all the points before a min/max point. For example, if I moved the mouse downwards then upwards, my starting point would be the current max point, while the point where I start moving the mouse back upwards would be my min point. I could then check whether there are any points after the min point and, if so, say that there could be a new potential line. I don't know how well this will work on other shapes like stars, but that's another thing I'm going to look into. Has anyone done something similar to this before?
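A rough sketch of the min/max idea, restricted to the vertical axis: track the running extreme and split the stroke once the pointer has moved back from it by more than a jitter tolerance. The function name `split_at_vertical_reversals` and the `min_travel` parameter are hypothetical, and the 10 pixel default is a placeholder rather than a tested value.

```python
def split_at_vertical_reversals(points, min_travel=10.0):
    """Split a stroke where its vertical direction reverses.

    points: list of (x, y) tuples (assumed format).
    min_travel: hypothetical jitter tolerance in pixels; the pointer must
    move back this far from the running extreme before a split is made.
    """
    if len(points) < 2:
        return [list(points)]
    segments = []
    start = 0       # index where the current segment begins
    extreme = 0     # index of the furthest point in the current direction
    direction = 0   # +1 = y increasing, -1 = y decreasing, 0 = not yet known
    for i in range(1, len(points)):
        y = points[i][1]
        ext_y = points[extreme][1]
        if direction == 0:
            # Wait until the stroke has clearly started moving one way.
            if abs(y - points[start][1]) >= min_travel:
                direction = 1 if y > points[start][1] else -1
                extreme = i
        elif (y - ext_y) * direction > 0:
            extreme = i  # still travelling the same way: new min/max
        elif (ext_y - y) * direction >= min_travel:
            # Moved back far enough: the extreme was a turning point.
            segments.append(points[start:extreme + 1])
            start = extreme
            direction = -direction
            extreme = i
    segments.append(points[start:])
    return segments
```

Each returned segment shares its turning point with the next one and could be passed to the recognizer on its own. Shapes such as stars would presumably need the same check on the horizontal axis as well, which this sketch does not attempt.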

+1  A: 

If your problem can be narrowed down to breaking apart a general curve into straight or smoothly curved partial lines then you could try this.

Comparing the slopes of the segments and identifying breaking points where the difference is greater than some threshold would work in a very simplified case. Imagine a perfectly formed L-shape where you have a right angle between two straight lines. Obviously the corner point would be the only one where the slope difference is above the threshold, as long as the threshold is between 0 and 90 degrees, and thus an identifiable breaking point.

However, the vertical and horizontal lines may be slightly curved, so the threshold would need to be large enough for these small differences in slope to be ignored as breaking points. You'd also have to decide how sharp a corner the algorithm should pick up as a break. Is 90 degrees or more required, or is even 30 degrees enough? This is an important question.

Finally, to make this robust I would not be satisfied comparing the slopes of two adjacent segments. Hands may shake, corners may be smoothed out and the ideal conditions to find straight lines and sharp corners will probably never occur. For each point investigated for a break I would take the average slope of the N previous segments and compare it to the average slope of the N following segments. This can be efficiently implemented using a running mean. By choosing a good sample number N (depending on the accuracy of the input, the total number of points, etc) the algorithm can avoid the noise and make better detections.

Basically the algorithm would be:

  • For each investigated point (beginning N points into the sequence and ending N points before the end.)
    • Compute average slope of the N previous segments.
    • Compute average slope of the N next segments.
    • If the difference of the averages is greater than the Threshold, mark current point as a breaking point.

This is quite off the top of my head. You'd have to try it in your application.
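The steps above can be sketched as follows. Two things here are assumptions that go beyond the description: the sketch averages unit direction vectors and compares their angles instead of averaging raw slopes (dy/dx is undefined for vertical segments), and it collapses each run of neighbouring over-threshold points into its middle index, since points next to a corner also exceed the threshold. The name `find_break_points` and the default values are placeholders.

```python
import math

def find_break_points(points, n=4, threshold_deg=45.0):
    """Indices where the averaged direction before and after differs sharply.

    points: list of (x, y) tuples; n and threshold_deg are tunable guesses.
    """
    def mean_direction(first, last):
        # Sum of unit vectors for the segments starting at first..last-1.
        sx = sy = 0.0
        for i in range(first, last):
            dx = points[i + 1][0] - points[i][0]
            dy = points[i + 1][1] - points[i][1]
            length = math.hypot(dx, dy)
            if length > 0:
                sx += dx / length
                sy += dy / length
        return sx, sy

    def turn_at(i):
        bx, by = mean_direction(i - n, i)  # N previous segments
        ax, ay = mean_direction(i, i + n)  # N next segments
        if (bx == 0 and by == 0) or (ax == 0 and ay == 0):
            return 0.0  # directions cancel out: nothing to compare
        diff = abs(math.degrees(math.atan2(ay, ax) - math.atan2(by, bx)))
        return 360.0 - diff if diff > 180.0 else diff

    breaks, run = [], []
    # Start N points in and stop N points before the end.
    for i in range(n, len(points) - n):
        if turn_at(i) > threshold_deg:
            run.append(i)
        elif run:
            breaks.append(run[len(run) // 2])  # middle of the cluster
            run = []
    if run:
        breaks.append(run[len(run) // 2])
    return breaks
```

This version recomputes each average from scratch for readability; a running mean, as suggested above, would avoid the repeated work.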

Staffan E
I definitely like your idea. I know for a fact that I will be able to recognize lines that have a corner between 10 and 170 degrees. Pretty much, from a human's standpoint, if they can distinguish that there is some corner (whether or not the lines are smooth), then I need to be able to distinguish separate line segments. As for a good N, I guess I will just have to experiment? Currently my specification/design provides a list of points with exactly 64 entries. Also, does it matter which points I use for the slope? I.e., can I use the 1st and 5th points to take a slope?
Chris
Yes, N would have to be picked by trial and error. If it's too small you will pick up small jiggles as separate lines and miss curved corners that should have been broken off. If it's too large on the other hand you might break up smoothly curved lines and miss localized corners like the middle one in a curly brace ({). Keep it tweakable and you'll surely find some suitable setting. As for slope I would use only the adjacent points. (1st, 2nd), (2nd, 3rd), etc. By using the N-average the slopes farther away from the inspection point will naturally be weighed into the result.
Staffan E
+1  A: 

If you work with absolute angles, like upwards and downwards, you can simply take the absolute slope between two points (not necessarily adjacent) to determine whether it's RIGHT, LEFT, UP, or DOWN (if that is enough of a distinction).

The art is to find a distance between points such that the angle is not random (with 1px, the angle will be a multiple of 45°).
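To illustrate the spacing point, here is a small sketch that drops points closer than a minimum distance to the last kept point and then measures the angle of each remaining step. The names `thin_points` and `step_angles` and the 15 pixel default are assumptions for illustration only.

```python
import math

def thin_points(points, min_dist=15.0):
    """Keep only points at least min_dist pixels from the last kept point."""
    if not points:
        return []
    kept = [points[0]]
    for p in points[1:]:
        if math.hypot(p[0] - kept[-1][0], p[1] - kept[-1][1]) >= min_dist:
            kept.append(p)
    return kept

def step_angles(points):
    """Angle of each step in degrees (0 = +x; 90 = +y, i.e. down on screen)."""
    return [
        math.degrees(math.atan2(b[1] - a[1], b[0] - a[0]))
        for a, b in zip(points, points[1:])
    ]

# A wobbly downward stroke sampled every pixel: x flips between 0 and 1.
wobbly = [(i % 2, i) for i in range(41)]
raw = step_angles(wobbly)                   # only 45 and 135 degree steps
spaced = step_angles(thin_points(wobbly))   # steps close to 90 (straight down)
```

With adjacent pixels every step looks diagonal, while the spaced-out points all land within a few degrees of straight down. Picking `min_dist` too large would hide real corners, so it needs tuning.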

There is a Firefox plugin for navigation using mouse gestures that works very well. I think it's FireGestures, but I'm not sure. I guess you can get some inspiration from that one.

Additional thought: If you draw a shape by connecting successive points, then connecting back to the first point, the ratio between the area and the final line segment's length is also an indicator of the gesture's "edginess".
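A minimal sketch of how that ratio could be computed, using the shoelace formula for the area of the closed polygon. The name `closure_ratio` is hypothetical, and which values should count as "edgy" is left open here, so this only produces the number.

```python
import math

def closure_ratio(points):
    """Enclosed area divided by the length of the closing segment.

    points: list of (x, y) tuples (assumed format). The polygon is closed by
    joining the last point back to the first. For a self-intersecting stroke
    the shoelace formula yields a net signed area, so treat the result as a
    rough indicator only.
    """
    if len(points) < 3:
        return 0.0  # fewer than three points enclose no area
    area2 = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        area2 += x0 * y1 - x1 * y0  # shoelace cross term
    area = abs(area2) / 2.0
    closing = math.hypot(points[0][0] - points[-1][0],
                         points[0][1] - points[-1][1])
    # A stroke that ends where it began has no closing segment.
    return area / closing if closing > 0 else float("inf")
```

A perfectly straight stroke gives 0, while a stroke with a corner encloses some area and gives a positive value.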

Silly Freak
My angles aren't in fact absolute, as I can test for and recognize a diagonal line gesture. I think that's what you're asking? I don't think I entirely understand your post when you say the art is to find a distance between points so that the angle is not random. I'll look into the Firefox plugin as well. Thanks again! :D
Chris
I meant that it doesn't matter what rotation the gesture has. You just have to know whether a line goes down, not whether the overall gesture forms some complex shape like a half circle, which could be the upper or lower half. My second statement was that if distances are too small, the user can't precisely control which direction they go. E.g. if you draw a line straight down, you will be a few pixels off. If you treat every two points as an individual line, those where you went wrong will be 45 degrees off. But if your line segments are too long, you don't see edges. Your job is to choose balanced distances.
Silly Freak
+1  A: 

If you are just interested in up/down/left/right, a first approximation is to check 45 degree segments of a circle. This is easily done by checking the horizontal difference between (successive) points against the vertical difference between points.

Say you have a greater positive horizontal difference than vertical difference, then that would be 'RIGHT'.

The only difficulty then comes, for example, in distinguishing UP/DOWN from UP/RIGHT/DOWN. But this could be done using distances between points. If you determine that the mouse has moved RIGHT for less than, say, 20 pixels, then you can ignore that movement.
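As a sketch of this approach: classify each step by comparing the horizontal and vertical differences, add up the distance travelled in each run of equal directions, and drop runs shorter than a minimum length. The function name `classify_moves` and the `min_run` parameter are assumptions; the 20 pixel default simply mirrors the figure mentioned above.

```python
import math

def classify_moves(points, min_run=20.0):
    """Reduce a stroke to a list of directions such as ['DOWN', 'UP'].

    points: list of (x, y) tuples in screen coordinates (y grows downwards).
    min_run: runs shorter than this many pixels are ignored.
    """
    runs = []  # [direction, accumulated distance] per run of equal directions
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # repeated point carries no direction
        if abs(dx) > abs(dy):
            direction = "RIGHT" if dx > 0 else "LEFT"
        else:
            # Exact diagonals fall through to the vertical case here.
            direction = "DOWN" if dy > 0 else "UP"
        dist = math.hypot(dx, dy)
        if runs and runs[-1][0] == direction:
            runs[-1][1] += dist
        else:
            runs.append([direction, dist])

    result = []
    for direction, dist in runs:
        if dist < min_run:
            continue  # too short to count as a deliberate movement
        if not result or result[-1] != direction:
            result.append(direction)  # merge neighbours left after filtering
    return result
```

For example, `classify_moves([(0, 0), (0, 30), (0, 60), (0, 30), (0, 0)])` yields `['DOWN', 'UP']`, and a 10 pixel sideways drift between an upward and a downward stroke is dropped. Very closely spaced, noisy points would break the stroke into many short runs, so this works best on points that are reasonably far apart.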

Jonathan Swift