views:

229

answers:

7

How can I encrypt/decrypt with fuzzy tolerance?

I want to be able to use a Stroke on an InkCanvas as key for my encryption but when decrypting again the user should not have to draw the exact same symbol, only similar. Can this be done in .NET C#?

--- Update (9 sep) ---

What I ideally want is an encryption algorithm that would accept any key in a certain range of keys based on some base-key and a function defining the allowed differences ..

Im doing all encryption/decryption locally so I wont need to send anything over a wire safely. And I dont want to store the key used for encryption, so I wont have anything to compare with. I could come up with some method to generate the same key for every similar stroke but its not easy if a want to accept any kind of symbol (not only letters). The other option is if the encryption key somehow could accept similar keys by design, which I dont know if its possible...?

A: 

Use some form of OCR (optical character recognition) to convert the Stroke into regular text, and then use that text as the key. As long as the user draws anything that gets OCR-ed into the exact same text, they'll be able to decrypt again.

MusiGenesis
Its a good idea but I also want to use any kind of symbol, not only known letters.
Andreas Zita
A: 

If you encode your strokes as images, there are fuzzy algorithms for detecting similarity between images. Of course, if you were to use such an approach, you should use a 2-way encryption method rather than a 1-way hash so that the original image is retrievable.

kbrimington
Thanks, I considered that... but then I would need to store the key used for encryption as well...
Andreas Zita
+1  A: 

There are a number of encryption schemes that allow fuzzy secrets. Frequently, these schemes are developed to protect secrets with biometric information (i.e. fingerprints, retina scans), but the underlying schemes are more generally applicable. One example for such a scheme is a fuzzy vault scheme proposed by Juels and Sudan.

Accipitridae
Thanks ... I will check into it.
Andreas Zita
DomQ
@DomQ, I'm aware of this paper. It is more about potential attacks than concrete implementations. The most practical idea says, that if you use the same biometric info (e.g. since it is hard to change your fingerprints) then the scheme might eventually leak your secrets. But here this is less of a concern. All a user has to do is to use different pen strokes, whenever they generate a new key.
Accipitridae
A: 

The problem here is that the key that is used for encryption (the Stroke or something derived from the Stroke) must be sent along with the encrypted message. Otherwise, there is no way for the decryption protocol to compare the decryption stroke with the original.

So suppose Alice wants to encrypt some message M. She is the only person who should decrypt the message, so Bob is not in the picture. Alice generates a fuzzy encryption key Ke and encrypts M to become Me: Encrypt(M,Ke) = Me. The message that is sent is (Ke,Me). On the receiving side, Alice produces fuzzy decryption key Kd. Some algorithm checks that Ke ~ Kd and Me is decrypted: Decrypt(Me,Ke) = M. Note that this is symmetric encryption; Kd is only used to check that it is 'sufficiently equal' to Ke.

Of course the problem is that Ke is sent in cleartext along with the message so Eve can simply take the key and decrypt Me. However, Ke is needed on the receiving side to compare it against Kd. So how do we send Ke along with the message without Eve being able to eavesdrop? We could create a hash of Ke and send that over the line: (Hash(Ke),Me). However, on the receiving side there would be no way to verify 'sufficient equality' between Ke and Kd based on Hash(Ke).

We could device some algorithm that generates a value based on Ke such that if Ke ~ Kd -> V(Ke) ~ V(Kd) (if Ke and Kd are similar then the generated values are similar). We send the message (V(Ke),Me) to the receiver. However, it would now be relatively easy for Eve to determine Ke based on V(Ke). She starts with a random candidate key: KeC and using our algorithm, determines a V(KeC). If it doesn't look at all like the V(Ke) in the message, she makes some drastic changes to the candidate KeC and tries again. As she comes closer to the message's V(Ke) she makes smaller changes to KeC, etc.

So it is impossible to create a secure encryption scheme if we allow Ke to be sent along with the message. This means that Ke has to be given to Trent, the trusted third party. In this case, Trent can be the application's database. So now the scheme becomes as follows:

Alice generates Ke, a message M and a unique id Id. Ke is stored with Trent, our database, along with Id. M is encrypted using a regular encryption scheme that uses Ke as key: Me = Encrypt(M,Ke). The message sent to the receiver is (Me,Id).

On the receiving side, Alice receives the message (Me,Id). Alice generates Kd. Based on Id, we get the corresponding Ke from Trent and compare it with Kd. If there is a match, we decrypt Me: M = Decrypt(Me,Ke).

The only problem now is when you have an intruder Mallory with access to Trent. He can ask Trent for Ke values based on random id's. To prevent this, you shouldn't include Id into the message so that the message simply becomes (Me). Now you would have to come up with a strategy to get candidate Ke's from Trent only by using Kd. This is of course possible because you can compare Kd with all Ke's in the database, return the most 'similar' Ke and try that as the decryption key. This strategy assumes that each person's Stroke (or Ke) is sufficiently different.

The strategy above is borrowed from biometric encryption where you store biometric data in a database and use that to either identify or authenticate individuals. Try searching Google for biometric encryption to get some additional info.

Ronald Wildenberg
You misunderstand. There is no plan to send Ke.Ke is simply the pen stroke equivalent of a password. Think of password-based encryption. To send messages to recipients you would use public key cryptography, and then the password analogy would extend to protecting the private key with a pen stroke-based key. You are correct in that is very similar to biometrics.
GregS
But the point is that the original pen stroke `Ke` must be known on the receiving side. Otherwise you can never determine whether the pen stroke on the receiving side is similar to it. You can accomplish this either by sending it along (which is a bad idea, as I point out) or by storing it in a database.
Ronald Wildenberg
@Ronald Wildenberg: The idea is to have some function that generates an encryption key from a pen stroke, such that a sufficiently similar pen stroke will generate an identical key.
caf
@caf: I understand this is a possibility to solve the problem but I think it's a bad idea. If you translate a pen stroke (which contains a lot of information) to something that contains much less information, only to make it possible that the receiver can easily generate a similar pen stroke, your scheme is very susceptible to eavesdropping because it becomes very easy to generate similar a decryption key simply by drawing a somewhat similar pen stroke. By definition you loose a lot of information that can identify the original creator of the pen stroke.
Ronald Wildenberg
That's possible, but it's by no means certain. The crucial element is estimating how much entropy remains after the "fuzziness" algorithm has been applied. It's an active area of research.
caf
That's true, the more entropy remains, the better-protected the message will be. Can you give some pointers to this research?
Ronald Wildenberg
I wont be sending anything over a wire. What I think I need is some way of translating similar stroke(s) into the same key every time. At least that would be the most straightforward approach. Regular text could be OCR:ed as suggested by MusiGenesis but I want to use any stroke, like any random symbol, and to convert that into something meaningful I would need a library of all possible symbols, which is not possible, unless I parse the symbol into meaningful bits such as small fragments with different kind of bending. But the smaller they get the harder it gets to draw the same symbol.
Andreas Zita
I dont want to store the key to compare with because that would be a security risk even locally(?). I would be nice if the decryption algorithm somehow could accept a range of similar keys by design. Any key in a certain range is acceptable. I cant compare the entered key with the key used for encryption because I dont want to store the key anywhere.
Andreas Zita
@Andreas - There's definitely a security risk when you store the keys with a trusted third party. I'm not sure my answer is the best possible solution but at least it's doable. As suggested by caf, there's still a lot of research to 'fuzzy identity-based encryption' (search http://scholar.google.com for this term). There's no ready-made .NET solution yet for this type of encryption.
Ronald Wildenberg
A: 

As I pointed out in a comment, I don't know that fuzzy encryption is a mature topic, so let's think laterally. Since you are doing the encrypt/decrypt locally, how about a hardware implementation? Ie you buy a tablet computer to do the biometrics, and stronghold a conventional encryption key inside it. It would go like this:

  1. Tablet computer is set in kiosk mode, that is, it has basic hardening to prevent the UI from switching applications. It is also protected from physical theft and tampering (eg it is stored in a vault)
  2. User plugs in a USB key containing the encrypted file.
  3. Tablet computer detects which drawing to expect from a metadata field (for instance from the X509 certificate used for encryption, if you don't mind about the overengineering).
  4. Tablet computer authenticates user through appropriate biometric magic.
  5. Tablet computer decrypts file and offers to store it or browse it on-screen.

Note that step 5 is the weak point of the security chain, as attacks against document viewers (eg Acrobat Reader) are becoming more prevalent. Better use some kind of sandboxing scheme such as a VMWare instance.

DomQ
+2  A: 

OK. Let's break your problem into two.

1) Fuzzy 2) Encryption

Reality is both these concepts are relatively old and their implementation have been out there for years. Each deals with the problem at hand very well but this does not mean that combining these two is a good idea. I believe you have to have your solution as a two-stage approach.

First of all encryption standards out there are great in securing the data using a SINGLE EXACT key. In your case you need symmetric encryption algos such as AES or Rijndael.

Fuzzy part of the solution is also not that hard. Like any other fuzzy recognition technique, you need to do a feature extraction and create a vector to be passed to the encryption algo. You need to build fuzziness into your features. For example, number of strokes, quadrant of the start point for each stroke, a factor of curviness for each stroke and the like. This will be enough to build a 32 bit vector to pass to the encryption algorithm.

UPDATE

I will try to make it more illustrative:

2 bits for number of strokes: 1, 2, 3, +3 which translates to 00, 01, 10 and 11

2 bits for quadrant of the start of the first stroke: TopLeft, TopRight, BottomLeft, BottomRightt encodes to 00, 01, 10 and 11

2 bits for quadrant of the end of the first stroke: ditto

2 bits for quadrant of the start of the second stroke: ditto. If no second stroke then 00.

2 bits for quadrant of the end of the second stroke: ditto. If no second stroke then 00.

2 bits for quadrant of the start of the third stroke: ditto. If no third stroke then 00.

2 bits for quadrant of the end of the second stroke: ditto. If no third stroke then 00.

2 bits for curviness of the first stroke: straight->00 ... Nice round->11. This is not going to be very easy and you might reduce the degrees of curviness to 2 and use just one bit but it is a "suck it and see".

So this is 16 bits. You can leave the rest as zero for now and try and see how it works.

Hope this helps.

Aliostad
Yes thats probably the method I will have to use. It may be hard to design this alg so that the final key only has visually similar shapes as input, and not also some very different shapes that just happens to match the algorithm in some unforseen way. It would be a lot cooler if the encryption-alg could "record" accepted deviations in the key and accept similar keys by design, but that just me dreaming... Thanks everyone for you input!
Andreas Zita
@Andreas Zita. I hope you realize that a 32 bit key is just a toy. If you want a real key (e.g. 128 bits) then you either have to use an unrealistically large number of strokes or use a finer grid. When using a finer grid, it becomes more likely that the user will make small errors. Correcting such errors is not easy but possible with schemes already proposed in the literature. Hence I hope, you are going to used a proposed scheme instead of inventing your own.
Accipitridae
A: 

One simple way to do it is to instead encode lots of information about the key, and then figure out how accurately it matches. Example information, assuming you can convert the user's input to some approximation of lines/arcs/points:

Number of straight lines
Number of enclosed regions
Amount of drawing used.

Standard deviation of points on the x axis.
Standard deviation of points on the y axis.
Standard deviation of size of shapes drawn.

Sorted list of areas of the enclosed regions.

Etc.

Thus, instead of representing the image you are representing properties of the image. The fuzziness will be that you give each of these properties an accuracy score, add or multiply or otherwise combine the total accuracy level, and accept the password if it passes some threshold.

Brian
If I had a key to compare with I could do this. I have tested a method of normalizing two strokes to compare and then check the angle deviation of each segment. If all segments has a deviation less then ~45 deg then its a ok match. But now I dont want to store the key since its a huge security risk, so I wont have anything to compare with. An other option is if a can do this shape analyzing to create an exact representation for each accepted variation, similar to regular OCR-recognition, but then I would have to have a table of all possible shapes or at least all possible parts/segments...
Andreas Zita
@Andreas: You are not storing the key. You are storing these specific statistics. When I say standard deviation of points, I mean internally within the key.
Brian