views:

1259

answers:

4

I'm writing a Java app that is accepting URL parameter values that may or may not be encoded. I need an easy way to tell whether or not I need to encode the parameter string.

In other words, I want a function boolean needsEncoding(String param), which will return true if I pass in the String "foo@test.com", and false if I pass in "foo%40test.com". The problem is that this is ambiguous: how would I know whether or not the "%" sign in the latter string should itself be encoded? One way to handle this is to modify my contract - require clients to pass in un-encoded strings so that I know I always need to encode them. Thoughts?

+5  A: 

I thought I'd put this as a proposed answer so that people can vote:

One way to handle this is to modify my contract - require clients to pass in un-encoded strings so that I know I always need to encode them.
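
To make the changed contract concrete, here is a minimal sketch. It assumes callers always hand over the raw value and that UTF-8 is the desired charset; the class and method names (ParamEncoder, encodeParam) are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ParamEncoder {
    // Hypothetical helper. Contract: rawParam is always the un-encoded value,
    // so it is encoded unconditionally and nothing has to be guessed.
    public static String encodeParam(String rawParam) {
        try {
            return URLEncoder.encode(rawParam, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Under this contract a literal "%" is simply escaped like any other character, so "foo%40test.com" is treated as raw text rather than as something already encoded.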

Julie
I voted your answer up. Changing the contract makes sense.
anjanb
It is always best not to guess.
Rontologist
There is no way to detect whether the user wants foo%40test.com or foo@test.com when they pass in foo%40test.com. Change the contract.
Horcrux7
A: 

Signs a string has been URL Encoded:

  1. There are no spaces, but a lot of plus symbols.
  2. All percentage signs are followed by two hexadecimal digits.
  3. There are no characters outside of a..z, A..Z, 0..9, ".", "_", "-", "*", "%" and "+" in it.

However I think that changing the contract is the recommended action here.
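
For what it's worth, the signs above could be checked with a single regular expression. This is only a sketch of the heuristic, not a reliable detector: it assumes the safe set that java.net.URLEncoder leaves untouched (letters, digits, ".", "-", "*", "_"), and the names EncodingHeuristic and looksUrlEncoded are invented here.

```java
import java.util.regex.Pattern;

public class EncodingHeuristic {
    // Matches strings made only of safe characters, '+', and well-formed
    // %XX escapes (XX = two hex digits). A space or a stray '%' fails.
    private static final Pattern ENCODED_SHAPE =
            Pattern.compile("(?:[A-Za-z0-9._*+-]|%[0-9A-Fa-f]{2})*");

    // Hypothetical helper: true means "could be the output of URL encoding",
    // not "definitely was encoded".
    public static boolean looksUrlEncoded(String s) {
        return ENCODED_SHAPE.matcher(s).matches();
    }
}
```

Note that a raw string such as "1+2" also passes this check, which is exactly the ambiguity that makes changing the contract the safer choice.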

JeeBee
Thanks for the input. Looking at all the ugly (and probably unreliable) code I'd have to write to figure this out, I think you're right that it's better to change the contract!
Julie
A: 

How about decoding the string and checking whether all the differences between the original and the decoded string are valid URL escape sequences?

Sijin
That might work. Could you flesh it out a little more? If I decoded "test%40geek.com", I would get "test@geek.com" - how would I compare the two? Do you have a code snippet in mind?
Julie
A simple n^2 diff algorithm should work: use two pointers into the strings and compare the characters they point to. If they match, advance both; otherwise advance the pointer in the longer string and save the character it just pointed to in a buffer. When the characters match again, record the buffer as a difference if it isn't empty.
rcreswick
Sounds like a dynamic programming approach to the problem. So, if I pass in "1+2" and determine that "+" was decoded to " ", what next? I still don't know if this is a pre-encoded string, or if I should really encode "+" to "%2B". I think the problem is intractable.
Julie
A: 

You could use java.net.URLDecoder on the input and see if it changes by comparing the input and output String values. The Javadoc for URLDecoder describes the logic it applies to an input String when decoding it.

If you MUST get a boolean result and do not want to incur the overhead of attempted decoding to get that boolean result, you can always crack open the source code of the URLDecoder class and use the same business logic it uses to determine if URL decoding is necessary.
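
A rough sketch of the decode-and-compare idea, assuming UTF-8; the class and method names (DecodeCheck, changesWhenDecoded) are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeCheck {
    // Hypothetical helper: true if URL-decoding the input alters it.
    public static boolean changesWhenDecoded(String input) {
        try {
            return !URLDecoder.decode(input, "UTF-8").equals(input);
        } catch (IllegalArgumentException e) {
            // Malformed escape (for example a trailing "%"): the input
            // cannot be a validly encoded string.
            return false;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

This only reports that decoding would change the string. It cannot tell whether the caller meant the input to be decoded, so a raw "1+2" still comes back as true.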

shadit
Looking at the javadoc, if I passed in "1+2" to my boolean method, URLDecoder would return "1 2". I still can't tell if the client means that the string "1+2" should be encoded, or if it's just "1 2" that has already been encoded.
Julie