Background

I use the jQuery URL parser plugin by Mark Perkins for extracting query string values from the current URL.

The parsing process fails when a query string value contains the '@' character, most notably when the query string holds an email address. This applies to the latest version of the plugin, taken from the GitHub project page today.

Working and non-working examples

The parsing process populates the internal parsed.queryKey object with key:value pairs from the query string.

Two modes are offered: 'loose' and 'strict'. Both return the same result.
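As a rough illustration of what populating queryKey involves, the sketch below splits a query string into key:value pairs. The helper name `toQueryKey` and its regular expression are assumptions made for illustration, not the plugin's own code.

```javascript
// Hypothetical helper (not taken from the plugin): build a key:value object
// from a query string such as "a=1&b=2".
function toQueryKey(query) {
    var queryKey = {};
    // Each match is one "key=value" pair, starting at the string start or after '&'.
    query.replace(/(?:^|&)([^&=]*)=?([^&]*)/g, function (match, key, value) {
        if (key) {
            queryKey[key] = value;
        }
    });
    return queryKey;
}

console.log(toQueryKey("email=example.example.com")); // { email: 'example.example.com' }
console.log(toQueryKey(""));                          // {}
```

An empty query string produces an empty object here, so an empty queryKey is what you would expect to see if the query string never reaches this step.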

// Parse URL that works
jQuery.url.setUrl("http://example.com/?email=example.example.com");

// Examine result
parsed.queryKey = {
    'email':'example.example.com'
}


// Parse URL that fails
jQuery.url.setUrl("http://example.com/?email=example@example.com");

// Examine result
parsed.queryKey = {
}

Problem

I'd like to modify one (or both) of the regular expressions so that query string parsing no longer fails when an '@' is present.

The parser uses regular expressions to extract information from the URL. These are defined on (what is currently) line 27:

parser: {
    strict: /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/, //less intuitive, more accurate to the specs
    loose: /^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/ // more intuitive, fails on relative paths and deviates from specs
}

I don't sufficiently understand the workings of these regular expressions to be able to make the required modifications.

How can I modify the regular expressions to allow the parsing process to work when there is an '@' present in the query string?
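For reference, the failure can be reproduced outside the plugin by running the 'strict' expression quoted above directly against both kinds of URL. This is only a diagnostic sketch: the `names` labels, the `dissect` helper and the example URLs are assumptions chosen for readability, not part of the plugin.

```javascript
// The 'strict' expression, copied verbatim from the parser definition quoted above.
var strict = /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/;

// Assumed labels for capture groups 0..13, in the order the groups appear.
var names = ["source", "protocol", "authority", "userInfo", "user", "password",
             "host", "port", "relative", "path", "directory", "file", "query", "anchor"];

// Hypothetical helper: run the expression and label each capture ("" if unmatched).
function dissect(url) {
    var m = strict.exec(url);
    var out = {};
    for (var i = 0; i < names.length; i++) {
        out[names[i]] = m[i] === undefined ? "" : m[i];
    }
    return out;
}

var ok = dissect("http://example.com/?email=example.example.com");
var bad = dissect("http://example.com/?email=example@example.com");

console.log(ok.host);      // example.com
console.log(ok.query);     // email=example.example.com
console.log(bad.userInfo); // example.com/?email=example
console.log(bad.host);     // example.com
console.log(bad.query);    // (empty string)
```

In this run, everything before the '@' is captured as user info, because the `[^:@]*` classes in that part of the expression also accept '/' and '?'. The query group is therefore left empty.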

A: 

Update:

Using Regex Coach I stepped through the expression and came up with this suggested version:

^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)

another attempt:

^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:]*):?([^:]*))?)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)
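Both attempts above adjust the user-info part of the expression. Another variation on the same idea, offered only as a sketch and not checked against the plugin itself, is to stop the user and password classes from crossing '/', '?' or '#' by changing `[^:@]*` to `[^:@\/?#]*` in both modes. The names `strictFixed`, `looseFixed` and `hostAndQuery` are hypothetical.

```javascript
// Hypothetical variants of the question's expressions: the only change is that
// the user and password classes [^:@]* become [^:@\/?#]*.
var strictFixed = /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@\/?#]*):?([^:@\/?#]*))?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/;
var looseFixed = /^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:@\/?#]*):?([^:@\/?#]*))?@)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/;

// Hypothetical helper: return the host (group 6) and query (group 12) captures.
function hostAndQuery(re, url) {
    var m = re.exec(url);
    return { host: m[6] || "", query: m[12] || "" };
}

var url = "http://example.com/?email=example@example.com";
console.log(hostAndQuery(strictFixed, url)); // { host: 'example.com', query: 'email=example@example.com' }
console.log(hostAndQuery(looseFixed, url));  // { host: 'example.com', query: 'email=example@example.com' }

// A URL with real credentials should still parse its host and query.
var withLogin = hostAndQuery(strictFixed, "http://user:pass@example.com/?a=b");
console.log(withLogin); // { host: 'example.com', query: 'a=b' }
```

This covers only the URLs shown. Whether it behaves correctly for every URL the plugin has to handle would need testing.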

Maybe this RegEx can be of use to you:

(?<protocol>(http|ftp|https|ftps):\/\/)?(?<site>[\w\-_\.]+\.(?<tld>([0-9]{1,3})|([a-zA-Z]{2,3})|(aero|arpa|asia|coop|info|jobs|mobi|museum|name|travel))+(?<port>:[0-9]+)?\/?)((?<resource>[\w\-\.,@^%:/~\+#]*[\w\-\@^%/~\+#])(?<queryString>(\?[a-zA-Z0-9\[\]\-\._+%\$#\~',/]*=[a-zA-Z0-9\[\]\-\._+%\$#\~',/]*)+(&[a-zA-Z0-9\[\]\-\._+%\$#\~',/]*=[a-zA-Z0-9\[\]\-\._+%\$#\~',/]*)*)?)?
Brad
Perhaps this is a useful regex for parsing URLs, but is it a drop-in replacement that works in the context of the given jQuery plugin?
Jon Cram
@Jon, I doubt it's a drop-in replacement and I haven't used the plug-in you're referring to. I just wanted to drop you a suggestion that might be helpful even if it does take a bit more work.
Brad
@Jon, try the updates I made to my answer. I believe I edited the "loose" expression.
Brad
@Brad: Thanks for the suggestions. I'll give them a run through and see what they do!
Jon Cram
A: 

Use encodeURIComponent

var url = "http://example.com/?email=";
var email = encodeURIComponent("example@example.com");
jQuery.url.setUrl(url + email);

This will replace @ with %40.
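A quick check of the encoding step on its own, using an illustrative address (an assumption, any address containing '@' behaves the same way):

```javascript
var address = "example@example.com"; // illustrative value only
var encoded = encodeURIComponent(address);

console.log(encoded);                                 // example%40example.com
console.log(decodeURIComponent(encoded) === address); // true
console.log("http://example.com/?email=" + encoded);  // http://example.com/?email=example%40example.com
```

The encoded URL contains no literal '@', and `decodeURIComponent` recovers the original value after parsing.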

enjoy!

gilly3
Nice suggestion, but I can't guarantee correctly-encoded query string values (users can mess with the URL!). Secondly, validly encoding an @ still fails.
Jon Cram
I don't follow. Can you explain what you are trying to do? Maybe you need `decodeURIComponent` instead?
gilly3