ansaurus

Question

Answer 1

A:

Replace every occurrence of * in the pattern with [^ ]*, which matches a sequence of zero or more non-space characters.

Thus http://*google.com/* will become http://[^ ]*google.com/[^ ]*

Here is the JavaScript that builds the regular expression source from the pattern:

regex = urlPattern.replace(/\*/g, "[^ ]*");
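
Note that the one-line replacement leaves other regex metacharacters in the pattern untouched, so the . in google.com would match any character. A minimal sketch that escapes them first and anchors the result might look like this; wildcardToRegExp is a hypothetical helper name, not part of the original answer:

```javascript
// Hypothetical helper (an assumption, not from the original answer):
// converts a wildcard URL pattern into an anchored RegExp.
function wildcardToRegExp(urlPattern) {
  // Escape every regex metacharacter except "*", so that "." in
  // "google.com" matches only a literal dot.
  var escaped = urlPattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  // Expand each "*" into "zero or more non-space characters".
  var source = escaped.replace(/\*/g, "[^ ]*");
  // Anchor both ends so the whole URL has to match the pattern.
  return new RegExp("^" + source + "$");
}

var re = wildcardToRegExp("http://*google.com/*");
console.log(re.test("http://www.google.com/maps")); // true
console.log(re.test("http://www.googleXcom/maps")); // false
```

The wildcard here can still match slashes, so it behaves like the original one-liner in every other respect.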

Amarghosh 2010-06-25 10:27:57

Answer 2

A:

Generating a regex is probably the right way, but it gets more complicated than simply replacing the asterisks.

For example, your pattern http://*google.com/* should not match http://www.malicioushacker.org/1337/google.com/maps.
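
One way to get that behaviour, as a rough sketch: treat the part of the pattern before the first / after the scheme as the host, and do not let a * in that part match /, ? or #. The function name wildcardToStrictRegExp and the splitting rule are assumptions made for illustration, not a complete URL matcher:

```javascript
// Hypothetical helper (an assumption for illustration): a "*" in the
// scheme/host part may not match "/", "?" or "#", so it cannot reach into
// the path or query of an unrelated site.
function wildcardToStrictRegExp(urlPattern) {
  var escape = function (s) {
    // Escape all regex metacharacters except "*".
    return s.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  };
  // Find the first "/" after the "://" separator; the path starts there.
  var schemeEnd = urlPattern.indexOf("://");
  var hostStart = schemeEnd === -1 ? 0 : schemeEnd + 3;
  var pathStart = urlPattern.indexOf("/", hostStart);
  if (pathStart === -1) pathStart = urlPattern.length;

  // Wildcards before the path stay inside the host.
  var prefix = escape(urlPattern.slice(0, pathStart)).replace(/\*/g, "[^ /?#]*");
  // Wildcards in the path may match anything except spaces.
  var path = escape(urlPattern.slice(pathStart)).replace(/\*/g, "[^ ]*");
  return new RegExp("^" + prefix + path + "$");
}

var strict = wildcardToStrictRegExp("http://*google.com/*");
console.log(strict.test("http://www.google.com/maps")); // true
console.log(strict.test("http://www.malicioushacker.org/1337/google.com/maps")); // false
```

This still accepts hosts such as evilgoogle.com, because *google.com places no dot before google; a pattern like http://*.google.com/* would be needed to rule that out.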

Sjoerd 2010-06-25 10:30:39

Answer 3

A:

If you want to see a well-tested library for extracting the parts of a URI, check out the goog.uri.utils methods in Google's Closure Library.

http://closure-library.googlecode.com/svn/docs/closure_goog_uri_utils.js.source.html#line220

Here's the regex that does the heavy lifting:

goog.uri.utils.splitRe_ = new RegExp(
    '^' +
    '(?:' +
      '([^:/?#.]+)' +                     // scheme - ignore special characters
                                          // used by other URL parts such as :,
                                          // ?, /, #, and .
    ':)?' +
    '(?://' +
      '(?:([^/?#]*)@)?' +                 // userInfo
      '([\\w\\d\\-\\u0100-\\uffff.%]*)' + // domain - restrict to letters,
                                          // digits, dashes, dots, percent
                                          // escapes, and unicode characters.
      '(?::([0-9]+))?' +                  // port
    ')?' +
    '([^?#]+)?' +                         // path
    '(?:\\?([^#]*))?' +                   // query
    '(?:#(.*))?' +                        // fragment
    '$');
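
To show how the capture groups line up, here is a small standalone sketch that uses the same expression without the Closure Library. The name splitRe and the sample URL are assumptions for illustration; the group order follows the comments in the expression above:

```javascript
// Same expression as above, condensed and stored under a local,
// hypothetical name so it can run without the Closure Library.
var splitRe = new RegExp(
    '^(?:([^:/?#.]+):)?' +                 // 1: scheme
    '(?://(?:([^/?#]*)@)?' +               // 2: userInfo
    '([\\w\\d\\-\\u0100-\\uffff.%]*)' +    // 3: domain
    '(?::([0-9]+))?)?' +                   // 4: port
    '([^?#]+)?' +                          // 5: path
    '(?:\\?([^#]*))?' +                    // 6: query
    '(?:#(.*))?$');                        // 7: fragment

var parts = "http://www.google.com:80/maps?q=x#top".match(splitRe);
console.log(parts[1]); // "http"
console.log(parts[3]); // "www.google.com"
console.log(parts[4]); // "80"
console.log(parts[5]); // "/maps"
console.log(parts[6]); // "q=x"
console.log(parts[7]); // "top"
```

With the URL split this way, a wildcard pattern can be checked against the domain (group 3) and the path (group 5) separately, rather than against the whole string.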

Alex M. 2010-06-25 10:32:56


Matching URL with wildcards
