I have a string in my QTP test project. In some cases this string holds the content of a plaintext e-mail; in other cases it's HTML. In both cases, I need to strip all URLs from the string so that I can match it against an expected case.

How can this be done in QTP/VBScript?

A: 

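This should do the trick, though your URLs will need to begin with a scheme such as http:// or https:// for them to be picked up (the pattern accepts any scheme of 3 to 9 letters followed by ://, so a bare www.example.com will not match):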

' The text to search; substitute the e-mail body under test
Dim text
text = "<your text with URLs here>"

Dim rgx
Set rgx = New RegExp
rgx.IgnoreCase = True   ' match regardless of letter case
rgx.Global = True       ' find every URL, not just the first one
rgx.Pattern = "([A-Za-z]{3,9})://([-;:&=\+\$,\w]+@{1})?([-A-Za-z0-9\.]+)+:?(\d+)?((/[-\+~%/\.\w]+)?\??([-\+=&;%@\.\w]+)?#?([\w]+)?)?"

' Show each URL found in the text
Dim match, matches
Set matches = rgx.Execute(text)
For Each match In matches
  MsgBox match.Value, 0, "Found Match"
Next

The regex pattern for matching URLs comes from Chris Freyer's blog and seems to handle most types of URL you're likely to encounter. It worked well on the tests I performed with it.
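The loop above only reports the URLs it finds. To actually strip them, as the question asks, one option is the RegExp object's Replace method with an empty replacement string. The following is a minimal sketch under that assumption; the function name StripUrls and the sample text are hypothetical, chosen only for illustration, and the pattern is the same one used above.

```vbscript
Option Explicit

' Hypothetical helper (not part of the original answer): returns the input
' text with every URL matched by the pattern removed.
Function StripUrls(ByVal text)
    Dim rgx
    Set rgx = New RegExp
    rgx.IgnoreCase = True
    rgx.Global = True   ' replace every match, not just the first
    rgx.Pattern = "([A-Za-z]{3,9})://([-;:&=\+\$,\w]+@{1})?([-A-Za-z0-9\.]+)+:?(\d+)?((/[-\+~%/\.\w]+)?\??([-\+=&;%@\.\w]+)?#?([\w]+)?)?"
    StripUrls = rgx.Replace(text, "")
End Function

' Sample input, made up for illustration
Dim actual
actual = StripUrls("Visit http://example.com/page?id=1 for details")
' actual is now "Visit  for details" (the spaces on either side of the URL remain)
```

Two things to watch for when comparing against the expected text: the whitespace around a removed URL is left in place, and because the host part of the pattern allows dots, a full stop directly after a URL (as in "see http://example.com.") is removed along with it.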

Xiaofu