I am having some trouble with this regular expression. Can somebody help me with the regex?

I want to match the following in the source of websites that have this line installed on their pages:

The pattern should always match this exact string (it is a constant):

<img src="http://www.domain.com/test.asp" width="1" height="1" />



htstring.match(/\<img src\=""http:\/\/www.domain.com\/test.asp"" width=""1"" height=""1"" \/>/ig);

My problem seems to be escaping the " in the regex.

Any help would be appreciated!

Thanks

A: 

If you're using .NET, you can escape the string:

var matchMe = "<img src=\"http://www.domain.com/test.asp\" width=\"1\" height=\"1\" />";
var pattern = Regex.Escape(matchMe);

It doesn't look like you're using .NET though. I don't think you have to escape quotes like that. In fact, in your pattern, the only characters I know you need to escape are the period (.) and the forward slash (/).
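JavaScript has had no long-standing built-in counterpart to Regex.Escape, so a small helper is the usual workaround. Below is a minimal sketch, assuming the page source is a plain string; the name escapeRegExp and the variable names are made up for illustration. Because the pattern is built with the RegExp constructor rather than a /.../ literal, the forward slashes need no escaping here.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter in a literal string.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const tag = '<img src="http://www.domain.com/test.asp" width="1" height="1" />';

// "i" = case-insensitive, "g" = find every occurrence.
const pattern = new RegExp(escapeRegExp(tag), "ig");

// Stand-in for the page source being searched.
const page = "<body>" + tag + "</body>";

// Array of matched strings, or null when nothing matches.
const matches = page.match(pattern);
```

With the periods escaped, a look-alike such as wwwXdomain.com no longer matches.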

Drew Noakes
+2  A: 

You don't need to escape them.

But you do need to escape the periods (.) with a backslash.
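To illustrate why the periods matter, here is a small JavaScript sketch (the variable names are only for illustration). An unescaped period still matches the real URL; it just also matches any other character in that position.

```javascript
const loose = /www.domain.com/;    // "." matches any single character
const strict = /www\.domain\.com/; // "\." matches only a literal period

const lookalike = "wwwXdomainYcom";

const looseHit = loose.test(lookalike);        // true: each "." matched a letter
const strictHit = strict.test(lookalike);      // false: no literal periods present
const realHit = strict.test("www.domain.com"); // true
```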

Andy Hume
Thanks Andy - You were 100% correct - The problem was the periods (.). Thanks a million!
Gerald Ferreira
A: 

You should only need to escape forward slashes and periods.

myRegex = /<img src="http:\/\/www\.domain\.com\/test\.asp" width="1" height="1" \/>/
Stefan Kendall
Thanks Stefan, your solution worked 100%.
Gerald Ferreira
+1  A: 

Exactly how a regexp behaves depends on which engine your language is using. Not all regexp engines are the same.

That said, it appears that you are escaping what should be the end of the matching regexp:

\/>/ig

should probably be

/>/ig

Also, you probably don't want the doubled double quotes, e.g. =""htt should be ="htt

There are regular expression testers available on the internet, one being at http://www.regular-expressions.info/javascriptexample.html
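One caveat if the language turns out to be JavaScript: in a /.../ regex literal the forward slash is the delimiter, so a literal slash inside the pattern does have to be written as \/. Whether that applies depends on the engine and on how the pattern is built. A rough sketch, with made-up variable names:

```javascript
const sample = '<img src="x" />';

// In a regex literal, "/" ends the pattern, so a literal slash is written "\/".
const fromLiteral = /\/>/i;

// With the RegExp constructor the pattern is a string, so "/" needs no escape.
const fromString = new RegExp("/>", "i");

const literalHit = fromLiteral.test(sample); // true
const stringHit = fromString.test(sample);   // true
```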

Tim
+1  A: 

Hi Gerald,

If the string is a constant, you don't need a regex. I don't see anything "regexy" in your pattern; it is nothing but the constant string, so a plain string would be easiest.

Also, what programming language are you using? From the syntax, I guessed it was Ruby - but that's only a guess, so the syntax below may not work for you.

htstring.match('<img src="http://www.domain.com/test.asp" width="1" height="1" />')
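If the language is JavaScript rather than Ruby, the same idea could look roughly like the sketch below. It assumes the page source is already available as a string; the html variable here is only a stand-in for it.

```javascript
const tag = '<img src="http://www.domain.com/test.asp" width="1" height="1" />';

// Stand-in for the page source being checked.
const html = "<html><body>" + tag + "</body></html>";

// Plain substring search: no regex, so nothing needs escaping.
const found = html.includes(tag);

// Rough equivalent of the /i flag: compare lowercased copies.
const foundIgnoringCase = html.toLowerCase().includes(tag.toLowerCase());
```

On older engines without String.prototype.includes, html.indexOf(tag) !== -1 does the same job.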
ckeh
+1  A: 

Regexes are useful when you're trying to match variations, for example if your tag were constant except for the domain in the "src" attribute or the whitespace. Stefan and Andy are exactly correct, but the (working) regex you now have is still no different from the string literal in my answer above.

So both the regex and the string are equivalent, and both match:

'<img src="http://www.domain.com/test.asp" width="1" height="1" />'.match(/<img src="http:\/\/www\.domain\.com\/test\.asp" width="1" height="1" \/>/)
=> #<MatchData:0x5ebbf90>

vs

'<img src="http://www.domain.com/test.asp" width="1" height="1" />'.match('<img src="http://www.domain.com/test.asp" width="1" height="1" />')
=> #<MatchData:0x5eb6cac>

If you want to match subtle variations (for example, the whitespace isn't always exactly one space but sometimes two or three), then you need a regex, not a string. The current regex won't match those variations either, because it only does an exact match: it uses no regex features at all, so it might as well be a string. E.g., with 2 spaces after "img":

'<img  src="http://www.domain.com/test.asp" width="1" height="1" />'.match(/<img src="http:\/\/www\.domain\.com\/test\.asp" width="1" height="1" \/>/)
=> nil

But a regex that actually uses special regex characters will match. Note the "\s+" after "img", which matches one or more whitespace characters:

'<img  src="http://www.domain.com/test.asp" width="1" height="1" />'.match(/<img\s+src="http:\/\/www\.domain\.com\/test\.asp" width="1" height="1" \/>/)
=> #<MatchData:0x5e94fbc>
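For JavaScript, the same comparison could be sketched as follows. \s is supported there too. The flexible pattern below goes one step further than the Ruby example by allowing variable whitespace between every attribute, which is an assumption about what variations you might want to tolerate.

```javascript
// Exact pattern: only a single space is accepted between the parts.
const exact = /<img src="http:\/\/www\.domain\.com\/test\.asp" width="1" height="1" \/>/i;

// Flexible pattern: \s+ accepts one or more whitespace characters, \s* zero or more.
const flexible = /<img\s+src="http:\/\/www\.domain\.com\/test\.asp"\s+width="1"\s+height="1"\s*\/>/i;

const oneSpace = '<img src="http://www.domain.com/test.asp" width="1" height="1" />';
const twoSpaces = '<img  src="http://www.domain.com/test.asp" width="1" height="1" />';

const exactOnTwoSpaces = exact.test(twoSpaces);       // false
const flexibleOnTwoSpaces = flexible.test(twoSpaces); // true
const flexibleOnOneSpace = flexible.test(oneSpace);   // true
```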

Also, I might not have been explicit enough last time, but it's pretty important that you specify what language you're working in. Like Tim pointed out, regex can vary between languages, so an answer could be correct but not work for you depending on whether you're both using Ruby or C# or Java or whatever.

ckeh
Just noticed your comment that you're using JavaScript.
ckeh
Note: my examples are still in Ruby, but the basics apply to both regex flavors; JavaScript also supports \s for whitespace. :)
ckeh