Regular expression: <img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>

This works fine when 'src' is in lowercase, and it handles both single and double quotes. I would like this expression to return matches for the following test data:

1. <html><img src ="kk.gif" alt="text"/></html>
2. <html><img Src ="kk.gif" alt="text"/></html>
3. <html><img sRC ="kk.gif" alt="text"/></html> (any character in 'src' can be uppercase/lowercase)
4. <html><img SRC ="kk.gif" alt="text"/></html>
5. <html><img src ='kk.gif' alt="text"/></html>
+6  A: 

Create the pattern with the CASE_INSENSITIVE flag; see Pattern.compile(String, int). This affects the entire pattern, which means the img part becomes case-insensitive as well.

Or the cheap way, change src to [Ss][Rr][Cc]. This will just affect the src portion.
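As a rough sketch of both options, using the regex from the question: the class, field, and method names below (ImgSrcFlagDemo, WHOLE, SRC_ONLY, firstSrc) are made up for illustration, and the patterns are compiled once and reused.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImgSrcFlagDemo {
    // The regex from the question, escaped as a Java string literal.
    static final String REGEX = "<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>";

    // Option 1: the flag makes the whole pattern case-insensitive, "img" included.
    static final Pattern WHOLE = Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE);

    // Option 2: only "src" accepts either case; "img" must still be lowercase.
    static final Pattern SRC_ONLY = Pattern.compile(
            "<img[^>]+[Ss][Rr][Cc]\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>");

    // Returns the first captured src value, or null when nothing matches.
    static String firstSrc(Pattern p, String html) {
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<html><img sRC =\"kk.gif\" alt=\"text\"/></html>";
        System.out.println(firstSrc(WHOLE, html));
        System.out.println(firstSrc(SRC_ONLY, html));
    }
}
```

The difference shows up with an uppercase tag name: WHOLE also matches <IMG SRC="...">, while SRC_ONLY does not.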

lavinio
Thanks; just curious; do you mean efficient/faster when you say "cheap way" here?
Krishna Kumar
I meant it was the lowest-effort way.
lavinio
Compile it once and reuse it; that is more efficient/faster. :)
Ratnesh Maurya
A: 

Have a look here

You have to set case insensitivity when you compile the pattern.

Peter
+2  A: 

It seems to me that if you want to process HTML, the best way to go is to use a real HTML parser.

Although I am not familiar with Java, there seem to be quite a few to choose from: Open Source HTML Parsers in Java.

This will allow you to deal with cases like another attribute appearing before the src and containing the character '>', which is valid HTML, or the src attribute containing a quote, and probably a few other unlikely but possible tricks.

mirod
A: 

Off the top of my head:
You might replace the src with [S|s][R|r][C|c] if you only want case insensitivity applied to src.

KT

Karl T.
Without the actual |'s, of course. :)
lavinio
+1  A: 

You can make the expression case-insensitive using "(?i)".

Regular expression: (?i)<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>

For just a part of the expression, use "(?i:part)".

Regular expression: <img[^>]+(?i:src)\s*=\s*['"]([^'"]+)['"][^>]*>

Or just do it using the second argument of Pattern.compile:

Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
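To see the two inline forms side by side, here is a small sketch run against the test data from the question. The names InlineFlagDemo, GLOBAL, SCOPED, and firstSrc are hypothetical, chosen only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineFlagDemo {
    // (?i) at the start: everything after it is matched case-insensitively.
    static final Pattern GLOBAL = Pattern.compile(
            "(?i)<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>");

    // (?i:src): only the attribute name is matched case-insensitively.
    static final Pattern SCOPED = Pattern.compile(
            "<img[^>]+(?i:src)\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>");

    // Returns the first captured src value, or null when nothing matches.
    static String firstSrc(Pattern p, String html) {
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The five test strings from the question.
        String[] samples = {
            "<html><img src =\"kk.gif\" alt=\"text\"/></html>",
            "<html><img Src =\"kk.gif\" alt=\"text\"/></html>",
            "<html><img sRC =\"kk.gif\" alt=\"text\"/></html>",
            "<html><img SRC =\"kk.gif\" alt=\"text\"/></html>",
            "<html><img src ='kk.gif' alt=\"text\"/></html>"
        };
        for (String s : samples) {
            System.out.println(firstSrc(SCOPED, s) + " / " + firstSrc(GLOBAL, s));
        }
    }
}
```

Both patterns capture kk.gif from all five samples; they differ only on input such as <IMG SRC="...">, which GLOBAL matches and SCOPED does not.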


Carlos Heuberger