Possible Duplicate:
Java - Regex problem

I have a list of URLs of the following types:

  • http://www.example.com/pk/etc
  • http://www.example.com/pk/etc/
  • http://www.example.com/pk/etc/etc

where etc can be anything.

So I want to find only those URLs that contain www.example.com/pk/etc or www.example.com/pk/etc/.

Note to anyone who thinks this is a duplicate question: please read both questions carefully before marking this one as a duplicate. If you still can't see how the two questions differ after reading them, please move on without marking it as a duplicate, because I can't explain the difference in any more detail.

+1  A: 

Your problem isn't fully defined, so I can't give you an exact answer, but this should be a start you can use:

^[^:]+://[^/]+\.com/pk/[^/]+/?$

The difference is that the / is no longer optional and there must be at least one more character after pk/.

These strings will match:

http://www.example.com/pk/ca
http://www.example.com/pk/ca/
https://www.example.com/pk/ca/

These strings won't match:

http://www.example.com/pk//
http://www.example.co.uk/pk/ca
http://www.example.com/pk
http://www.example.com/pk/
http://www.example.com/anthingcangoeshere/pk
http://www.example.com/pkisnotnecessaryhere
http://www.example.com/pk/ca/sf
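
As a quick check, the pattern can be run against the strings listed above in Java roughly as follows. This is only a sketch: the class name `PkUrlCheck` and the helper `isPkUrl` are illustrative assumptions, and the backslash is doubled because the regex sits inside a Java string literal.

```java
import java.util.regex.Pattern;

public class PkUrlCheck {
    // Same pattern as above; "\\." in Java source is the regex "\." (a literal dot).
    static final Pattern PK_URL =
            Pattern.compile("^[^:]+://[^/]+\\.com/pk/[^/]+/?$");

    // Hypothetical helper: true when the URL has exactly one non-empty
    // segment after /pk/, with an optional trailing slash.
    static boolean isPkUrl(String url) {
        return PK_URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.example.com/pk/ca",
            "https://www.example.com/pk/ca/",
            "http://www.example.com/pk//",
            "http://www.example.co.uk/pk/ca",
            "http://www.example.com/pk/ca/sf"
        };
        // Print each sample next to the result of the check.
        for (String s : samples) {
            System.out.println(isPkUrl(s) + "  " + s);
        }
    }
}
```
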
Mark Byers
@Mark Yes, you are right that at least one more character must follow `pk/`. There should be exactly one segment, `/...` or `/.../`, after `pk` and no more, so http://www.example.com/pk/ca/sf must not be matched.
Yatendra Goel
+1  A: 
String pattern = "http://www.example.com/pk/[^/]+/?$";

I am assuming http://www.example.com/pk// is not accepted. If this should be accepted too, then use

String pattern = "http://www.example.com/pk/[^/]*/?$";
Bytecode Ninja
@Bytecode Does `+` mean "one or more" or "zero or more"?
Yatendra Goel
@Yatendra: `+` means "one or more"
Martijn Courteaux
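
The difference between `+` and `*` can be seen by trying both variants on a URL with an empty segment. The sketch below is only an illustration; the class name `QuantifierDemo` and the helper `found` are assumptions, and the patterns are shortened to the part after the host.

```java
import java.util.regex.Pattern;

public class QuantifierDemo {
    // "+" requires at least one non-slash character after pk/.
    static final Pattern ONE_OR_MORE = Pattern.compile("pk/[^/]+/?$");
    // "*" also accepts an empty segment after pk/.
    static final Pattern ZERO_OR_MORE = Pattern.compile("pk/[^/]*/?$");

    // Hypothetical helper: true when the pattern is found anywhere in the input.
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        String empty = "http://www.example.com/pk//";
        String filled = "http://www.example.com/pk/ca";
        System.out.println(found(ONE_OR_MORE, empty));   // false
        System.out.println(found(ZERO_OR_MORE, empty));  // true
        System.out.println(found(ONE_OR_MORE, filled));  // true
        System.out.println(found(ZERO_OR_MORE, filled)); // true
    }
}
```
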
@Bytecode I think that it will also match http://www.example.com/pk/anything/anything. If so, that is not acceptable: the URL must have only one `/.../` or `/...` segment after `pk`.
Yatendra Goel
Then you should add $ after the ?: "http://www.example.com/pk/[^/]+/?$". Please see the updated answer.
Bytecode Ninja
Note that in regular expressions `.` means any character. If you mean a literal dot then you must escape it.
Mark Byers
@Mark Thanks for the reminder. Both @Bytecode and I forgot it.
Yatendra Goel
@Bytecode Can you update your answer?
Yatendra Goel
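
For reference, a version of the pattern with the dots escaped might look like the sketch below. The class name `EscapedDotDemo` and the test URL with `X` and `Y` in place of the dots are made up for illustration; in a Java string literal the escape is written `\\.`.

```java
import java.util.regex.Pattern;

public class EscapedDotDemo {
    // Unescaped: each "." matches any character.
    static final Pattern UNESCAPED =
            Pattern.compile("http://www.example.com/pk/[^/]+/?$");
    // Escaped: "\\." in Java source is the regex "\.", a literal dot.
    static final Pattern ESCAPED =
            Pattern.compile("http://www\\.example\\.com/pk/[^/]+/?$");

    // Hypothetical helper: true when the pattern is found in the input.
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        // Made-up URL where the dots are replaced by other characters.
        String lookalike = "http://wwwXexampleYcom/pk/ca";
        System.out.println(found(UNESCAPED, lookalike)); // true
        System.out.println(found(ESCAPED, lookalike));   // false
        System.out.println(found(ESCAPED, "http://www.example.com/pk/ca/")); // true
    }
}
```
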
A: 

So I want to find only those URLs that contain www.example.com/pk/etc or www.example.com/pk/etc/.

Update

I think this will work:

https?://.*\\.?[A-Za-z0-9]+\\.com/pk/etc/?[^.]

But every item in the list you gave contains what you are searching for.

Martijn Courteaux