How can I create a regular expression that matches strings against a given pattern? For example, I want to find all strings that match the pattern '*index.tx?'. This should match values such as index.txt, mainindex.txt and somethingindex.txp.

Pattern pattern = Pattern.compile("*.html");
Matcher m = pattern.matcher("input.html");

This code is obviously not working.

+3  A: 

You need to learn regular expression syntax. It is not the same as using wildcards. Try this:

Pattern pattern = Pattern.compile("^.*index\\.tx.$");

There is a lot of information about regular expressions here. You may find the program RegexBuddy useful while you are learning regular expressions.
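
As a quick check, the pattern from this answer can be run against the file names from the question. This is only a minimal sketch: the class name IndexPatternDemo, the helper isIndexFile and the two extra negative inputs are made up for illustration.

```java
import java.util.regex.Pattern;

public class IndexPatternDemo {
    // Same expression as in the answer: anything, then "index.tx", then exactly one more character.
    static final Pattern INDEX_PATTERN = Pattern.compile("^.*index\\.tx.$");

    // Hypothetical helper: true if the whole name matches the pattern.
    static boolean isIndexFile(String name) {
        return INDEX_PATTERN.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {
            "index.txt",          // true
            "mainindex.txt",      // true
            "somethingindex.txp", // true
            "index.tx",           // false: the final . needs one more character
            "indexXtxt"           // false: \\. only matches a literal dot
        };
        for (String name : names) {
            System.out.println(name + " -> " + isIndexFile(name));
        }
    }
}
```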

Mark Byers
+1  A: 

* matches zero or more occurrences of the preceding token, so if you want to match zero or more of any character, use .* instead (. matches any char).

The modified regex should look something like this:

Pattern pattern = Pattern.compile("^.*\\.html$");
  • ^ matches the start of the string
  • .* matches zero or more of any char
  • \\. matches the dot char (if not escaped it would match any char)
  • $ matches the end of the string
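
Whether the ^ and $ anchors matter depends on which Matcher method is called: matches() always requires the whole input to match, while find() looks for a match anywhere in the input. The sketch below illustrates this; the class name, the helper names and the sample inputs are assumptions for the example.

```java
import java.util.regex.Pattern;

public class HtmlSuffixDemo {
    static final Pattern ANCHORED = Pattern.compile("^.*\\.html$");
    static final Pattern UNANCHORED = Pattern.compile(".*\\.html");

    // find() succeeds if any part of the input matches, so the anchors do the work here.
    static boolean anchoredFind(String s) {
        return ANCHORED.matcher(s).find();
    }

    static boolean unanchoredFind(String s) {
        return UNANCHORED.matcher(s).find();
    }

    // matches() requires the entire input to match, even without ^ and $.
    static boolean unanchoredMatches(String s) {
        return UNANCHORED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String ok = "input.html";
        String extra = "input.html.bak";
        System.out.println(anchoredFind(ok));         // true
        System.out.println(anchoredFind(extra));      // false
        System.out.println(unanchoredFind(extra));    // true: "input.html" is found inside
        System.out.println(unanchoredMatches(extra)); // false
    }
}
```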
krock
A: 

The code you posted does not work because:

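  1. The dot . is a special regex character. It matches one instance of any character.
  2. * means any number of occurrences (including zero) of the preceding token. In "*.html" nothing precedes it, so Pattern.compile rejects the pattern with a PatternSyntaxException.

Therefore, .* means any number of occurrences of any character.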
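
To see the difference between the two forms, the sketch below tries to compile both the wildcard-style string from the question and the corrected regex. The class name DanglingStarDemo and the helper compiles are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class DanglingStarDemo {
    // Hypothetical helper: reports whether the string is a valid regex.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // A leading * has no preceding token to repeat, so compilation fails.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles("*.html"));     // false
        System.out.println(compiles(".*\\.html"));  // true
    }
}
```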

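So you would need something like:

 Pattern pattern = Pattern.compile(".*\\.html.*");

The \\ is needed because we want a literal dot, and an unescaped dot is a special regex character. The backslash is doubled because it must itself be escaped inside a Java string literal. The pattern means: match a string that starts with any number of arbitrary characters, followed by a dot, followed by html, followed by anything. Note that the trailing .* also accepts names such as page.html.bak; drop it if the string must end in .html.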
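
If the wildcard patterns come from user input, one possible approach is to translate them into regular expressions mechanically instead of writing each regex by hand. The following is a rough sketch under that assumption: globToRegex is a made-up helper, not a standard library method, and it only handles * and ?, quoting every other character with Pattern.quote.

```java
import java.util.regex.Pattern;

public class GlobDemo {
    // Hypothetical helper: converts a simple wildcard pattern to a regex string.
    static String globToRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (char c : glob.toCharArray()) {
            switch (c) {
                case '*':
                    sb.append(".*");   // any number of any characters
                    break;
                case '?':
                    sb.append('.');    // exactly one character
                    break;
                default:
                    // Quote everything else so that '.' and friends are taken literally.
                    sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    static boolean globMatches(String glob, String input) {
        return Pattern.compile(globToRegex(glob)).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(globMatches("*index.tx?", "index.txt"));          // true
        System.out.println(globMatches("*index.tx?", "somethingindex.txp")); // true
        System.out.println(globMatches("*index.tx?", "indexXtxt"));          // false
    }
}
```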

Amir Rachum