Hi all,

The following regex works with string.replaceAll() but not with string.replaceFirst().

String:

TEST|X||Y|Z||

Expected output:

TEST|X|**STR**|Y|Z||

Regex:

  string.replaceAll( "(TEST\\|[\\|\\|]*\\\\|)\\|\\|", "$1|ST|" );


Output (not desired):


 TEST|X|**STR**|Y|Z|**STR**|


string.replaceFirst( "(TEST\\|[\\|\\|]*\\\\|)\\|\\|", "$1|ST|" );

No changes are made to the string.

Please help!

Thanks in advance.

A: 

Your question is not very clear, but I assume you are asking why the output differs. The regex pattern you passed finds two matches in the string. replaceAll replaces both matches, whereas replaceFirst replaces only the first one, hence the difference in the output. To list the matches:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex {

    public static void main(String[] args) {

        String string1 = "TEST|X||Y|Z||";

        Pattern pattern = Pattern.compile("(TEST\\|[\\|\\|]*\\\\|)\\|\\|");
        Matcher matcher = pattern.matcher(string1);

        boolean found = false;
        while (matcher.find()) {
            System.out.printf("I found the text \"%s\" starting at "
                    + "index %d and ending at index %d.%n", matcher.group(),
                    matcher.start(), matcher.end());
            found = true;
        }
        if (!found) {
            System.out.printf("No match found.%n");
        }
    }
}
johnbk
Consider the final two "||". How is it that they are matched when they are preceded by Z? I don't see anything in the regexp which matches.
djna
@djna - I didn't see it either, until I actually ran the code.
johnbk
I think his problem is that he's got an accidental OR.
djna
A: 

If you only want to replace the first "||" with "|ST|", you can do this:

System.out.println("TEST|X||Y|Z||".replaceFirst("\\|\\|", "|ST|"));
Colin Hebert
A: 

Your regexp is probably not doing what you expect, because the pipe symbol | has two meanings: it's your separator, and it's also the OR operator in a regexp.

(TEST\\|[\\|\\|]*\\\\|)\\|\\|

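The unescaped | just before the closing parenthesis makes the group match either the TEST... part or nothing at all. You are effectively searching for any ||, so both ||s in your string are matched.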
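A quick way to check this is to print what group 1 captured for each match. This is only a rough sketch: the class name AlternationDemo and the method describeMatches are made up for illustration and are not from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternationDemo {

    // The pattern from the question, as a Java string literal.
    // The unescaped | before ")" adds an empty alternative to the group.
    static final Pattern ORIGINAL =
            Pattern.compile("(TEST\\|[\\|\\|]*\\\\|)\\|\\|");

    // Hypothetical helper: one entry per match, as "start-end:[group 1]".
    static List<String> describeMatches(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = ORIGINAL.matcher(input);
        while (m.find()) {
            result.add(m.start() + "-" + m.end() + ":[" + m.group(1) + "]");
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [6-8:[], 11-13:[]]: two matches, group 1 empty both times.
        System.out.println(describeMatches("TEST|X||Y|Z||"));
    }
}
```

Group 1 is empty for both matches, so "$1|ST|" simply turns each || into |ST|.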

If you are trying to match only the || that follows TEST|X, you could use

"(TEST\\|[^\\|]*)\\|\\|"

That is: TEST followed by a pipe, followed by zero or more non-pipes, followed by two pipes.
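Here is a minimal sketch of that pattern used with replaceFirst and the "$1|ST|" replacement from the question. The class name ReplaceFirstDemo and the method insertAfterFirstField are hypothetical. Note that the result is assigned to a new variable: Java strings are immutable, so replaceFirst returns a new string and leaves the original untouched. Discarding the return value is one possible reason for seeing "no changes", but that is a guess, since the question does not show how the result was used.

```java
public class ReplaceFirstDemo {

    // Hypothetical helper: applies the suggested pattern with replaceFirst.
    // $1 puts back the captured "TEST|<non-pipes>" prefix before "|ST|".
    static String insertAfterFirstField(String input) {
        return input.replaceFirst("(TEST\\|[^\\|]*)\\|\\|", "$1|ST|");
    }

    public static void main(String[] args) {
        String original = "TEST|X||Y|Z||";
        String updated = insertAfterFirstField(original);

        System.out.println(updated);   // prints TEST|X|ST|Y|Z||
        System.out.println(original);  // prints TEST|X||Y|Z|| (unchanged)
    }
}
```

The trailing || is left alone because it is not preceded by TEST| and a run of non-pipe characters.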

djna