ansaurus

Question

How do I replace a string that starts with "[id", has an unknown middle part and ends in "]"?

Answer 1

+3 A:

The following will remove all occurrences of [idsomething], where something matches one or more characters other than ].

$newText = preg_replace('#\[id[^\]]+\]#', '', $subject);

If you know that something always consists of digits, you could use something like this:

$newText = preg_replace('#\[id\d+\]#', '', $subject);
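
To see how the two patterns differ, here is a minimal sketch. The sample string in $subject is invented for illustration and is not from the question.

```php
<?php
// Hypothetical input, made up for this example.
$subject = 'a [id42] b [id=foo] c [id] d';

// One or more characters other than ] after "id".
$anyMiddle = preg_replace('#\[id[^\]]+\]#', '', $subject);

// One or more digits after "id".
$digitsOnly = preg_replace('#\[id\d+\]#', '', $subject);

echo $anyMiddle, "\n";
echo $digitsOnly, "\n";
```

The first pattern removes both [id42] and [id=foo], while the second removes only [id42]. Neither touches [id], because + requires at least one character between id and ].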

For more information about regular expressions, see this website: http://www.regular-expressions.info/.

Lekensteyn 2010-09-30 15:00:22

Answer 2

+3 A:

The string replace functions can only work on specific strings. If you have a pattern you want to match, you should use preg_replace, which replaces based on regular expressions:

$text = preg_replace('/\[id[^\]]*\]/', $replacement, $text);
// $replacement is whatever string you want to replace with
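
As a rough illustration of the difference between a fixed-string replacement and a pattern-based one, consider the sketch below. The sample text and variable names are assumptions made for this example.

```php
<?php
// Hypothetical input, made up for this example.
$text = 'a [id=1] b [id=2] c';

// str_replace() needs the exact text to look for, so only [id=1] goes away.
$exactOnly = str_replace('[id=1]', '', $text);

// preg_replace() replaces every substring that fits the pattern.
$byPattern = preg_replace('/\[id[^\]]*\]/', '', $text);

echo $exactOnly, "\n";
echo $byPattern, "\n";
```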

/\[id[^\]]*\]/ is a regular expression (aka regex). The slashes on each end are delimiters which PHP requires to delineate a regex. The rest of the pattern can be described as follows:

\[     # match a literal [ character
id     # match the string "id"
[^     # open a negated character class
  \]   # match anything other than literal ] character (since it's in a negated class)
]*     # close the class, repeat it zero or more times
\]     # match a literal ]
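
The same breakdown can be kept inside the pattern itself by adding the x (extended) modifier, under which whitespace outside character classes is ignored and # starts a comment that runs to the end of the line. This is only a sketch: the sample text and the replacement string are invented, and the comments must not contain the / delimiter.

```php
<?php
// The pattern from above, spread over several lines with the x modifier.
$pattern = '/
    \[        # a literal opening bracket
    id        # the letters i and d
    [^\]]*    # zero or more characters that are not a closing bracket
    \]        # a literal closing bracket
/x';

// Sample text and replacement are made up for illustration.
$text = 'before [id=123] after';
$result = preg_replace($pattern, '<removed>', $text);

echo $result, "\n";
```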

Concepts:

Character classes - a character class is a way of describing that a character can be one of a series of possibilities. Character classes start with a [ and end with a ]. For example, [abc] matches a or b or c. Character classes can be negated if the first character within a class is ^: [^abc] matches any character that isn't a or b or c. In our pattern, [^\]] matches any character that isn't ]. Note that the ] within the class has to be escaped because ] generally means the end of the class but we want to specify a literal ] character.
Repetition using * - Parts of a pattern can be repeated, which lets the pattern specify that something can appear multiple times. There are three basic repetition operators: ? specifies that something may appear zero or one times (i.e. it makes part of your pattern optional); * specifies that something may appear zero or more times (i.e. it can be absent, but it can also appear any number of times); + specifies that something must appear at least once.
In our case, [^\]]* specifies that a character that is not ] can be matched zero or more times: it will match an empty string, and it will match abcdefg, where the negated character class matches 7 times (as each character is not ]).
Note that by default, quantifiers are greedy, which means that they match as much of the string as possible; for this reason [^\]]* when matched against abcdefg will match the entire string, as that is the largest match it can make (even though smaller substrings also match the pattern).
Everything else in this pattern matches literally. As we saw above, [ and ] need to be escaped to match the literal characters - because they have meaning within a regex (ie. to define a character class) - but id matches an i followed immediately by a d.
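
The points about * versus + and about greediness can be checked with a short sketch. The test strings are invented, and the .* pattern is included only for comparison; it is not part of the answer's pattern.

```php
<?php
// * allows an empty middle part, + requires at least one character.
$star = preg_match('/\[id[^\]]*\]/', '[id]');
$plus = preg_match('/\[id[^\]]+\]/', '[id]');

// Hypothetical input with two bracketed parts.
$text = '[id=1] keep [id=2]';

// A greedy .* runs on to the last ] in the string.
preg_match('/\[id.*\]/', $text, $greedyDot);

// The negated class cannot cross a ], so it stops at the first one.
preg_match('/\[id[^\]]*\]/', $text, $negated);

var_dump($star, $plus, $greedyDot[0], $negated[0]);
```

Here $star is 1 and $plus is 0. The .* version captures the whole string, while the negated class captures only [id=1], which is why the negated class is the safer choice when several bracketed parts can appear in one string.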

When you put that all together, you end up with a pattern that matches an opening bracket, followed by the letters id, followed by zero or more characters other than ], and then a closing bracket.

Note that if you want to make this pattern case-insensitive, you can add an i after the final slash: /\[id[^\]]*\]/i. The i is a modifier which makes the entire pattern case-insensitive (so it'd match [ID=...] as well).
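
A small sketch of the effect of the i modifier, using an invented sample string:

```php
<?php
// Hypothetical input with one upper-case and one lower-case tag.
$text = 'x [ID=7] y [id=8] z';

// Without the modifier only the lower-case [id=8] is removed.
$sensitive = preg_replace('/\[id[^\]]*\]/', '', $text);

// With the i modifier both [ID=7] and [id=8] are removed.
$insensitive = preg_replace('/\[id[^\]]*\]/i', '', $text);

echo $sensitive, "\n";
echo $insensitive, "\n";
```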

I recommend reading through the tutorial on regular-expressions.info if you are not familiar with regexes, as it will give you a very good understanding of what they do and how to compose them.

Daniel Vandersluis 2010-09-30 15:05:28

Answer 3

+1 A:

Using preg_replace():

<?php

    $text = "[hi=hello] [id=hellomynameisjoe] [hello=hi]";
    $new = preg_replace('@\[id[^\]]+\]@', '[replaced!]', $text);
    echo $new;
?>

Ruel 2010-09-30 15:07:52

I like this syntax. Very readable.

Atømix 2010-09-30 15:12:37

Note that lazy patterns are less efficient as they cause a lot of backtracking. See http://blog.stevenlevithan.com/archives/greedy-lazy-performance

Daniel Vandersluis 2010-09-30 15:21:38

I wonder why I'm so used to that. Thanks for the info, editing my code.

Ruel 2010-09-30 15:23:03

Edited. thanks.

Ruel 2010-09-30 15:27:25
