OK, I have a string:

'Hello^<php>World&*124><
i ju*st press enteR'

How do I turn it into the following (preferably with a function)?

'Hello World123
i just press enter'

The result should allow only:

  • numbers
  • text
  • spaces, newlines, etc.

How do I do that with a regex? Do I have to use a regex, or is there another way?

Thanks

Adam Ramadhan

+4  A: 

You can do this:

function removeBad($str)
{
  return preg_replace("/[^a-zA-Z0-9_ (\n|\r\n)]+/", "", $str);
}

This removes anything other than letters, digits, underscores, spaces, and newlines.

If you also want to remove any tags such as <php> in your text, you could do:

function removeBad($str)
{
  $str = strip_tags($str);
  return preg_replace("/[^a-zA-Z0-9_ (\n|\r\n)]+/", "", $str);
}

Usage:

$str = removeBad('Hello^<php>World&*124><');
echo $str;

Result:

HelloWorld124

.

$str = removeBad('i ju*st press EnteR');
echo $str;

Result:

i just press EnteR
Sarfraz
Great. How about limiting the length of everything (including special chars)? `/.{$min,$max}/`, or is there a better way?
Adam Ramadhan
@Adam Ramadhan: Could you explain a little more what exactly you mean?
Sarfraz
Limiting the length, e.g. of 'wadwd'. Here's my newbie try, hope it helps: `function setMax($input, $min, $max) { $pattern = '/^.{' . $min . ',' . $max . '}$/'; return preg_match($pattern, $input) === 1; }`
Adam Ramadhan
`removeBad('i ju*st press EnteR')` would be `i just press EnteR`, the case of the E and R wouldn't change...
Daniel Vandersluis
How about the setMax thing? Anyway, thanks a lot, sAc.
Adam Ramadhan
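On the length-limit question raised in these comments, a regex is not strictly needed. The sketch below is one possible approach, not something proposed in the thread: `lengthBetween` is a hypothetical name, and it assumes the limits are inclusive and that byte length is what should be counted. One practical difference from `/^.{min,max}$/` is that `.` does not match a newline by default, so the regex version would reject multi-line input unless the `s` modifier is added.

```php
<?php
// Hypothetical helper (name and signature are assumptions, not from the thread).
// Returns true when the length of $input lies between $min and $max, inclusive.
function lengthBetween(string $input, int $min, int $max): bool
{
    // strlen() counts bytes; for multibyte text, mb_strlen() from the
    // mbstring extension counts characters instead.
    $len = strlen($input);
    return $len >= $min && $len <= $max;
}

var_dump(lengthBetween('wadwd', 3, 10));      // bool(true)
var_dump(lengthBetween("line one\nline two", 3, 10)); // bool(false): 17 bytes
```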
Note that only `^` (at the start), `-`, `\`, and `]` have a special meaning inside character classes. That means `[^a-zA-Z0-9_ (\n|\r\n)]` will also allow `(`, `|`, and `)`.
Gumbo
@Gumbo: You're right, so it should be something like `[^a-zA-Z0-9_ \\(\n\\|\r\n\\)]`, right?
Sarfraz
Thanks, master Gumbo, but I still don't know the difference between `(this|that)` and `[this|that]`.
Adam Ramadhan
@sAc: No, you can't have alternation in a character class. `(?!\r?\n)[^a-zA-Z0-9_ ]` should work.
Daniel Vandersluis
@Daniel Vandersluis: Ok that's useful, thanks :)
Sarfraz
@Adam Ramadhan: `(this|that)` is an alternation that matches either `this` or `that`. `[this|that]`, on the other hand, is a character class that matches one single character out of those listed in `[…]`, in this case either `t`, `h`, `i`, `s`, `|`, or `a`.
Gumbo
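To make the difference concrete, here is a small sketch. The helper names `matchesAlternation` and `matchesClass` are made up for illustration, and both patterns are anchored with `^…$` so that the whole input has to match.

```php
<?php
// Hypothetical helpers, only for illustrating group vs. character class.

// (this|that): alternation, matches the whole word "this" or the whole word "that".
function matchesAlternation(string $s): bool
{
    return preg_match('/^(this|that)$/', $s) === 1;
}

// [this|that]: character class, matches exactly ONE character out of
// t, h, i, s, |, a (duplicates in the class are simply ignored).
function matchesClass(string $s): bool
{
    return preg_match('/^[this|that]$/', $s) === 1;
}

var_dump(matchesAlternation('this')); // bool(true)
var_dump(matchesAlternation('t'));    // bool(false)
var_dump(matchesClass('t'));          // bool(true)
var_dump(matchesClass('|'));          // bool(true): "|" is a literal inside [...]
var_dump(matchesClass('this'));       // bool(false): more than one character
```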
@sAc: An alternative to using a look-ahead assertion would be to use `(?:[^a-zA-Z0-9_ \r\n]+|\r([^\n]))` and replace it with the match of the first group.
Gumbo
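Pulling the comments above together: since alternation is not available inside a character class, the simplest correction is probably to list `\r` and `\n` directly in the class. The following is a minimal sketch under that assumption; `removeBadFixed` is a hypothetical name chosen so it does not clash with `removeBad` from the answer, and it keeps the underscore in the allowed set as the original pattern does.

```php
<?php
// Hypothetical corrected variant of removeBad() from the answer above.
function removeBadFixed(string $str): string
{
    // Inside [...] the characters "(", "|" and ")" are literals, so they are
    // left out here; \r and \n are listed directly instead of "(\n|\r\n)".
    return preg_replace('/[^a-zA-Z0-9_ \r\n]+/', '', $str);
}

// "(", ")" and "|" are now stripped, while the line break survives.
echo removeBadFixed("Hello^(World)|124\r\ni ju*st press EnteR");
```

With the original pattern, the parentheses and the pipe in this input would have been kept.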
A: 

Regex substitution can do this for you. I think you need two passes: the first removes everything between the `<` and `>` characters, and the second removes any character that is NOT in your allowed character set. That is the safest way to do it.
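A minimal sketch of that two-pass idea, under some assumptions: `stripThenFilter` is a hypothetical name, "everything between `<` and `>`" is read as `<`, then any run of non-`>` characters, then `>`, and the allowed set is taken from the question (letters, digits, spaces, and newlines).

```php
<?php
// Hypothetical helper illustrating the two substitutions described above.
function stripThenFilter(string $str): string
{
    // Pass 1: remove anything that looks like a tag, e.g. "<php>".
    $str = preg_replace('/<[^>]*>/', '', $str);

    // Pass 2: remove every character outside the allowed set.
    // A stray "<" or ">" left over from pass 1 is dropped here as well.
    return preg_replace('/[^a-zA-Z0-9 \r\n]+/', '', $str);
}

echo stripThenFilter("Hello^<php>World&*124><\ni ju*st press enteR");
```

Note that pass 1 also removes text that merely sits between a `<` and a later `>`, such as `1 < 2 and 3 > 2`, so whether it is appropriate depends on the input.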

Zan Lynx