I've got a string:

$string = "Hello World!";

I want to turn it into a URL friendly tag, and I've developed a function to do it:

function stripJunk($string){
    $string = str_replace(" ", "-", $string);
    $string = preg_replace("/[^a-zA-Z]\s/", "", $string);
    $string = strtolower($string);
    return $string;
}

However, when I run the $string above through it, I get the following:

$string = "hello-world!";

It seems that there are characters slipping through my preg_replace, even though from what I understand, they shouldn't be.

It should read like this:

$string = "hello-world";

What's going on here? (This should be easy peasy lemon squeezy!)

Edit 1: I wasn't aware that regular expressions were beginner's stuff, but whatever. Additionally, removing the \s from my pattern does not produce the desired result.

The desired result is:

  1. All spaces are converted to dashes.
  2. All remaining characters that are not A-Z or 0-9 are removed.
  3. The string is then converted to lower case.

Edit 2+: Cleaned up my code just a little.

+3  A: 

The \s at the end of your pattern means that you will only replace non-alphabetical characters which are immediately followed by a whitespace character. You probably want the \s within the square brackets so that whitespace is also preserved and can later be replaced with a dash.
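To make that concrete, here is a minimal sketch that runs the original pattern against two inputs. The second input is made up for illustration and is not from the question.

```php
<?php

// Pattern from the question: one non-letter followed by one whitespace character.
$pattern = "/[^a-zA-Z]\s/";

// "!" is the last character, so nothing follows it and the pattern cannot match.
$atEnd = preg_match($pattern, "hello-world!");
var_dump($atEnd);        // int(0)

// Here "!" is followed by a space, so the pair "! " matches.
$beforeSpace = preg_match($pattern, "hello! world");
var_dump($beforeSpace);  // int(1)

// With \s inside the class, any character that is neither a letter
// nor whitespace is removed, wherever it appears.
$cleaned = preg_replace("/[^a-zA-Z\s]/", "", "Hello World!");
var_dump($cleaned);      // string(11) "Hello World"
```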

You will need to add 0-9 inside the square brackets if you want to also allow numbers.

For example:

<?php

$string = "Hello World!";

function stripJunk($string){
    $string = preg_replace("/[^a-zA-Z0-9\s]/", "", $string);
    $string = str_replace(" ", "-", $string);
    $string = strtolower($string);
    return $string;
}

echo stripJunk($string);
Tom Haigh
Removing the \s still leaves the exclamation point in the string.
EvilChookie
Are you sure? I can't reproduce that.
Tom Haigh
Yup. Very sure. The exclamation point is still appearing in the string.
EvilChookie
Your edit ended up removing the exclamation points, but the spaces were removed too; the preg_replace and str_replace just needed to be swapped. As with SilentGhost's answer, I have absolutely no idea why it wasn't working.
EvilChookie
$string = preg_replace("/[^a-zA-Z\s]/", "", $string);
Elzo Valugi
+1  A: 

You could chain a few regular expressions to remove the junk:

<?php

function strip_junk ($string) {

  // first, trim surrounding whitespace, lowercase, and replace every character that is not a-z, 0-9 or a dash with a dash
  $string = preg_replace("/[^a-z0-9-]/u", "-", strtolower(trim($string)));

  // second, collapse runs of dashes into a single dash
  $string = preg_replace("/-+/u", "-", $string);

  // finally, remove leading and trailing dashes
  $string = preg_replace("/^-*|-*$/u", "", $string);

  return $string;

}

?>

This should do the trick, happy PHP'ing!

thijs
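Note that this approach replaces junk with dashes rather than deleting it, which differs slightly from the steps listed in the question. A shorter sketch of the same idea follows, assuming ASCII input (so the /u modifier is dropped) and PHP 7 or later for the type declarations. The name strip_junk_short is hypothetical, and trim() with a character list stands in for the third regex.

```php
<?php

// Hypothetical variant of strip_junk() from the answer above; the name is illustrative.
function strip_junk_short(string $string): string {
    // Trim and lowercase, then turn every character other than a-z, 0-9 or "-" into a dash.
    $string = preg_replace("/[^a-z0-9-]/", "-", strtolower(trim($string)));

    // Collapse runs of dashes into a single dash.
    $string = preg_replace("/-+/", "-", $string);

    // trim() with a character list strips leading and trailing dashes.
    return trim($string, "-");
}

echo strip_junk_short("Hello World!"), "\n";      // hello-world
echo strip_junk_short("  Rock & Roll  "), "\n";   // rock-roll
```

Because junk becomes a dash first and runs are collapsed afterwards, "Rock & Roll" ends up with a single dash between the words instead of two.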
A: 

What about this?

preg_replace("/[.\n\r][^a-zA-Z]/", "", $string);

if that does not work:

preg_replace("/[.\n\r^a-zA-Z]/", "", $string);

Does that work?

Time Machine
Neither worked, sorry. The second one went so far as to remove everything BUT the exclamation points!
EvilChookie
And this? /[^a-zA-Z]!/
Time Machine
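For anyone wondering why neither of the first two patterns helps, here is a small sketch run against the string from the question. It assumes the untouched input "Hello World!"; if the spaces had already been turned into dashes, the second result would be "-!" instead.

```php
<?php

$input = "Hello World!";

// First suggestion: matches ".", LF or CR followed by one non-letter.
// The input contains no such pair, so it comes back unchanged.
$first = preg_replace("/[.\n\r][^a-zA-Z]/", "", $input);
var_dump($first);    // string(12) "Hello World!"

// Second suggestion: "^" is not the first character in the class, so it is a
// literal caret and the class matches letters instead of excluding them.
$second = preg_replace("/[.\n\r^a-zA-Z]/", "", $input);
var_dump($second);   // string(2) " !"
```

A caret only negates a character class when it is the very first character after the opening bracket.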
+3  A: 

The following works just fine for me:

function stripJunk($string){
    $string = str_replace(" ", "-", trim($string));
    $string = preg_replace("/[^a-zA-Z0-9-]/", "", $string);
    $string = strtolower($string);
    return $string;
}
SilentGhost
I'm flabbergasted. I have absolutely no idea what is different versus my original code - the only difference is that you're adding the "-" to the character class. This worked, but I cannot say why it's working versus the other solutions I tried. Cheers!
EvilChookie
What you were trying to do: remove **all** non-letters and convert the string to lower case. The code before the change didn't work because the `!` wasn't followed by a space.
SilentGhost
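To spell out the difference, here is a step-by-step sketch using the input from the question. The variable names are illustrative only.

```php
<?php

$input = "Hello World!";

// Step 1 (same in both versions): spaces become dashes.
$step1 = str_replace(" ", "-", $input);                  // "Hello-World!"

// Step 2, original pattern: no whitespace is left after step 1,
// so the trailing \s can never match and nothing is removed.
$original = preg_replace("/[^a-zA-Z]\s/", "", $step1);   // "Hello-World!"

// Step 2, pattern from this answer: removes every character that is not
// a letter, a digit or a dash. The dash must be allowed in the class,
// otherwise the dashes created in step 1 would be stripped again.
$fixed = preg_replace("/[^a-zA-Z0-9-]/", "", $step1);    // "Hello-World"

echo strtolower($original), "\n";   // hello-world!
echo strtolower($fixed), "\n";      // hello-world
```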