Another problem with str_replace: I would like to turn the following $title data into URLs by taking the text between the leading number and the dash (-).

  1. Chicago's Public Schools - $10.3M
  2. New Jersey - $3M
  3. Michigan: Public Health - $1M

The desired output is:
chicago-public-school
new-jersey
michigan-public-health

PHP code I am using

$title = ucwords(strtolower(strip_tags(str_replace("1: ","",$title))));
$x=1;
while($x <= 10) {
$title = ucwords(strtolower(strip_tags(str_replace("$x: ","",$title))));
$x++;
}
$link = preg_replace('/[<>()!#?:.$%\^&=+~`*&#233;"\']/', '',$title);
$money = str_replace(" ","-",$link);
$link = explode(" - ",$link);
$link = preg_replace(" (\(.*?\))", "", $link[0]);
$amount = preg_replace(" (\(.*?\))", "", $link[1]);
$code_entities_match = array( '&#39;s' ,'&quot;' ,'!' ,'@' ,'#' ,'$' ,'%' ,'^' ,'&' ,'*' ,'(' ,')' ,'+' ,'{' ,'}' ,'|' ,':' ,'"' ,'<' ,'>' ,'?' ,'[' ,']' ,'' ,';' ,"'" ,',' ,'.' ,'_' ,'/' ,'*' ,'+' ,'~' ,'`' ,'=' ,' ' ,'---' ,'--','--');
$code_entities_replace = array('' ,'-' ,'-' ,'' ,'' ,'' ,'-' ,'-' ,'' ,'' ,'' ,'' ,'' ,'' ,'' ,'-' ,'' ,'' ,'' ,'' ,'' ,'' ,'' ,'' ,'' ,'-' ,'' ,'-' ,'-' ,'' ,'' ,'' ,'' ,'' ,'-' ,'-' ,'-','-');
$link = str_replace($code_entities_match, $code_entities_replace, $link);
$link = strtolower($link);

Unfortunately the result I got:

-chicagoamp9s-public-school
2-new-jersey
3-michigan-public-health

Does anyone have a better solution for this? Thanks, guys!
(The &#39; turned into amp9. I wonder why?)

A: 

If I understand you correctly:

if (preg_match('!\d+:\s+(.*)\s+-\s+\$\d+(?:\.\d+)?!', $title, $groups)) {
  $words = strip_tags(strtolower($groups[1]));
  $words = preg_replace('![^\s\w]!', '', $words);
  $words = preg_replace('!\s+!', '-', $words);
  $words = preg_replace('!-+!', '-', $words);
  echo $words;
}

One thing: your text has "1. Chicago's..." not "1: ..." like your code would seem to suggest. Is one an error or is there something else going on?
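
If the titles really do start with "1." (and sometimes "1:"), the same idea could be sketched as below. This is only a sketch under assumptions: `titleToSlug` is a hypothetical name, the amount is assumed to look like `$10.3M`, and dropping the possessive `'s` is a guess at what the desired output implies.

```php
<?php
// Hypothetical helper (not from the thread): "1. Title - $10.3M" -> "title".
function titleToSlug(string $title): ?string
{
    // Accept "1." or "1:" as the bullet; capture everything before " - $amount".
    if (!preg_match('!^\s*\d+[.:]\s+(.*)\s+-\s+\$\d+(?:\.\d+)?M?\s*$!', $title, $groups)) {
        return null; // the title does not have the expected shape
    }
    // Decode entities such as &#39; first, then strip tags and lowercase.
    $words = strtolower(strip_tags(html_entity_decode($groups[1], ENT_QUOTES)));
    $words = preg_replace('!\'s\b!', '', $words);       // drop possessive 's
    $words = preg_replace('![^\s\w]!', '', $words);     // drop remaining punctuation
    $words = preg_replace('!\s+!', '-', trim($words));  // whitespace runs -> one dash
    return $words;
}

$titles = array(
    "1. Chicago's Public Schools - $10.3M",
    "2. New Jersey - $3M",
    "3. Michigan: Public Health - $1M",
);
foreach ($titles as $t) {
    echo titleToSlug($t), "\n";
}
```

Returning null for titles that do not match makes a wrong bullet format (the "1." versus "1:" question above) visible instead of silently producing an empty link.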

cletus
Can't figure it out, I got errors.
Thanks for reminding me, now the initial code works :p
A: 

You can do:

$str = "1. Chicago's Public Schools - $10.3M";
$from = array('/^\d+\.\s+([^-]*) -.*$/','/[^A-Z ]/i','/\s+/');
$to = array("$1",'','-');
$str = strtolower(preg_replace($from,$to,$str));
echo $str; // prints chicagos-public-schools
codaddict
This works, but for number 1 the link points back to the current URL.
A: 

Assuming you already have extracted titles correctly like "Chicago's Public Schools", then to generate pagenames out of them:

function generatePagename($s) {
    //to lower
    $pagename = trim(html_entity_decode(strtolower($s), ENT_QUOTES));

    //remove 's
    $pagename = trim(preg_replace("/(\'s)/", "", $pagename));

    //replace special chars with spaces
    $pagename = trim(preg_replace("/[^a-z0-9\s]/", " ", $pagename));

    //replace spaces with dashes
    $pagename = trim(preg_replace("/\s+/", "-", $pagename));

    return $pagename;
}

Which will convert something like

Chicago&#39;s "Public": Scho-ols1+23

to

chicago-public-scho-ols1-23.
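
As for where the "amp9" in the question might come from, here is one possible explanation, sketched under an assumption the thread never confirms: that the apostrophe reaches the script double-encoded as `&amp;#39;`. The character class is copied from the question's first `preg_replace`; the sample string is made up for illustration.

```php
<?php
// Assumption (not confirmed in the thread): the apostrophe arrives double-encoded.
$raw = 'Chicago&amp;#39;s Public Schools';

// Character class copied from the question. Because it contains the literal
// characters of "&#233;", it also strips '&', '#', ';' and the digits 2 and 3.
$stripped = preg_replace('/[<>()!#?:.$%\^&=+~`*&#233;"\']/', '', $raw);
echo $stripped, "\n"; // Chicagoamp9s Public Schools

// Decoding before stripping avoids the leftover "amp9": the first pass turns
// "&amp;" into "&", the second turns "&#39;" into an apostrophe.
$decoded = html_entity_decode(html_entity_decode($raw, ENT_QUOTES), ENT_QUOTES);
echo $decoded, "\n"; // Chicago's Public Schools
```

If the data is only single-encoded, one `html_entity_decode()` call with `ENT_QUOTES` should be enough; the second pass is there purely for the double-encoding assumption.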

serg
I made a mistake: the input data is not 's but &#39;s, where &#39; stands for '. Thanks anyway.
OK, I edited the answer to match that. What you need then is to decode HTML entities first with `html_entity_decode()`.
serg
Hmm, I took a shortcut though: `$link = strtolower(strip_tags(str_replace("amp9s","",$link)));` assuming that no $title will contain that value :p Thanks anyway.
Why do you need such a shortcut though?
serg
A: 
<?php

$lines = array("1. Chicago's Public Schools - $10.3M",
                "2. New Jersey - $3M",
                "3. Michigan: Public Health - $1M"
            );

// remove the trailing dollar amount
$lines = preg_replace('/\ - \$\d*\.?\d*M$/', '', $lines);

// remove the number bullets
$lines = preg_replace('/^\d+\.\ /', '', $lines);

// remove ignore chars 
$ignore_pattern = "/['s|:]/";
$lines = preg_replace($ignore_pattern, '', $lines);

for ($i=0; $i<count($lines); $i++) {
    $lines[$i] = implode('-',explode(' ', strtolower(trim($lines[$i]))));
}

print_r($lines);

and the output:

Array
(
    [0] => chicago-public-school
    [1] => new-jerey
    [2] => michigan-public-health
)

EDIT Start:

<?php

$lines = array("1. Chicago's Public Schools - $10.3M",
                "2. New Jersey - $3M",
                "3. Michigan: Public Health - $1M",
                "4. New York's Starbucks - $2M",
            );

$lines = preg_replace('/\ - \$\d*\.?\d*M$/', '', $lines);

$lines = preg_replace('/^\d+\.\ /', '', $lines);

$ignore_strings = array("'s", ':');
for ($i=0; $i<count($lines); $i++) {
    foreach ($ignore_strings as $s) {
        $lines[$i] = str_replace($s, '', $lines[$i]);
    }
}

for ($i=0; $i<count($lines); $i++) {
    $lines[$i] = implode('-',explode(' ', strtolower(trim($lines[$i]))));
}

print_r($lines);

output:

Array
(
    [0] => chicago-public-schools
    [1] => new-jersey
    [2] => michigan-public-health
    [3] => new-york-starbucks
)

Hope it meets your needs. EDIT End.

Zhang Yining
[1] => new-jerey: the letter s is missing throughout the output. Thanks anyway.
Sorry, I hope the modification helps.
Zhang Yining
Sorry, I think you haven't read my last edit: the first title is Chicago&#39;s, not Chicago's. Really sorry for the late correction :p I already took a shortcut, please see above :p
A: 

Finally, I went back to my initial code and made some fixes:

$title = ucwords(strtolower(strip_tags(str_replace("1. ","",$title))));
$x=1;
while($x <= 10) {
$title = ucwords(strtolower(strip_tags(str_replace("$x. ","",$title))));
$x++;
}
$data = preg_replace('/[<>()!#?:.$%\^&=+~`*&#;"\']/', '',$title);
$urldata = str_replace(" ","-",$data);
$data = explode(" - ",$data);
$link = preg_replace(" (\(.*?\))", "", $data[0]);
$budget = preg_replace(" (\(.*?\))", "", $data[1]);
$code_entities_match = array( '&quot;' ,'!' ,'@' ,'#' ,'$' ,'%' ,'^' ,'&' ,'*' ,'(' ,')' ,'+' ,'{' ,'}' ,'|' ,':' ,'"' ,'<' ,'>' ,'?' ,'[' ,']' ,'' ,';' ,"'" ,',' ,'.' ,'_' ,'/' ,'*' ,'+' ,'~' ,'`' ,'=' ,' ' ,'---' ,'--','--');
$code_entities_replace = array('' ,'-' ,'-' ,'' ,'' ,'' ,'-' ,'-' ,'' ,'' ,'' ,'' ,'' ,'' ,'' ,'-' ,'' ,'' ,'' ,'' ,'' ,'' ,'' ,'' ,'' ,'-' ,'' ,'-' ,'-' ,'' ,'' ,'' ,'' ,'' ,'-' ,'-' ,'-','-');
$link = str_replace($code_entities_match, $code_entities_replace, $link);
$link = strip_tags(str_replace("amp39s","",$link));
$link = strtolower($link);

What a mess, I know, but it works. Thanks, guys, for helping me, especially cletus, who spotted the mismatch between 1. and 1:
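
For comparison, the same link/budget split could be written more compactly. The following is only a sketch, not the code that is actually in use: `splitTitle` is a hypothetical name, the bullet is assumed to be "N. ", and the budget is assumed to follow the last " - " in the title.

```php
<?php
// Hypothetical helper: split "1. Title - $10.3M" into a link and the budget text.
function splitTitle(string $title): array
{
    // Strip tags, then decode entities such as &#39; into real characters.
    $title = html_entity_decode(strip_tags($title), ENT_QUOTES);
    $title = preg_replace('/^\s*\d+\.\s+/', '', $title);   // drop the "1. " bullet

    $pos    = strrpos($title, ' - ');                       // last " - " precedes the budget
    $name   = $pos === false ? $title : substr($title, 0, $pos);
    $budget = $pos === false ? '' : trim(substr($title, $pos + 3));

    $link = strtolower(str_replace("'s", '', $name));       // drop possessive 's
    $link = trim(preg_replace('/[^a-z0-9]+/', '-', $link), '-'); // other runs -> one dash

    return array('link' => $link, 'budget' => $budget);
}

print_r(splitTitle('1. Chicago&#39;s Public Schools - $10.3M'));
```

Decoding the entities up front means no special case for "amp39s" is needed, and collapsing every non-alphanumeric run into one dash replaces the two long match/replace arrays.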