Hi all, I have the array shown below:

Array
(
    [0] => http://api.tweetmeme.com/imagebutton.gif?url=http://mashable.com/2010/09/25/trailmeme/ 
    [1] => http://cdn.mashable.com/wp-content/plugins/wp-digg-this/i/gbuzz-feed.png 
    [2] => http://mashable.com/wp-content/plugins/wp-digg-this/i/fb.jpg 
    [3] => http://mashable.com/wp-content/plugins/wp-digg-this/i/diggme.png 
    [4] => http://ec.mashable.com/wp-content/uploads/2009/01/bizspark2.gif 
    [5] => http://cdn.mashable.com/wp-content/uploads/2010/09/web.png 
    [6] => http://mashable.com/wp-content/uploads/2010/09/Screen-shot-2010-09-24-at-10.51.26-PM.png 
    [7] => http://cdn.mashable.com/wp-content/uploads/2009/02/bizspark.jpg 
    [8] => http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/0/di 
    [9] => 
    [10] => http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/1/di 
    [11] => 
    [12] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:D7DqB2pKExk 
    [13] => 
    [14] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:V_sGLiPBpWU 
    [15] => 
    [16] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:F7zBnMyn0Lo 
    [17] => 
    [18] => http://feeds.feedburner.com/~ff/Mashable?d=qj6IDK7rITs 
    [19] => 
    [20] => http://feeds.feedburner.com/~ff/Mashable?d=_e0tkf89iUM 
    [21] => 
    [22] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:gIN9vFwOqvQ 
    [23] => 
    [24] => http://feeds.feedburner.com/~ff/Mashable?d=yIl2AUoC8zA 
    [25] => 
    [26] => http://feeds.feedburner.com/~ff/Mashable?d=P0ZAIrC63Ok 
    [27] => 
    [28] => http://feeds.feedburner.com/~ff/Mashable?d=I9og5sOYxJI 
    [29] => 
    [30] => http://feeds.feedburner.com/~ff/Mashable?d=CC-BsrAYo0A 
    [31] => 
    [32] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:_cyp7NeR2Rw 
    [33] => 
    [34] => http://feeds.feedburner.com/~r/Mashable/~4/0N_mvMwPHYk
)

Basically, I want to:

  1. remove every empty array element;
  2. remove every array item whose name does not end in one of the extensions ".jpg", ".png", ".gif";
  3. finally, remove array items containing keywords such as "digg", "fb", "tweet", "bizspark".
+1  A: 
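foreach ($array as $key => $value) {
    if (
        // 1. empty element
        empty($value) ||
        // 2. not an http URL ending in .gif, .png or .jpg
        (preg_match('#^http://(.*)\.(gif|png|jpg)$#i', $value) == 0) ||
        // 3. contains one of the unwanted keywords
        (preg_match('#(digg|fb|tweet|bizspark)#i', $value) > 0)
    ) {
        unset($array[$key]);
    }
}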
kgb
+3  A: 

Using something like array_filter() gives you flexibility and makes maintenance easier (changing requirements, debugging, etc.):

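// Callback for array_filter(): return true to keep a URL, false to drop it.
function url_array_filter($url)
{
    static $words = array('digg', 'fb', 'tweet', 'bizspark');
    static $extens = array('.jpg', '.png', '.gif');
    $ret = true;
    if (!$url) {
        // empty element
        $ret = false;
    } elseif (str_replace($words, '', $url) != $url) {
        // the URL contains at least one unwanted keyword
        $ret = false;
    } else {
        $path = parse_url($url, PHP_URL_PATH);
        // drop the URL when its path does NOT end in a wanted extension
        if (!in_array(substr($path, -4), $extens)) {
            $ret = false;
        }
    }
    return $ret;
}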

$arr = array_filter($arr, 'url_array_filter');
print_r($arr);

(Works for the array given, but may need changes; it's demo code.)

GZipp
Changing substr($path, -4) to strrchr($path, '.') will get rid of the integer constant.
GZipp
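A minimal sketch of the strrchr() variant mentioned in the comment above, assuming the same extension list as the answer. The helper name has_image_extension() is made up for illustration and is not part of the original code. strrchr() returns the path from its last '.' onward, so the check no longer depends on every extension being exactly four characters long.

```php
<?php
// Hypothetical helper (name is an assumption): true when the URL path
// ends in one of the wanted image extensions.
function has_image_extension($url)
{
    static $extens = array('.jpg', '.png', '.gif');

    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false; // no path component, or the URL could not be parsed
    }

    // strrchr() returns everything from the last '.' onward, or false if none.
    $ext = strrchr($path, '.');

    return $ext !== false && in_array(strtolower($ext), $extens, true);
}

var_dump(has_image_extension('http://cdn.mashable.com/wp-content/uploads/2010/09/web.png')); // bool(true)
var_dump(has_image_extension('http://feeds.feedburner.com/~ff/Mashable?d=qj6IDK7rITs'));     // bool(false)
```

Because the extension is taken from the path only, a query string such as ?d=... does not affect the result.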
A: 

Hi @GZipp, I have tried your code, and it returns an array containing the stuff I want out, e.g.:

Array
(
    [5] => http://feedads.g.doubleclick.net/~at/W-z_kHMi30EtE1mpxK8NvMmNmeg/0/di
    [7] => http://feedads.g.doubleclick.net/~at/W-z_kHMi30EtE1mpxK8NvMmNmeg/1/di
    [9] => http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:D7DqB2pKExk
    [11] => http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:V_sGLiPBpWU
    [13] => http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:F7zBnMyn0Lo
    [15] => http://feeds.feedburner.com/~ff/Mashable?d=qj6IDK7rITs
    [17] => http://feeds.feedburner.com/~ff/Mashable?d=_e0tkf89iUM
    [19] => http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:gIN9vFwOqvQ
    [21] => http://feeds.feedburner.com/~ff/Mashable?d=yIl2AUoC8zA
    [23] => http://feeds.feedburner.com/~ff/Mashable?d=P0ZAIrC63Ok
    [25] => http://feeds.feedburner.com/~ff/Mashable?d=I9og5sOYxJI
    [27] => http://feeds.feedburner.com/~ff/Mashable?d=CC-BsrAYo0A
    [29] => http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:_cyp7NeR2Rw
    [31] => http://feeds.feedburner.com/~r/Mashable/~4/mEedXAp78pg
)

I would like it to return only the image URLs, e.g. from the first example:

    [5] => http://cdn.mashable.com/wp-content/uploads/2010/09/web.png
    [6] => http://mashable.com/wp-content/uploads/2010/09/Screen-shot-2010-09-24-at-10.51.26-PM.png

Any ideas?

Sir Lojik
I don't want anything from feedburner.
Sir Lojik
Please use the formatting controls when posting code or output.
BoltClock
What have you tried so far to modify the code given to you?
Nick Presta
@Sir Lojik - To be frank, I'd think it would be obvious what change you would need to make. Hint: one simple change to the "$words" array.
GZipp
A: 

Hi GZipp, I have modified the code and I'm getting better results:

function url_array_filter($url)
{
    static $words = array('digg', 'fb', 'tweet', 'bizspark','feedburner','feedads','CountImage');
    static $extens = array('.jpg', '.png', '.gif');
    $ret = true;
    if (!$url) {
        $ret = false;
    } elseif (str_replace($words, '', $url) != $url) {
        $ret = false;
    } else {
        $path = parse_url($url, PHP_URL_PATH);
        if (in_array(substr($path, -4), $extens)) {
            $ret = false;
        }
    }
    return $ret;
} 

My problem now is with the output, e.g.:

Array ( [0] => http://cdn.dzone.com/images/thumbs/120x90/491551.jpg' style='width:120;height:90;float:left;vertical-align:top;border:1px solid ) 

Array ( [0] => http://cdn.dzone.com/images/thumbs/120x90/490913.jpg' style='width:120;height:90;float:left;vertical-align:top;border:1px solid ) 

I want the URL only. I think the problem is in how I extract the URLs from the original content. Let me post a link to the original question and what I'm doing:

http://stackoverflow.com/questions/3793768/rss-feeds-and-image-extraction-indepth

I simply want the URL. I think getImagesUrl() from that link may be messing up. I'm going to try using parse_url to bring back the correct URL. Let me know if I'm on the right track. I'm very close to pulling image URLs from RSS feeds parsed with Magpie.

Sir Lojik
A: 

OK GZipp, this is the modification and addition I've made to your code. It works 95% of the time, which is great, although I do get some funny results, which I'm posting below.

function url_array_filter($url)
{
    static $words = array('digg', 'fb', 'tweet', 'bizspark','feedburner','feedads','CountImage','fuelbrand');
    static $extens = array('.jpg', '.png', '.gif');
    $ret = true;
    if (!$url) {
        $ret = false;
    } elseif (str_replace($words, '', $url) != $url) {
        $ret = false;
    } else {
        $path = parse_url($url, PHP_URL_PATH);
        if (in_array(substr($path, -4), $extens)) {
            $ret = false;
        }
    }
    return $ret;
} 

function cleanURL($a_url)
    {
    $ret=array();
    foreach ($a_url as $c)
        {
        $a=parse_url($c, PHP_URL_SCHEME).'://'.parse_url($c, PHP_URL_HOST).parse_url($c, PHP_URL_PATH);    
        $a=explode("'",$a);
        $ret[]=$a[0];
        }
    return $ret;         
    }

Example usage. The call $this->getImagesUrl($c) below returns results like the array in the original question.

foreach ($content as $c) {
    // get the images in content
    $arr = $this->getImagesUrl($c);
    $arr = array_filter($arr, 'url_array_filter');
}
$ret = cleanURL($arr);
if (count($ret) > 0) {
    print_r($ret);
    echo "<br/><br/>";
}

Up to this point almost everything works great, but I keep getting some bad results like:

Array ( [0] => http://cdn.mashable.com/wp-content/uploads/2010/02/ipad-side- )
Array ( [0] => http://mrg.bz/FZtr2k [1] => http://mrg.bz/IDkx4w ) 

People, we're almost there... any ideas?

Sir Lojik
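
One possible explanation, reading url_array_filter() as posted in this answer: the extension test sets $ret = false when the path does end in .jpg, .png or .gif, so URLs without an image extension (such as the mrg.bz links) are the ones that get through. The sketch below is a hedged alternative, not the thread's accepted solution: it cleans each raw string first, then filters the cleaned URL. The function names clean_image_url() and is_wanted_image() and the sample input are assumptions made for illustration.

```php
<?php
// Hypothetical helper: reduce a raw match to scheme://host/path.
function clean_image_url($raw)
{
    // Drop everything from the first quote or whitespace onward
    // (leftover HTML attributes such as ' style='...).
    $url = preg_replace('/[\'"\s].*$/s', '', trim($raw));

    $scheme = parse_url($url, PHP_URL_SCHEME);
    $host   = parse_url($url, PHP_URL_HOST);
    $path   = parse_url($url, PHP_URL_PATH);
    if (!$scheme || !$host || !is_string($path)) {
        return ''; // not a usable absolute URL
    }

    return $scheme . '://' . $host . $path;
}

// Hypothetical filter callback: keep only cleaned URLs that end in a
// wanted extension and contain none of the unwanted keywords.
function is_wanted_image($url)
{
    static $words  = array('digg', 'fb', 'tweet', 'bizspark', 'feedburner', 'feedads');
    static $extens = array('.jpg', '.png', '.gif');

    if ($url === '') {
        return false;
    }
    foreach ($words as $word) {
        if (stripos($url, $word) !== false) {
            return false;
        }
    }
    $ext = strrchr($url, '.');

    return $ext !== false && in_array(strtolower($ext), $extens, true);
}

// Sample input assembled from strings quoted in this thread.
$raw = array(
    "http://cdn.dzone.com/images/thumbs/120x90/491551.jpg' style='width:120;height:90;float:left;vertical-align:top;border:1px solid",
    'http://mrg.bz/FZtr2k',
    'http://mashable.com/wp-content/plugins/wp-digg-this/i/fb.jpg',
    '',
);

$images = array_values(array_filter(array_map('clean_image_url', $raw), 'is_wanted_image'));
print_r($images); // only the cdn.dzone.com .jpg URL remains
```

Cleaning before filtering matters here: the extension check only works once the trailing attribute text has been cut off. Note that short keywords such as 'fb' match anywhere in the URL, so they may also remove images you want to keep.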