I want to match a sequence of Rapidshare links on a webpage. The links look like:

http://rapidshare.com/files/326251387/file_name.rar

I wrote this code:

if(preg_match_all('/http:\/\/\rapidshare\.com\/files\/.*?\/.*?/', $links[1], $links))
{
    echo 'Found links.';
} else {
    die('Cannot find links :(');
}

And it returns Cannot find links :( every time. Please note that I want the entire match returned, so that every Rapidshare link found on the page comes back in an array.

$links[1] does have a valid string, too.

Any help will be appreciated, cheers.

+1  A: 

Looks like you have a stray backslash before rapidshare:

if(preg_match_all('/http:\/\/\rapidshare\.com\/files\/.*?\/.*?/', $links[1], $links))

Should be

if(preg_match_all('/http:\/\/rapidshare\.com\/files\/.*?\/[^\s"\']+/', $links[1], $links))

(\r is a carriage return character)
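
To see the effect of the stray backslash in isolation, here is a minimal sketch. The `$page` string and the variable names are made up for illustration; only the two patterns come from the question.

```php
<?php
// Sample input (assumed); the URL is the one given in the question.
$page = 'blah http://rapidshare.com/files/326251387/file_name.rar blah';

// Original pattern: PCRE reads "\r" as a carriage return, so the regex
// looks for "http://<CR>apidshare.com/..." and cannot match this input.
$broken = '/http:\/\/\rapidshare\.com\/files\/.*?\/.*?/';

// Same pattern with the stray backslash removed.
$fixed = '/http:\/\/rapidshare\.com\/files\/.*?\/.*?/';

$brokenCount = preg_match_all($broken, $page, $m1);
$fixedCount  = preg_match_all($fixed, $page, $m2);

var_dump($brokenCount); // int(0)
var_dump($fixedCount);  // int(1)
```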

mopoke
Thanks, stupid typos :) Just one thing though, now it returns an array with links like: `http://rapidshare.com/files/328807106/`, i.e. without the filename?
Matt
That only returns the first character, but thanks. Got it working :)
Matt
Updated it to match the filenames at the end. It'll stop when it sees whitespace, a " or a ' character (assuming you're trying to pull data out of something like an href attribute).
mopoke
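
The missing filename discussed in these comments comes from the trailing lazy `.*?`, which is satisfied by zero characters. A rough sketch of the difference, assuming the links sit inside an href attribute (the `$html` fragment is invented for illustration):

```php
<?php
// Hypothetical HTML fragment, not taken from the original page.
$html = '<a href="http://rapidshare.com/files/326251387/file_name.rar">get it</a>';

// A trailing lazy ".*?" matches as little as possible, i.e. nothing,
// so the match ends right after the last slash.
preg_match_all('/http:\/\/rapidshare\.com\/files\/.*?\/.*?/', $html, $lazy);

// A negated character class with "+" consumes the filename and stops at
// whitespace or a quote. Note the \' escape inside the single-quoted string.
preg_match_all('/http:\/\/rapidshare\.com\/files\/.*?\/[^\s"\']+/', $html, $full);

echo $lazy[0][0], "\n"; // http://rapidshare.com/files/326251387/
echo $full[0][0], "\n"; // http://rapidshare.com/files/326251387/file_name.rar
```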
A: 

To avoid the mess of escaping every slash in a URL, I would use a different delimiter for the regex, such as #. That also makes it easier to see that you have one \ too many before rapidshare.


Then, you could have something that looks like this:
(Based on yours; I only changed the end, because it wasn't returning the file's name. You may want to adapt it further to exclude characters other than whitespace, such as ".)

$str = 'blah http://rapidshare.com/files/326251387/file_name.rar blah';
if(preg_match_all('#http://rapidshare\.com/files/(.*?)/([^\s]+)#', $str, $m)) {
    var_dump($m);
}


Which, here, gives you:

array
  0 => 
    array
      0 => string 'http://rapidshare.com/files/326251387/file_name.rar' (length=51)
  1 => 
    array
      0 => string '326251387' (length=9)
  2 => 
    array
      0 => string 'file_name.rar' (length=13)
Pascal MARTIN
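
One possible way to adapt the # delimited pattern so that it also stops at quotes and angle brackets is sketched below. The assumptions here are mine, not the answerers': the file ID is taken to be all digits, and the second filename (`other.rar`) and the surrounding markup are invented for illustration.

```php
<?php
// Hypothetical input mixing plain text and an href attribute.
$str = 'see http://rapidshare.com/files/326251387/file_name.rar or '
     . '<a href="http://rapidshare.com/files/328807106/other.rar">this</a>';

// "#" as the delimiter means the slashes need no escaping.
// Group 1: the ID (assumed numeric). Group 2: the filename, which ends
// at whitespace, a quote character, "<" or ">".
$pattern = '#http://rapidshare\.com/files/(\d+)/([^\s"\'<>]+)#';

preg_match_all($pattern, $str, $m);

print_r($m[0]); // the full URLs
print_r($m[2]); // the filenames only
```

As with the examples above, `$m[0]` holds the entire matches, which is what the question asks for; the capture groups are optional extras.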