Hi guys, I've just run into a little bit of trouble with some PHP on my latest project. Basically I have a block of text ($text) and I would like to search through that text and return all of the MP3 links. I know it has something to do with regular expressions but I just cannot get it working.

Here's my current code:

    if(preg_match_all(".mp3", $text, $matches, PREG_SET_ORDER)) {
        foreach($matches as $match) {
            echo $match[2];
            echo $text;
        }
    }
+2  A: 

Once again, regex is extremely poor at parsing HTML. Use a proper HTML parser to scrape information out of a web page.

For example, use DOMDocument::loadHTML() to parse the HTML content, then getElementsByTagName('a') to get a list of links in the page. For each link, call getAttribute('href') to see where it points.
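A minimal sketch of those steps, assuming the text is in `$text` as in the question. The function name `extractMp3Links()` and the sample markup are made up for illustration, and the `.mp3` check on the URL path is only a naive filter, not a reliable test of the file type.

```php
<?php
// Hypothetical helper: return the href of every <a> whose URL path ends in ".mp3".
function extractMp3Links(string $html): array
{
    if ($html === '') {
        return []; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect parse warnings internally instead of printing them;
    // real-world markup is often malformed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        // Assumption: look only at the path, so a query string does not hide the extension.
        $path = parse_url($href, PHP_URL_PATH);
        if (is_string($path) && preg_match('/\.mp3$/i', $path)) {
            $links[] = $href;
        }
    }
    return $links;
}

// Sample input, invented for this example.
$text = '<p><a href="http://example.com/a.mp3">A</a> '
      . '<a href="/page.html">B</a> '
      . '<a href="b.MP3?x=1">C</a></p>';

$mp3Links = extractMp3Links($text);
print_r($mp3Links);
```

Filtering on the path rather than the whole href means a link like `b.MP3?x=1` is still picked up, while `/page.html` is skipped.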

Note however that there is absolutely no guarantee that MP3 files will always and only be stored under filenames ending in .mp3. On the web, the type of a resource does not have to come from a file extension. The only way to find out for sure what type of file a URL points to is to go ahead and fetch it (with an HTTP GET or HEAD request).
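As a rough sketch of the HEAD-request idea, the code below asks the server for headers only and inspects `Content-Type`. The names `headersIndicateMp3()` and `urlIsMp3()` are hypothetical, and treating only `audio/mpeg` as MP3 is an assumption: some servers send other values for MP3 files, so adjust the comparison to taste.

```php
<?php
// Hypothetical helper: decide from a header array (as returned by
// get_headers() in associative mode) whether the resource is MP3 audio.
function headersIndicateMp3(array $headers): bool
{
    // Header names are case-insensitive, so normalise the keys.
    $headers = array_change_key_case($headers, CASE_LOWER);
    if (!isset($headers['content-type'])) {
        return false;
    }

    $type = $headers['content-type'];
    // After redirects there may be several values; the last one
    // belongs to the final response.
    if (is_array($type)) {
        $type = end($type);
    }

    // Drop any parameters such as "; charset=...".
    $type = strtolower(trim(explode(';', $type)[0]));

    // Assumption: only the standard audio/mpeg type counts as MP3.
    return $type === 'audio/mpeg';
}

// Hypothetical helper: issue a HEAD request and check the reported type.
function urlIsMp3(string $url): bool
{
    $context = stream_context_create(['http' => ['method' => 'HEAD']]);
    $headers = @get_headers($url, true, $context);
    return is_array($headers) && headersIndicateMp3($headers);
}
```

Relative links extracted from the text would need to be resolved against a base URL before being passed to `urlIsMp3()`.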

bobince
Thanks a lot for your response. The only problem is that I am trying to extract the links from a string, not a web page -- can I still use an HTML parser?
Shola
Yes, that's what `loadHTML()` does. It doesn't care whether the string was fetched from the network or not.
bobince