Here is a regex I got from a blog I can't link to because I am new; just Google "amazon short url" and click on the blog post by Noah Coad.

As that page explains, it is supposed to extract the unique product ID from any Amazon URL, so you can shorten the URL or use the ID to pull info from the Amazon APIs.

Here is the sample code I am trying to get working:

<?php
$example_url = 'http://www.amazon.com/dp/1430219483/?tag=codinghorror-20';    

$reg = '(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)';

echo 'test<br/>';

echo preg_match($reg,$example_url);
?>

And here is my output:

test

Warning: preg_match() [function.preg-match]: Unknown modifier '(' in /Users/apple/Sites/amazon/asin_extract.php on line 14

Thanks so much! This is my first time posting on this site, where I have already found countless answers.

On second thought, I take back some of my thanks after this painful first submission: I had to trim this question because the site thinks my regex patterns are URLs.

+2  A: 

Your regex probably needs delimiters: a character that is present at both the beginning and the end of the pattern.
There is an interesting comment about this in the PHP manual :-)

'/' is often used, but some people prefer '#', which is nice for URLs because the slashes in the pattern then don't need escaping.

So :

$reg = '#(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)#';

And, with the full code, a bit modified to capture the results :

$example_url = 'http://www.amazon.com/Professional-Visual-Studio-System-Programmer/dp/0764584367/ref=sr_1_1/104-4732806-7470339?ie=UTF8&s=books&qid=1179873697&sr=8-1';
$reg = '#(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)#';
echo 'test<br/>';

$matches = array();
echo preg_match($reg,$example_url, $matches);

var_dump($matches);

The output you get from the var_dump is :

array
  0 => string 'http://www.amazon.com/Professional-Visual-Studio-System-Programmer/dp/0764584367/ref=sr_1_1/104-4732806-7470339?ie=UTF8&s=books&qid=1179873697&sr=8-1' (length=149)
  1 => string '0764584367' (length=10)

And $matches[1] is 0764584367.
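If you need this in more than one place, the same pattern can be wrapped in a small function. This is only a sketch: the name extract_asin() is made up for illustration, and it assumes that returning null is an acceptable way to signal "no match".

```php
<?php
// Hypothetical helper (not from the original post): returns the product ID
// captured by the pattern above, or null when the URL does not match.
function extract_asin($url)
{
    $reg = '#(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)#';

    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    if (preg_match($reg, $url, $matches) === 1) {
        return $matches[1]; // first capture group: the product ID
    }
    return null;
}

var_dump(extract_asin('http://www.amazon.com/dp/1430219483/?tag=codinghorror-20'));
var_dump(extract_asin('http://example.com/dp/1430219483/'));
```

The first call should give the ID from the question's URL, and the second should give null because the host is not amazon.com.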

Pascal MARTIN
Thanks for an awesome, simple and elegant explanation, and even more so for going above and beyond. This is a great community and you are a shining example of it.
jkatzer
You're welcome :-) Have fun !
Pascal MARTIN
A: 

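Looks like the problem is that PHP is treating the opening parenthesis as your regular expression delimiter: the first character of the pattern is always taken as the delimiter, so the first group ends up being read as the whole pattern and everything after its closing parenthesis is read as modifiers. That is where "Unknown modifier '('" comes from. Here's a sample from the man page: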

$pattern = '/^def/';

If you use a slash as your delimiter, the regular expression will be rough to write, because every slash in the URL has to be escaped. I suggest using the pound sign ('#') as the delimiter instead, as you'll have to escape fewer characters.
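To make the difference concrete, here is a quick comparison. The two patterns are deliberately simplified ones written for this illustration, not the full pattern from the question:

```php
<?php
$url = 'http://www.amazon.com/dp/1430219483/';

// With '/' as the delimiter, every literal slash in the pattern must be escaped.
$with_slash = '/amazon\.com\/dp\/([^\/]+)/';

// With '#' as the delimiter, the slashes can stay as they are.
$with_hash = '#amazon\.com/dp/([^/]+)#';

preg_match($with_slash, $url, $a);
preg_match($with_hash, $url, $b);

// Both patterns describe the same match, so the captures are identical.
var_dump($a[1] === $b[1]);
```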

Here's what I ended up with:

<?php

$example_url = 'http://www.amazon.com/Server-Side-Programming-Techniques-Performance-Scalability/dp/0201704293';

$reg = '#(?:http://(?:www\.){0,1}amazon\.com(?:/.*){0,1}(?:/dp/|/gp/product/))(.*?)(?:/.*|$)#';

echo 'test<br/>';

echo preg_match($reg, $example_url);

?>
Epsilon Prime
Thanks for an awesome answer as well. Consider yourself included in the comment I wrote above on the first answer.
jkatzer
A: 

Some of the links on Amazon are relative URIs (path only), not complete URLs. In such cases the regex pattern above does not match.

The regex works for URLs like this:

$example_url = 'http://www.amazon.com/Server-Side-Programming-Techniques-Performance-Scalability/dp/0201704293';

But for a relative URI like this:

$example_url = '/Server-Side-Programming-Techniques-Performance-Scalability/dp/0201704293';

it does not match. Can someone help me, please?

Thanks

Chris
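One possible way to handle both forms is to make the scheme-and-host part of the pattern optional and leave the rest unchanged. This is only a sketch: the function name extract_asin_any() is made up for illustration, and it assumes you already know the link came from an Amazon page, because once the host is optional the pattern no longer checks which site the URL points to.

```php
<?php
// Hypothetical helper: like the pattern above, but the "http://www.amazon.com"
// prefix is wrapped in an optional group so path-only URIs also match.
// Assumption: the caller already knows the link is an Amazon link.
function extract_asin_any($url)
{
    $reg = '#(?:http://(?:www\.){0,1}amazon\.com){0,1}(?:/.*){0,1}(?:/dp/|/gp/product/)(.*?)(?:/.*|$)#';

    if (preg_match($reg, $url, $matches) === 1) {
        return $matches[1]; // first capture group: the product ID
    }
    return null;
}

var_dump(extract_asin_any('http://www.amazon.com/Server-Side-Programming-Techniques-Performance-Scalability/dp/0201704293'));
var_dump(extract_asin_any('/Server-Side-Programming-Techniques-Performance-Scalability/dp/0201704293'));
```

Both calls should give 0201704293.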