Hi, I need a preg_match() pattern or something similar to extract JPG, PNG, or GIF URLs from mixed text and put them in an array, or at least store the first URL.

Maybe a pattern that searches for strings beginning with http and ending with jpg/png/gif.

I believe it can be done with preg_match().

Note: the text can look like this: blablablabla"http://www.xxx.com/xxx.jpg"blablablabla

Thank you

+4  A: 
$matches = array();
preg_match_all('!http://.+\.(?:jpe?g|png|gif)!Ui' , $string , $matches);
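A minimal usage sketch, assuming the sample text from the question is stored in `$string` (the names `$urls` and `$firstUrl` are only illustrative). It shows where the matched URLs end up and how the first one could be picked out:

```php
<?php
// Sample input taken from the question.
$string = 'blablablabla"http://www.xxx.com/xxx.jpg"blablablabla';

$matches = array();
preg_match_all('!http://.+\.(?:jpe?g|png|gif)!Ui', $string, $matches);

// $matches[0] holds every full match; the first URL, if any, is $matches[0][0].
$urls = $matches[0];
$firstUrl = isset($urls[0]) ? $urls[0] : null;

print_r($urls);
var_dump($firstUrl);
```

If nothing matches, `$urls` is an empty array and `$firstUrl` stays `null`.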
OcuS
I think you need to make it non-greedy: '!http://.+?\.(?:jpe?g|png|gif)!i' (notice the question mark after the +); otherwise it will match from the first http:// through to the last matching extension.
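A small sketch of the difference, using a made-up `$string` with two image URLs (the input and the variable names `$greedy` and `$lazy` are assumptions for illustration only):

```php
<?php
// Hypothetical input containing two separate image URLs.
$string = 'a "http://www.xxx.com/one.jpg" b "http://www.xxx.com/two.png" c';

// Greedy: .+ runs on to the last matching extension, so both URLs end up in one match.
preg_match_all('!http://.+\.(?:jpe?g|png|gif)!i', $string, $greedy);

// Lazy: .+? stops at the first matching extension, giving one match per URL.
preg_match_all('!http://.+?\.(?:jpe?g|png|gif)!i', $string, $lazy);

print_r($greedy[0]);
print_r($lazy[0]);
```

The `U` modifier has the same effect as writing `.+?`, since it inverts the greediness of the quantifiers in the pattern.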
jeroen
Hi, thanks so much. It gives an empty array, any ideas? :)
David
@jeroen: you're right about the greedy thing. I fixed it. @Tom: any example of $string that gives an empty array? (I'd like to fix whatever has to be fixed!)
OcuS
Here's an example: blablablabla"http://www.xxx.com/xxx.jpg"blablablabla
David
This works perfectly for me...
OcuS
Okidoki, I've deleted my answer as yours should do it.
jeroen
+1 if you provide the link to the `preg_match_all` documentation. ;)
Felix Kling
http://php.net/preg_match_all :)
OcuS
+5  A: 

Please note the special cases where someone can fool your server by inserting fake matches.

For example:

http://www.myserver.com/virus.exe?fakeParam=.jpg

Or

http://www.myserver.com/virus.exe#fakeParam=.jpg

I've quickly modified the regex to avoid these cases, but I'm pretty sure there could be more (for example, inserting %00 in the file path, which cannot be easily handled by a regex).

$matches = array();
preg_match_all('!http://[^?#]+\.(?:jpe?g|png|gif)!Ui' , $string , $matches);

So, for security, always write the regex as restrictively as possible. For example, if you know the server, write it into the regex; or if you know that the path will always consist of letters, hyphens, dots, slashes and numbers, use an expression like:

$matches = array();
preg_match_all('!http://[a-z0-9\-\.\/]+\.(?:jpe?g|png|gif)!Ui' , $string , $matches);

This should avoid any funny surprise in the future.
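As a rough check of the idea, here is a sketch that runs the original loose pattern and the restrictive one over a made-up `$string` containing the two fake URLs above plus one genuine image URL (the third URL and the names `$loose` and `$strict` are assumptions for illustration):

```php
<?php
// Hypothetical input: two disguised .exe URLs and one genuine image URL.
$string = 'x http://www.myserver.com/virus.exe?fakeParam=.jpg y '
        . 'http://www.myserver.com/virus.exe#fakeParam=.jpg z '
        . 'http://www.myserver.com/photo.png';

// Loose pattern: any characters may appear between http:// and the extension.
preg_match_all('!http://.+\.(?:jpe?g|png|gif)!Ui', $string, $loose);

// Restrictive pattern: only letters, digits, hyphens, dots and slashes are allowed.
preg_match_all('!http://[a-z0-9\-\.\/]+\.(?:jpe?g|png|gif)!Ui', $string, $strict);

print_r($loose[0]);
print_r($strict[0]);
```

The loose pattern accepts all three URLs, while the restrictive one keeps only the genuine image URL, because `?` and `#` are not in the allowed character set.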

Pablo López Torres
Thanks, I always check the MIME type after getting the picture URL :D
David
The MIME type is a header that can be modified. If you want to try this, install a Firefox plugin called TamperData. You can see the value of that header in the requests, and that it can be changed. Checking it is good practice, but you also need to ensure that the final filename ends with .jpg or the right extension.
Pablo López Torres
Can't you just use exif_imagetype() to check for a valid image?
jeroen
I'm afraid not. You can take a random PNG file and append `<?php phpinfo(); ?>` to the end of it. Then run the script `<?php print_r(exif_imagetype('mymaliciousfile.png')); ?>` and you will see that it returns 3, the value of IMAGETYPE_PNG. If you verify that the extension of the file you're storing on your system is PNG, then you're more or less safe. But if not, someone could plant a backdoor on your server if that file is accessible from the web server.
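One possible way to do the extension check, sketched under the assumption that the stored filename is taken from the URL path. `isAllowedImageUrl()` is a hypothetical helper name, not something from PHP itself; it relies only on the built-in `parse_url()` and `pathinfo()`:

```php
<?php
// Hypothetical helper: checks the extension of the URL path only,
// ignoring anything in the query string or fragment.
function isAllowedImageUrl($url)
{
    $allowed = array('jpg', 'jpeg', 'png', 'gif');

    // PHP_URL_PATH drops the ?query and #fragment parts of the URL.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false; // malformed URL or no path at all
    }

    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($ext, $allowed, true);
}

var_dump(isAllowedImageUrl('http://www.xxx.com/xxx.jpg'));
var_dump(isAllowedImageUrl('http://www.myserver.com/virus.exe?fakeParam=.jpg'));
```

This only looks at the name, so it complements the content checks discussed above rather than replacing them.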
Pablo López Torres
A: 

It's really very nice code, but one thing: when the URL is https, ftp, or ftps, how do I find the image URL in a string like $string = "The text you want to filter goes here. https://www.vindesh.com/expert.gif may i call you later coz i am going to wory";

Could anyone please provide code for this case?
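A possible sketch for this case, assuming the restrictive pattern from the answer above is used as a base: replacing the literal `http` with the group `(?:https?|ftps?)` lets the same expression accept http, https, ftp and ftps. This is one way to extend it, not necessarily what the original authors would have written:

```php
<?php
$string = "The text you want to filter goes here. https://www.vindesh.com/expert.gif may i call you later coz i am going to wory";

$matches = array();
// (?:https?|ftps?) accepts the schemes http, https, ftp and ftps.
preg_match_all('!(?:https?|ftps?)://[a-z0-9\-\.\/]+\.(?:jpe?g|png|gif)!Ui', $string, $matches);

print_r($matches[0]);
```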

Vindesh Mohariya
