Hi

I'm keeping myself busy working on an app that gets a feed from the Twitter search API, extracts all the URLs from each status in the feed, and finally, since lots of the URLs are shortened, checks the response header of each URL to get the real URL it leads to. For a feed of 100 entries this process can take more than a minute! (I'm still working locally on my PC.) I initialize the cURL resource once per feed and keep it open until I've finished all the URL expansions. This helped a bit, but I'm still wary that I'll be in trouble when going live.

Any ideas how to speed things up?

+1  A: 

You may be able to get significantly increased performance by making your application multithreaded. Multi-threading is not supported directly by PHP per se, but you may be able to launch several PHP processes, each working on a concurrent processing job.

Asaph
Thanks Asaph :)
Yaniv
+2  A: 

The issue is, as Asaph points out, that you're doing this in a single-threaded process, so all of the network latency is being serialized.

Does this all have to happen inside an http request, or can you queue URLs somewhere, and have some background process chew through them?

If you can do the latter, that's the way to go.

If you must do the former, you can do the same sort of thing: hand the URLs off to parallel workers from within the request and wait for them to finish.

Either way, you want to look at a way to chew through the requests in parallel. You could write a command-line PHP script that forks to accomplish this, though you might be better off writing such a beast in a language that supports threading, such as Ruby or Python.
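
If you would rather stay inside a single PHP process, one alternative to forking (a different technique, named plainly here) is cURL's multi interface, which lets the network waits overlap instead of running one after another. Below is a minimal sketch, not a definitive implementation. The function name `expand_urls` and the timeout and redirect limits are assumptions for illustration, and it assumes the shorteners answer HEAD requests, which not every server does.

```php
<?php
// Hypothetical helper: resolve many shortened URLs concurrently with curl_multi.
// Returns an array mapping each input URL to the URL it finally redirects to.
function expand_urls(array $urls, int $timeout = 10): array
{
    $multi = curl_multi_init();
    $handles = [];

    foreach (array_unique($urls) as $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_NOBODY         => true,  // HEAD request: fetch headers only
            CURLOPT_FOLLOWLOCATION => true,  // follow redirects to the final URL
            CURLOPT_MAXREDIRS      => 5,     // assumed limit
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT        => $timeout,
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none are still running.
    $running = 0;
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // wait for activity instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    $expanded = [];
    foreach ($handles as $url => $ch) {
        // An HTTP code of 0 means no response arrived; keep the original URL then.
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $expanded[$url] = $code > 0 ? curl_getinfo($ch, CURLINFO_EFFECTIVE_URL) : $url;
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $expanded;
}
```

With something like this, the total time for a feed should be closer to the slowest single lookup than to the sum of all of them, though for large feeds you would probably want to cap how many handles are active at once.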

timdev
tim - thank you for the detailed answer. Since I don't know Ruby or Python, I will have to find a PHP solution. I guess a queue is the best option, though it might slow down the "real time" I was looking for :)
Yaniv
You can certainly make it happen with a command-line script that forks several children to run requests in parallel. You can probably figure out a nice way to get updates from those processes so you can give feedback to the user by having some AJAX-driven bit on your web-bound script that polls back to the server.
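
A rough sketch of the forking approach, assuming a CLI script with the pcntl extension available. The names `partition_urls` and `run_in_children` and the `$job` callback are hypothetical; how each child reports its results (a file, a database row, a queue) is left to the callback.

```php
<?php
// Hypothetical helper: split the URL list into roughly equal chunks, one per worker.
function partition_urls(array $urls, int $workers): array
{
    $workers = max(1, $workers);
    $size = max(1, (int) ceil(count($urls) / $workers));
    return array_chunk($urls, $size);
}

// Hypothetical helper: fork one child per chunk and wait for all of them.
// $job is called once per URL inside the child. Returns the number of children forked.
function run_in_children(array $urls, int $workers, callable $job): int
{
    $pids = [];
    foreach (partition_urls($urls, $workers) as $chunk) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            // Child process: work through its own chunk, then exit.
            foreach ($chunk as $url) {
                $job($url);
            }
            exit(0);
        }
        $pids[] = $pid; // Parent process: remember the child and keep forking.
    }

    // Parent blocks here until every child has finished.
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status);
    }
    return count($pids);
}
```

The web-facing script could then poll whatever store the children write to, which is where the AJAX feedback would come from.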
timdev