tags:
views: 320
answers: 7

Hello all,

I'm trying to check whether a web site is up and running. I'm currently doing this with the UserAgent library in Perl with a timeout of 1 second. However, it is still too slow for me.

I call the script every five minutes from cron. There are lots of links to check, and the script takes more than five minutes to complete. So I need a more efficient way to do this; even a solution in C would be fine.

+4  A: 

curl -I http://hostname

The first line will contain 503 or 404 if the service is not available or the page is not found.

time yields this for curl -I http://www.google.com

real    0m0.125s
user    0m0.004s
sys     0m0.004s

and this for curl -I http://www.google.cmo (a deliberately misspelled, nonexistent host)

real    0m0.120s
user    0m0.004s
sys     0m0.004s
Rich Bradshaw
Actually, I tried wget; it worked very fast for existing websites but had to wait on broken links. I'll try curl too.
systemsfault
+4  A: 

If there are lots of links, I suggest you make the program multi-threaded or fork() it a few times. That way, you can expect speed improvements.
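For example, here is a minimal sketch of the fork() approach, assuming the links live in an @urls array and that a HEAD request with a one-second timeout is enough to decide whether a site is up (both are assumptions, not details from the question):

    use strict;
    use warnings;
    use LWP::UserAgent;

    my @urls = ('http://example.com/', 'http://example.org/');   # hypothetical list of links

    my %child;                                    # pid => url
    for my $url (@urls) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                          # child: check one URL and exit
            my $res = LWP::UserAgent->new(timeout => 1)->head($url);
            exit($res->is_success ? 0 : 1);
        }
        $child{$pid} = $url;                      # parent: remember which child checks what
    }

    while (%child) {                              # reap the children and report their results
        my $pid = wait();
        last if $pid == -1;
        printf "%s is %s\n", delete $child{$pid}, ($? >> 8) == 0 ? 'up' : 'down';
    }

Note that this forks one child per link; for a long list you would want to cap how many run at once, which is what the Parallel::ForkManager answer further down does.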

Alan Haggai Alavi
Unfortunately my Perl wasn't compiled with thread support, but I'll check out the multiprocess option.
systemsfault
You can use threads even without having a threaded Perl: `use forks;` - http://search.cpan.org/perldoc?forks
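For instance, a minimal sketch of the forks drop-in (the URL list and the 1-second timeout are made up for illustration):

    use forks;                      # same API as threads, but implemented with fork()
    use LWP::UserAgent;

    my @urls = ('http://example.com/', 'http://example.org/');   # hypothetical list

    my @threads = map {
        my $url = $_;
        threads->create(sub {
            LWP::UserAgent->new(timeout => 1)->head($url)->is_success ? 1 : 0;
        });
    } @urls;

    for my $i (0 .. $#threads) {
        printf "%s is %s\n", $urls[$i], $threads[$i]->join ? 'up' : 'down';
    }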
Alan Haggai Alavi
Oh ok, this knowledge will help for other things too. Thanx Alan. But will it work when writing to a file?
systemsfault
No problem, holydiver. Yes, it should work when writing to a file.
Alan Haggai Alavi
+4  A: 

How about using httping?

Grzegorz Oledzki
+3  A: 

Fetching resources from the network usually involves quite a bit of latency.

As Alan Haggai Alavi suggested, you will probably want to divide the work among several parallel threads or processes. The documentation for the Parallel::ForkManager module even has an example that you should be able to build upon.
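Something along these lines, following the module's documented pattern; the @urls list, the limit of 20 parallel checks, and the 1-second timeout are assumptions:

    use strict;
    use warnings;
    use LWP::UserAgent;
    use Parallel::ForkManager;

    my @urls = ('http://example.com/', 'http://example.org/');   # hypothetical list
    my $pm   = Parallel::ForkManager->new(20);                    # at most 20 checks in flight

    for my $url (@urls) {
        $pm->start and next;          # parent moves on to the next URL, child falls through
        my $ok = LWP::UserAgent->new(timeout => 1)->head($url)->is_success;
        print "$url is ", ($ok ? 'up' : 'down'), "\n";
        $pm->finish;                  # child exits here
    }
    $pm->wait_all_children;

If you need to collect the results in one place (for example to write them to a single file), the module's run_on_finish callback should let the parent receive data from each child, so that only the parent ever touches the file.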

hillu
Parallel::ForkManager is nice.
Alan Haggai Alavi
Yeah hillu, but I think I can't use the parallelization option, because when I check whether a website is up I write the URL of a specific PNG file into a file. So if I parallelize the program, won't writing to that file be a problem?
systemsfault
As I understand hillu's suggestion you should have multiple processes/threads, but one site should be monitored by one process (at most). So there should be no conflicts between processes.
Grzegorz Oledzki
Why don't you make one thread that has control over the file and an event that gets triggered by the other threads that are checking the sites? When triggering it, they could pass along the site, the latency, the response, etc.
borisCallens
+7  A: 

It is most probably slow because you're doing the checks sequentially.

Consider using LWP::Parallel::UserAgent - it will run many requests at the same time.
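A minimal sketch based on the module's synopsis; the URL list, the use of HEAD requests, and the 2-second wait are assumptions:

    use strict;
    use warnings;
    use LWP::Parallel::UserAgent;
    use HTTP::Request;

    my @urls = ('http://example.com/', 'http://example.org/');   # hypothetical list

    my $pua = LWP::Parallel::UserAgent->new();
    $pua->timeout(1);                                  # per-connection timeout, inherited from LWP::UserAgent
    $pua->register(HTTP::Request->new(HEAD => $_)) for @urls;

    my $entries = $pua->wait(2);                       # run all requests, wait at most 2 seconds
    for my $entry (values %$entries) {
        my $res = $entry->response;
        printf "%s => %s\n", $res->request->url, $res->code;
    }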

depesz
+7  A: 

A few ways to speed it up:

  1. Just check whether you can open a socket to port 80 on the target server instead of sending a real GET request, or send a simple HEAD request (see the sketch below).
  2. Use multiple threads to run the checks in parallel.
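A minimal sketch of the socket-only check in Perl, assuming a one-second timeout is acceptable; keep in mind that an open port 80 only tells you something is listening, not that the page itself is served correctly:

    use strict;
    use warnings;
    use IO::Socket::INET;

    # hypothetical helper: true if the host accepts TCP connections on port 80
    sub port_80_open {
        my ($host) = @_;
        my $sock = IO::Socket::INET->new(
            PeerAddr => $host,
            PeerPort => 80,
            Proto    => 'tcp',
            Timeout  => 1,              # give up after one second
        );
        return defined $sock;
    }

    print port_80_open('example.com') ? "listening\n" : "not reachable\n";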
arsane
Would this cover all the desired bases? What if there's a server error on the main page or something?
borisCallens
Going multi-threaded, while it looks nice, is unnecessary overkill. Usually asynchronous I/O is sufficient.
depesz
A: 

I don't know a whole lot of C (BLASPHEMY!) or Perl, but as I see it I would try the following (a sketch follows the list):

  • One thread to do the file writing. This thread would have a queue that the other threads write their commands into.
  • One thread per site you want to check. Each thread would use whichever method from the other answers suits you best and then report back to the writer thread through an event it can trigger.
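A rough sketch of that design in Perl, using threads and Thread::Queue; the @urls list, the output file name, and the 1-second timeout are assumptions, and instead of an explicit event the workers simply push their results onto the writer's queue:

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;
    use LWP::UserAgent;

    my @urls  = ('http://example.com/', 'http://example.org/');  # hypothetical list
    my $queue = Thread::Queue->new();

    # the only thread that touches the file: it drains the queue until it sees undef
    my $writer = threads->create(sub {
        open my $fh, '>>', 'status.log' or die "cannot open status.log: $!";
        while (defined(my $line = $queue->dequeue())) {
            print {$fh} $line, "\n";
        }
        close $fh;
    });

    # one worker per site; each reports its result through the queue
    my @workers = map {
        my $url = $_;
        threads->create(sub {
            my $res = LWP::UserAgent->new(timeout => 1)->head($url);
            $queue->enqueue(sprintf "%s %s", $url, $res->is_success ? 'up' : 'down');
        });
    } @urls;

    $_->join for @workers;
    $queue->enqueue(undef);     # tell the writer there is no more work
    $writer->join;

If your Perl is not threaded, the forks module mentioned earlier should let a script like this run with processes instead.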

2cts

borisCallens