Good day,

I have a simple MySQL database with 1 table and 3 fields

Table: LINKS
Fields: ID URL STATUS

The table has about 3 million links.
I would like to check all the URLs and store the returned status in the STATUS field so that I can remove the dead links later.

This would probably require a shell script because it will need to run for a long time.
I think checking the response headers with cURL may be the best way to get the status code, but I don't know how to put this all together. Any help with the above, or a suggestion for a better way to handle this, would be greatly appreciated.

Thank you.

A: 

I would rather do this in batches of, say, a thousand, and instead of doing it in bash, I'd do it in PHP or Perl (or any other scripting language of your choice, e.g. Python).
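
A minimal Python sketch of the batching idea, under stated assumptions: ID is a unique, indexed, positive integer column, so each batch can resume after the last ID seen, and `conn` is any DB-API connection whose cursors return plain tuples. The helper names `iter_batches` and `save_statuses` and the `placeholder` parameter are illustrative and not part of the original post. Most MySQL drivers use `%s` as the parameter marker, while the standard-library sqlite3 module uses `?`.

```python
def iter_batches(conn, batch_size=1000, placeholder="%s"):
    """Yield lists of (ID, URL) rows, batch_size rows at a time, in ID order."""
    sql = (
        "SELECT ID, URL FROM LINKS "
        f"WHERE ID > {placeholder} ORDER BY ID LIMIT {placeholder}"
    )
    last_id = 0  # assumption: IDs are positive integers
    while True:
        cur = conn.cursor()
        cur.execute(sql, (last_id, batch_size))
        rows = cur.fetchall()
        cur.close()
        if not rows:
            return
        yield rows
        # Resume the next batch after the highest ID seen so far.
        last_id = rows[-1][0]


def save_statuses(conn, results, placeholder="%s"):
    """Write (ID, status) pairs back to the STATUS column and commit."""
    sql = f"UPDATE LINKS SET STATUS = {placeholder} WHERE ID = {placeholder}"
    cur = conn.cursor()
    cur.executemany(sql, [(status, link_id) for link_id, status in results])
    cur.close()
    conn.commit()
```

A full run would loop over `iter_batches(conn)`, check each URL in the batch, and pass the resulting `(ID, status)` pairs to `save_statuses`. Committing once per batch means an interrupted run loses at most one batch of work.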

PHP has fopen, which can do the job of cURL, so you don't have to spawn a separate process for each link check. MySQL connectivity is almost native in both PHP and Perl, too.
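
The same in-process check can be sketched in Python with only the standard library. This is a sketch rather than a tested crawler: the function name `check_status`, the 10-second default timeout, and the convention of returning 0 when no response arrives are assumptions made for illustration. It sends a HEAD request so that only the headers are transferred; some servers reject HEAD, so a fallback to GET may be needed in practice.

```python
import http.client
import urllib.error
import urllib.request


def check_status(url, timeout=10):
    """Return the HTTP status code for url, or 0 if no response was received."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status  # final status, after any redirects
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses are raised as HTTPError
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return 0
```

Storing 0 for unreachable hosts keeps them distinguishable from links that answered with an error code such as 404, which may matter when deciding which rows to delete later.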

mindas
Thank you for your suggestion, mindas. I'll try to see if I can figure this out on my own, but if you do have any code for this, or could write it quickly, it would be much appreciated. Thanks.
Yallaa