Hey guys,

I have a problem building a web spider in PHP that can crawl hundreds of websites. I have tried several approaches: one with the Snoopy browser class, one with Simple HTML DOM Parser, and one with the sfWebBrowserPlugin for Symfony. I run into the same problem with all of them.

My crawler crawls a site in three stages: category, detail, and website redirect. If I try to crawl too many categories, it just fails without an error, and I get a file to download that contains the HTML of the 200 OK page. If I only crawl one or two categories, everything works flawlessly. It is not caused by an exception in one particular category, since I crawled every category on its own and each one works fine.

I'm thankful for every hint. I think it could be due to some OOP design mistakes on my part, but I don't think PHP fails because I instantiate too many objects. My other idea is an Apache configuration issue, but I don't know much about Apache. :(

Thanks in advance for the help.

Sorry, I can't link the classes, since I can only post one link.

Edit: I found the error. It was a missing UTF-8 encoding in the template.
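
For readers who land here with the same symptom, this is a minimal sketch of declaring UTF-8 in a PHP-rendered page. It assumes a plain PHP template, not the actual Symfony template from this post, which was never shown. The function name `renderPage` and the sample strings are hypothetical. The exact reason the missing encoding produced a download prompt is not known, so treat this as an illustration of the fix, not a diagnosis.

```php
<?php
// Hypothetical helper: wraps already-built body HTML in a page that
// declares UTF-8 in the markup itself.
function renderPage(string $title, string $bodyHtml): string
{
    // Escape the title as UTF-8; ENT_SUBSTITUTE replaces invalid byte
    // sequences instead of returning an empty string.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . "<html>\n<head>\n"
        . "<meta charset=\"utf-8\">\n"
        . "<title>{$safeTitle}</title>\n"
        . "</head>\n<body>\n{$bodyHtml}\n</body>\n</html>\n";
}

// When served over HTTP, also declare the charset in the response header.
// Skipped on the command line, where there is no HTTP response.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

echo renderPage('Crawl results', '<p>Kategorien: Möbel, Küche</p>');
```

Declaring the charset in both the `Content-Type` header and the `<meta>` tag keeps the two in agreement, so crawled text with non-ASCII characters is interpreted the same way by the server response and the document.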