views: 80
answers: 1
I have implemented a multithreaded crawler, and it fetches a list of URLs concurrently without any issues; I tested each step and had the program write all the HTML it pulled to a text file. The rest of the program is meant to take each HTML string, parse it for the URLs on that page, and then write that list to a database. This is where the errors start: at first I locked the parsing step, because running it unlocked returned empty lists with the error 'property evaluation failed'. Now I am getting lists back, but I cannot write them to the database.

My question is: do I need to lock everything, and why? Can't I let all the threads parse at the same time and have each one write to a shared ArrayList? And will all this locking hurt performance?
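For reference, the shared-list approach asked about here would look roughly like this; this is a minimal sketch with illustrative names, and the point is that every thread must take the same lock before touching the list:

Imports System.Collections.Generic

Public Class LinkStore
    ' One list shared by all threads, plus a dedicated lock object.
    Private Shared ReadOnly allLinks As New List(Of String)()
    Private Shared ReadOnly listLock As New Object()

    Public Shared Sub AddLinks(ByVal links As IEnumerable(Of String))
        ' Every writer takes the same lock; unsynchronized concurrent
        ' Adds can corrupt the list's internal state.
        SyncLock listLock
            allLinks.AddRange(links)
        End SyncLock
    End Sub
End Class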

Here is a sample of my code; first, the call to parse a URL:

If Not String.IsNullOrEmpty(html) Then
    ' Get all links first
    links = parser.GetLinks(fromUrl, html)
End If

then the write to the database:

For Each link As String In links
    recordsAffected = _
        Links_DBObj.insert_feedurls_link(link, feedlink, execError, connObj_Generic, commObj_Generic)
Next
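One likely source of the clashes: the loop above hands the same connObj_Generic and commObj_Generic to every thread, and ADO.NET connection and command objects are not thread-safe. A minimal sketch of one fix, assuming a SQL Server backend and an illustrative table and column layout; each thread opens its own short-lived connection instead of sharing one:

Imports System.Data.SqlClient

Public Class LinkWriter
    ' Sketch only: each call creates its own connection and command, so
    ' threads never share ADO.NET objects. Names here are illustrative.
    Public Shared Function InsertLink(ByVal link As String, _
                                      ByVal feedLink As String, _
                                      ByVal connectionString As String) As Integer
        Using conn As New SqlConnection(connectionString)
            Using comm As New SqlCommand( _
                "INSERT INTO feedurls (link, feedlink) VALUES (@link, @feedlink)", conn)
                comm.Parameters.AddWithValue("@link", link)
                comm.Parameters.AddWithValue("@feedlink", feedLink)
                conn.Open()
                Return comm.ExecuteNonQuery()
            End Using
        End Using
    End Function
End Class

Connection pooling makes the per-call Open cheap, so this usually costs less than serializing every insert behind a single lock.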
Answer (+1):

Instead of using an ArrayList, I would use a Synchronized Queue. Each reading thread can Enqueue while each writing thread can Dequeue.
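A minimal sketch of that pattern, assuming the non-generic System.Collections.Queue wrapped by Queue.Synchronized (on .NET 4 and later, ConcurrentQueue(Of String) is the drop-in equivalent):

Imports System.Collections

Public Class LinkQueue
    ' One queue shared by all threads; Queue.Synchronized returns a
    ' thread-safe wrapper whose Enqueue/Dequeue lock internally.
    Private Shared ReadOnly pending As Queue = Queue.Synchronized(New Queue())

    Public Shared Sub Produce(ByVal link As String)
        pending.Enqueue(link)    ' called by the parsing threads
    End Sub

    Public Shared Function TryConsume(ByRef link As String) As Boolean
        ' Count followed by Dequeue is two operations even on the
        ' synchronized wrapper, so guard the pair as one unit.
        SyncLock pending.SyncRoot
            If pending.Count = 0 Then Return False
            link = CStr(pending.Dequeue())
            Return True
        End SyncLock
    End Function
End Class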

Jacob G
Thanks, I will try that.
vbNewbie
Quick question: is SyncLock effective at all? Even when using it, I found threads still clashing when trying to access the collection. Should I create a new queue in each function where I need file access?
vbNewbie
Everyone should access the same queue, I'd think:
Process 1: Instantiate the queue, spin up crawlers and processors.
Processes 2 to n: Crawl, acquire content, process, Enqueue.
Process n+1: Check the queue, Dequeue, save.
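Sketched with the queue type from the answer above and illustrative names, the dedicated writer (process n+1) is just a loop that drains the queue and saves each item:

Imports System.Threading

Public Class Writer
    ' Sketch of the dedicated writer (process n+1). A real version would
    ' flip keepRunning to False once the crawl threads finish.
    Public Shared keepRunning As Boolean = True

    Public Shared Sub WriterLoop()
        Dim link As String = Nothing
        While keepRunning
            If LinkQueue.TryConsume(link) Then
                SaveLink(link)       ' save one dequeued item
            Else
                Thread.Sleep(50)     ' queue empty; back off briefly
            End If
        End While
    End Sub

    Private Shared Sub SaveLink(ByVal link As String)
        ' Illustrative stub; the real version would do the database insert.
        Console.WriteLine("saved " & link)
    End Sub
End Class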
Jacob G