I am trying to create an app that runs multiple search requests while keeping the user interface responsive. Initially I had only one search request running, and I used a BackgroundWorker so the user could still interact with the UI. Now I need to extend this to allow more searches, basically just queueing them up. I am not sure whether to use multiple BackgroundWorkers or the ThreadPool, since I want to be able to know the progress of each search job at any time.

If I were using the ThreadPool, all I would do is add this in the loop that gets called each time a search request is made:

 ThreadPool.QueueUserWorkItem(AddressOf Search)

But if I use BackgroundWorkers, which is the only approach I know, I am not sure how to keep track of them other than perhaps adding them to an ArrayList, where I can call ReportProgress from each BackgroundWorker.

edit:

So, for example, this is my current code:

 For Each Thread In ThreadList
     'Thread.Sleep(500)
     SyncLock Me

         If searchChoice = "google" Or fromUrl.Contains("google") Then
             links = parsingUtilities.GetGoogleLinksFromHtml(fromUrl, html, searchItem)
             posts = parsingUtilities.GetPostLinksFromHtml(links)
             If links.Count = 0 Then
                 Exit Sub
             End If
             Exit For
 .....

In the code above, links and posts are ArrayLists that I use to collect the URLs I need, and they are shared across the different search choices. I initially had the SyncLock on links and posts, but someone else told me to use SyncLock Me instead. So, from your point of view, should I assign a separate data structure to each search and, after sufficient time, lock the appropriate one and transfer its contents to be written out? Thanks.

+1  A: 

I generally leave the thread pool alone. It's a process-wide resource that can be configured to different sizes, especially in web apps. However, since your application sounds like a client-side forms application, you won't be sharing the thread pool with any other apps, and you can configure it to your needs.

Your use case fits the thread pool better than most, but there's not a huge advantage to doing so.

marr75
Thanks. My app will later be migrated to a web app, so do you think it would then be better to use the ThreadPool? A little more info about my app: each BackgroundWorker or ThreadPool thread will perform a long search function that itself creates more threads to crawl web pages, and the main thread should ideally just be servicing the user interface.
vbNewbie
It still depends. BackgroundWorkers let the system manage your threading, which is safe from an operating system standpoint but risky from an "I expect certain performance characteristics from my app" perspective. To implement this functionality in a web app, you'd probably use individual asynchronous requests for each search, so the thread pool decision probably won't affect this much. As a compromise, you could manage your threads in some kind of collection, then check the number of threads you've added so far before adding a new one.
marr75
+1  A: 

You can still get progress updates with BackgroundWorkers, so I don't see a reason to stop using them.

BackgroundWorker itself uses the thread pool to recycle threads AFAIK. So you could probably still limit the size of the pool and keep using BWs.
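
As a rough illustration of keeping several BackgroundWorkers and still getting per-job progress, here is a minimal sketch. The form name `SearchForm`, the job names, the `ListBox` used as a log, and the `Thread.Sleep` standing in for real search work are all assumptions made for the example, not part of the original application.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.ComponentModel
Imports System.Threading
Imports System.Windows.Forms

Public Class SearchForm
    Inherits Form

    ' One worker per queued search, keyed by a hypothetical job name.
    Private ReadOnly workers As New Dictionary(Of String, BackgroundWorker)()
    Private ReadOnly log As New ListBox() With {.Dock = DockStyle.Fill}

    Public Sub New()
        Controls.Add(log)
    End Sub

    Protected Overrides Sub OnLoad(e As EventArgs)
        MyBase.OnLoad(e)
        StartSearch("google")
        StartSearch("bing")
    End Sub

    Private Sub StartSearch(jobName As String)
        Dim worker As New BackgroundWorker()
        worker.WorkerReportsProgress = True
        worker.WorkerSupportsCancellation = True
        AddHandler worker.DoWork, AddressOf RunSearch
        AddHandler worker.ProgressChanged, AddressOf SearchProgressChanged
        AddHandler worker.RunWorkerCompleted, AddressOf SearchCompleted
        workers.Add(jobName, worker)
        worker.RunWorkerAsync(jobName)
    End Sub

    ' Runs on a background thread; it must not touch controls directly.
    Private Sub RunSearch(sender As Object, e As DoWorkEventArgs)
        Dim worker = DirectCast(sender, BackgroundWorker)
        Dim jobName = DirectCast(e.Argument, String)
        For percent As Integer = 10 To 100 Step 10
            If worker.CancellationPending Then
                e.Cancel = True
                Return
            End If
            Thread.Sleep(200) ' placeholder for the real search work
            worker.ReportProgress(percent, jobName)
        Next
        e.Result = jobName
    End Sub

    ' Raised on the UI thread, so updating the ListBox here is safe.
    Private Sub SearchProgressChanged(sender As Object, e As ProgressChangedEventArgs)
        log.Items.Add(String.Format("{0}: {1}%", e.UserState, e.ProgressPercentage))
    End Sub

    Private Sub SearchCompleted(sender As Object, e As RunWorkerCompletedEventArgs)
        If e.Cancelled Then
            log.Items.Add("A search was cancelled.")
        ElseIf e.Error IsNot Nothing Then
            log.Items.Add("A search failed: " & e.Error.Message)
        Else
            log.Items.Add(CStr(e.Result) & " finished.")
        End If
    End Sub

    <STAThread()>
    Public Shared Sub Main()
        Application.Run(New SearchForm())
    End Sub
End Class
```

Because the workers live in the dictionary, stopping everything in case of a problem could be done by looping over `workers.Values` and calling `CancelAsync()` on each one.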

Assaf Lavie
+1  A: 

You can use the ThreadPool in the same way you use BackgroundWorker.

The only difference is that, with the ThreadPool, you'll need to use Dispatcher.Invoke or Control.Invoke to marshal your progress and completion events back onto the UI thread yourself. However, the ThreadPool lets you easily queue up and run as many tasks as you wish.
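
A minimal sketch of that marshaling for a Windows Forms app might look like the following. The form name `PoolSearchForm`, the job names, and the simulated work are assumptions for illustration. It uses `Control.BeginInvoke`, the non-blocking counterpart of `Control.Invoke`, so the pool thread does not wait on the UI.

```vbnet
Imports System
Imports System.Threading
Imports System.Windows.Forms

Public Class PoolSearchForm
    Inherits Form

    Private ReadOnly log As New ListBox() With {.Dock = DockStyle.Fill}

    Public Sub New()
        Controls.Add(log)
    End Sub

    Protected Overrides Sub OnLoad(e As EventArgs)
        MyBase.OnLoad(e)
        ' Queue one work item per (hypothetical) search job.
        For Each jobName As String In New String() {"google", "bing", "yahoo"}
            ThreadPool.QueueUserWorkItem(AddressOf Search, jobName)
        Next
    End Sub

    ' Runs on a thread-pool thread; it must not touch controls directly.
    Private Sub Search(state As Object)
        Dim jobName = CStr(state)
        For percent As Integer = 25 To 100 Step 25
            Thread.Sleep(200) ' placeholder for the real search work
            ReportProgress(jobName, percent)
        Next
    End Sub

    ' Re-posts itself onto the UI thread when called from a pool thread.
    Private Sub ReportProgress(jobName As String, percent As Integer)
        If log.InvokeRequired Then
            log.BeginInvoke(New Action(Of String, Integer)(AddressOf ReportProgress), jobName, percent)
        Else
            log.Items.Add(String.Format("{0}: {1}%", jobName, percent))
        End If
    End Sub

    <STAThread()>
    Public Shared Sub Main()
        Application.Run(New PoolSearchForm())
    End Sub
End Class
```

This sketch does not handle the form being closed while work items are still running; a real application would need to guard the `BeginInvoke` call or signal the work items to stop first.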

Reed Copsey
Do you have any links showing how the ThreadPool can be used like this to control the threads? All I need is to check each thread's progress, display it if necessary, and perhaps stop all threads in case of a problem.
vbNewbie
+1  A: 

I think that you should use the thread pool, because from what you've written you will have even more threads being spawned by the background threads.

In designs like this I have seen the system overloaded by thousands of threads. A thread pool is easier to manage because you have one place to set limits on threads. Some jobs may need to wait before they can get their work done but this is better than overloading the entire system.

Update:
I didn't know this, but it seems BackgroundWorker uses the ThreadPool so you are not in danger of an exploding number of threads. The system I saw exploding with thousands of threads was written in C++.

Zan Lynx
Thank you, sir. I have no problem using the ThreadPool to do the multiple searches. How would I report the progress of each thread to the UI so that the user knows how far along each job is?
vbNewbie
Thank you, sir. One more question, if you don't mind: within each search operation run by one background thread, I can either create individual threads or use the ThreadPool to perform the tasks within it. Each of these threads will have a bunch of URLs that it adds to an ArrayList, which is later dumped to a text file. I use SyncLock a lot now, since it prevents the contents from being changed concurrently, but it seems to be slow. Is there a better way to accomplish this? Perhaps a thread-safe data structure that all threads can simply add data to, so that I only need to lock the text file when I flush.
vbNewbie
@vbNewbie: The usual way to do scalable data collection is to have each thread lock and store into its own data structure (an ArrayList in your case) and to have another thread lock and read each thread's structure for writing. The reading thread should *only* do this at the end or on a timer, such as once every few seconds.
Zan Lynx
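
The per-thread collection idea described above could be sketched roughly as follows. The `UrlBuffer` class, the example URLs, and the output file name are hypothetical. Each worker locks only its own buffer, and the writer drains every buffer once at the end (a timer could call the same drain step periodically instead).

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Threading

Module PerThreadBuffers
    ' Hypothetical per-worker buffer: each crawler thread owns one and locks only it.
    Private Class UrlBuffer
        Private ReadOnly items As New List(Of String)()

        Public Sub Add(url As String)
            SyncLock items
                items.Add(url)
            End SyncLock
        End Sub

        ' Hands back everything collected so far and empties the buffer.
        Public Function Drain() As List(Of String)
            SyncLock items
                Dim copy As New List(Of String)(items)
                items.Clear()
                Return copy
            End SyncLock
        End Function
    End Class

    Sub Main()
        Dim buffers As New List(Of UrlBuffer)()
        Dim threads As New List(Of Thread)()

        For i As Integer = 0 To 2
            Dim buffer As New UrlBuffer()
            Dim workerId As Integer = i
            buffers.Add(buffer)
            Dim t As New Thread(Sub()
                                    ' Placeholder for crawling: add a few made-up URLs.
                                    For n As Integer = 1 To 5
                                        buffer.Add(String.Format("http://example.com/{0}/{1}", workerId, n))
                                    Next
                                End Sub)
            threads.Add(t)
            t.Start()
        Next

        For Each t As Thread In threads
            t.Join()
        Next

        ' Only this writer touches the file, so the file itself needs no lock.
        Dim outputPath As String = Path.Combine(Path.GetTempPath(), "links.txt")
        Using writer As New StreamWriter(outputPath, True)
            For Each buffer As UrlBuffer In buffers
                For Each url As String In buffer.Drain()
                    writer.WriteLine(url)
                Next
            Next
        End Using
    End Sub
End Module
```

Since the workers never contend for the same lock, the only contention left is the brief moment when the writer drains a buffer.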
Please see the edit in the post.
vbNewbie