Hi,

I wonder if someone can help me - I've been programming VB.Net for a long time but have rarely had to do much threading in ASP.Net.

I'm attempting to take "screenshots" of websites using an in-memory browser. These images are then logged in a DB and written to the local file system.

When I run it on my local server, it all works fine. When I run it in a shared hosting environment, everything works up to the point where I call Thread.Join. At that point either the target thread terminates immediately or it gets stuck (no further logging info is received from either thread). I've attached the log below.

The crucial code is also attached but in short it does:

For each URL, start a new thread and Thread.Join to it. The new thread loads the browser and begins navigation. It then spins (no-op) until the browser load has completed, before returning the bitmap image generated in the next step.

On browser load completion, an event fires. The handler captures the bitmap image from the browser and stores it in a class-level field (_Bitmap).

I've done some googling and can't find much related information. I have found the common shared hosting problems and have made sure I've got them covered (e.g. allowing partially trusted callers, signing assemblies, etc.).

I'd appreciate it if anyone with knowledge on this topic would be kind enough to point me in the right direction.

Many thanks

NB: I'm aware that at present it's going to be very slow, as it's processing images sequentially. But until I can get it to work on one thread, I have no chance of getting it working on multiple threads.

This is largely mangled together from code samples and I haven't even begun to tidy it up / organise it better so apologies for the slightly messy code.

Public Function GetWebsiteImage(ByVal URL As String, Optional ByVal BrowserWidth As Integer = 1280, Optional ByVal BrowserHeight As Integer = 1024) As Bitmap
    LogIt(String.Format("Webshot {1}: {0}", "Getting Image", id))
    _URL = URL
    _BrowserHeight = BrowserHeight
    _BrowserWidth = BrowserWidth

    Dim T As Thread
    T = New Thread(New ThreadStart(AddressOf GenerateImage))

    T.SetApartmentState(ApartmentState.STA)
    'T.IsBackground = True
    LogIt(String.Format("Webshot {1}: {0}", "Starting Thread", id))
    T.Start()

    '*** THIS IS THE LAST LOG ENTRY I SEE ***
    LogIt(String.Format("Webshot {1}: {0}", "Joining Thread", id))
    T.Join()

    Return _Bitmap
End Function

Friend Sub GenerateImage()
    LogIt(String.Format("Webshot {1}: {0}", "Instantiating Web Browser", id))
    Dim _WebBrowser As New WebBrowser()
    _WebBrowser.ScrollBarsEnabled = False
    ' Attach the handler before navigating so the completion event cannot be missed
    AddHandler _WebBrowser.DocumentCompleted, AddressOf WebBrowser_DocumentCompleted
    LogIt(String.Format("Webshot {1}: {0}", "Navigating", id))
    _WebBrowser.Navigate(_URL)
    While _WebBrowser.ReadyState <> WebBrowserReadyState.Complete
        Application.DoEvents()
    End While
    LogIt(String.Format("Webshot {1}: {0}", "Disposing", id))
    _WebBrowser.Dispose()
End Sub

Private Sub WebBrowser_DocumentCompleted(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
    LogIt(String.Format("Webshot {1}: {0}", "Document load complete", id))
    Dim _WebBrowser As WebBrowser = DirectCast(sender, WebBrowser)
    _WebBrowser.ClientSize = New Size(Me._BrowserWidth, Me._BrowserHeight)
    _WebBrowser.ScrollBarsEnabled = False
    _Bitmap = New Bitmap(_WebBrowser.Bounds.Width, _WebBrowser.Bounds.Height)
    _WebBrowser.BringToFront()
    _WebBrowser.DrawToBitmap(_Bitmap, _WebBrowser.Bounds)
    _PageTitle = _WebBrowser.DocumentTitle
    LogIt(String.Format("Webshot {1}: {0}", "About to capture bitmap", id))
    _Bitmap = DirectCast(_Bitmap.GetThumbnailImage(_BrowserWidth, _BrowserHeight, Nothing, IntPtr.Zero), Bitmap)
    LogIt(String.Format("Webshot {1}: {0}", "Bitmap captured", id))
End Sub

and the log entries I see:

2010 01 19 02:21:01 > Starting Process
2010 01 19 02:21:01 > Capture 229 Processing: http://www.obfuscated.com/
2010 01 19 02:21:01 > Capture 229 Found capture db record
2010 01 19 02:21:01 > Webshot f7710f41-cac0-4ed1-93df-020620257c91: Instantiated
2010 01 19 02:21:01 > Capture 229 Requesting image
2010 01 19 02:21:01 > Webshot f7710f41-cac0-4ed1-93df-020620257c91: Getting Image
2010 01 19 02:21:01 > Webshot f7710f41-cac0-4ed1-93df-020620257c91: Starting Thread
2010 01 19 02:21:01 > Webshot f7710f41-cac0-4ed1-93df-020620257c91: Joining Thread
A:

When you say you are running it on your local server, do you mean the ASP.NET personal web server or a local installation of IIS? The former is not even comparable to IIS because it runs as an interactive Windows application. With the latter you'll be running as a service, which can have no UI, and the behavior of threads is governed strictly by IIS.

You could try setting AspCompat="true" on the @ Page directive, but more likely than not the hosting company has configured IIS worker process pinging, which will terminate threads that are unresponsive for a defined period of time.
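
For reference, the directive would look something like the line below. Treat it as a sketch: the Language value and the absence of other attributes are assumptions, and only AspCompat="true" is the relevant part. It makes ASP.NET execute the page on an STA thread, which is what COM-based controls such as the one behind WebBrowser expect. It does not lift any restrictions imposed by the host.

```aspx
<%@ Page Language="VB" AspCompat="true" %>
```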

The bottom line is that the WebBrowser control (and the SHDocVw ActiveX control that it wraps) is not designed to work in a non-interactive service process, and you're likely in for an uphill climb trying to make it work. Unfortunately, I don't know of any safer alternatives.

Josh Einstein
You're right that I was referring to the in-IDE website host. I'm sure it's possible as I've seen it done, but unfortunately I can't remember the exact method (I do remember it was a pig). Thanks for the suggestion re aspcompat - I'll give it a try. I'm away from my PC until next week, so apologies if I take a while to respond again.
Basiclife