views: 166
answers: 2

I'm working with a rather large .NET web application.

Users want to be able to export reports to PDF. Since the reports are based on an aggregation of many layers of data, the best way to get an accurate snapshot is to actually take a snapshot of the UI. I can take the HTML of the UI and convert that to a PDF file.

Since the UI may take up to 30 seconds to load but the results never change, I want to cache a PDF in a background thread as soon as an item gets saved.
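
The background-caching part itself seems simple enough; roughly something like this is what I have in mind, where GeneratePdfFor and StoreCachedPdf are just placeholders for the real generation and storage code:

// Rough sketch only. GeneratePdfFor and StoreCachedPdf are placeholder names;
// the point is just that the slow work happens off the request thread.
public class ReportPdfCache
{
    public void OnItemSaved(int itemId)
    {
        System.Threading.ThreadPool.QueueUserWorkItem(delegate
        {
            byte[] pdf = GeneratePdfFor(itemId);   // may take ~30 seconds
            StoreCachedPdf(itemId, pdf);           // later export requests read this copy
        });
    }

    // Placeholders for whatever generation and persistence code ends up being used.
    private byte[] GeneratePdfFor(int itemId) { return new byte[0]; }
    private void StoreCachedPdf(int itemId, byte[] pdf) { }
}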

My main concern with this method is that if I go through the UI, I have to worry about timeouts. While background threads and the like can run for as long as they want, ASPX pages only run for so long before they are terminated.

I have two ideas how to take care of this. The first idea is to create an aspx page that loads the UI, overrides render, and stores the rendered data to the database. A background thread would make a WebRequest to that page internally and then grab the results from the database. This obviously has to take security into consideration and also needs to worry about timeouts if the UI takes too long to generate.
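
Roughly what I have in mind for that page, with ReportSnapshot and SaveRenderedHtml as placeholder names:

public partial class ReportSnapshot : System.Web.UI.Page
{
    protected override void Render(System.Web.UI.HtmlTextWriter writer)
    {
        // Render the whole page into a buffer first.
        var sb = new System.Text.StringBuilder();
        using (var buffer = new System.Web.UI.HtmlTextWriter(new System.IO.StringWriter(sb)))
        {
            base.Render(buffer);
        }

        string html = sb.ToString();
        SaveRenderedHtml(html);   // placeholder: persist the HTML to the database
        writer.Write(html);       // still send the HTML back to whoever requested the page
    }

    private void SaveRenderedHtml(string html)
    {
        // placeholder for the data-access code
    }
}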

The other idea is to create a page object and populate it manually in code, call the relevant methods by hand, and then grab the data from that. The problems with that method, aside from my having no idea how to do it, are that I'm afraid I may forget to call a method, or that something may not work correctly because it's not actually associated with a real session or web server.
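
From what I can gather it might look something like the following (completely untested, and report.aspx / id=42 are just stand-ins), but I don't know how well it behaves without a real request behind it:

// Untested sketch of running a page outside a normal request.
// Whether session, authentication, etc. behave correctly is exactly my worry.
var output = new System.IO.StringWriter();
var request = new System.Web.HttpRequest("report.aspx", "http://localhost/report.aspx", "id=42");
var response = new System.Web.HttpResponse(output);
var context = new System.Web.HttpContext(request, response);

var handler = System.Web.UI.PageParser.GetCompiledPageInstance(
    "~/report.aspx",
    System.Web.Hosting.HostingEnvironment.MapPath("~/report.aspx"),
    context);

handler.ProcessRequest(context);
string html = output.ToString();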

What is the best way to simulate the UI of a page in a background thread?

+2  A: 

If the "the best way to get an accurate snapshot is to actually take a snapshot of the UI" is actually true, then you need to refactor your code.

Build a data provider that provides your aggregated data to both the UI and the PDF generator. Layer your system.

Then, when it's time to build the PDFs, you have only a single location to call, and no hacky UI interception/multiple-thread issues to deal with.
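
Something along these lines, with all names purely illustrative:

// Illustrative only: one aggregation layer that both renderers consume.
public interface IReportDataProvider
{
    ReportData GetReport(int itemId);
}

public class ReportData
{
    public string Title { get; set; }
    public System.Collections.Generic.IList<ReportRow> Rows { get; set; }
}

public class ReportRow
{
    public string Label { get; set; }
    public decimal Value { get; set; }
}

// The ASPX page data-binds to ReportData; the PDF generator reads the same object,
// so the two outputs can't drift apart.
public class PdfReportGenerator
{
    private readonly IReportDataProvider _provider;

    public PdfReportGenerator(IReportDataProvider provider)
    {
        _provider = provider;
    }

    public byte[] Generate(int itemId)
    {
        ReportData data = _provider.GetReport(itemId);
        return RenderToPdf(data);
    }

    private byte[] RenderToPdf(ReportData data)
    {
        // placeholder for rendering 'data' with whatever PDF library you prefer
        return new byte[0];
    }
}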

Randolpho
+1, although it may not be a feasible solution at this time, it's at least something to work towards.
Jon Seigel
We went down this road the first time. It turned into a maintenance nightmare. Ultimately, somebody would add special logic in the UI - even something as simple as "if the item is from France, show the table a different way, color this cell green, change the spelling here" - and it wouldn't be reflected in the export, because the ASCX and PDF outputs needed to match and things would ultimately get out of sync. This especially goes for people who just want to edit an ASCX page and don't even think about exporting, because it's not their job.
diadem
You're going to go through Herculean efforts screen-scraping your UI into a PDF because your developers lack the necessary discipline? I feel for ya; that's not an ideal situation.
Randolpho
Hmm... Perhaps you should consider adopting the Model-View-ViewModel pattern. Basically, add another layer (the ViewModel) for your "custom UI logic" and put those decisions there. Then have both your ASPX and PDF renderers bind to that model and logic.
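
Rough sketch only, with made-up names:

// The "if it's from France..." decisions live here; both the ASCX and the
// PDF renderer bind to these properties instead of deciding on their own.
public class ReportCellViewModel
{
    public ReportCellViewModel(string value, string countryCode, bool highlight)
    {
        Value = countryCode == "FR" ? FormatForFrance(value) : value;
        CssClass = highlight ? "cell-green" : "cell-default";
    }

    public string Value { get; private set; }
    public string CssClass { get; private set; }

    private static string FormatForFrance(string value)
    {
        // placeholder for the special-case presentation rule
        return value;
    }
}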
Randolpho
I still don't follow how I can create a view model that will replace the need for custom code in the ASCX pages - I mean, let's say I change a stylesheet? How can I have that cascade to every PDF? What about ASCX repeaters and the like? What about JavaScript that manipulates the DOM?
diadem
+5  A: 

I know of 3 possible solutions:

IHttpHandler

This question has the full answer. The general gist is that you capture the Response.Filter output by implementing your own readable stream and a custom IHttpHandler.

This doesn't let you capture a page's output remotely, however; it only lets you capture the HTML before it is sent to the client, and the page still has to be called. So if you use a separate page for PDF generation, something will have to call that page.
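
A rough sketch of such a filter stream (the class name is made up; wire it up early in the request with Response.Filter = new CaptureFilter(Response.Filter)):

// Pass-through stream: copies everything ASP.NET writes to the response into a
// buffer so the captured HTML can be read afterwards. Assumes a UTF-8 response.
public class CaptureFilter : System.IO.Stream
{
    private readonly System.IO.Stream _inner;
    private readonly System.IO.MemoryStream _captured = new System.IO.MemoryStream();

    public CaptureFilter(System.IO.Stream inner)
    {
        _inner = inner;
    }

    public string CapturedHtml
    {
        get { return System.Text.Encoding.UTF8.GetString(_captured.ToArray()); }
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _captured.Write(buffer, offset, count);   // keep a copy
        _inner.Write(buffer, offset, count);      // still send it to the client
    }

    public override void Flush() { _inner.Flush(); }

    // Remaining members are required by Stream but not used here.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { return _captured.Length; } }
    public override long Position { get { return _captured.Position; } set { } }
    public override int Read(byte[] buffer, int offset, int count) { throw new System.NotSupportedException(); }
    public override long Seek(long offset, System.IO.SeekOrigin origin) { throw new System.NotSupportedException(); }
    public override void SetLength(long value) { throw new System.NotSupportedException(); }
}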

WebClient

The only alternative I can see for doing that with ASP.NET is to use a blocking WebClient to request the page that is generating the HTML. Take that output and then turn it into a PDF. Before you do all this, you can obviously check your cache to see if it's in there already.

// DownloadString (System.Net) blocks until the page has finished rendering and returned its HTML.
WebClient client = new WebClient();
string result = client.DownloadString("http://localhost/yoursite");

WatiN (or other browser automation packages)

One other possible solution is WatiN, which gives you a lot of flexibility in capturing a browser's HTML. The drawback is that it needs to interact with the desktop. Here's their example:

using (IE ie = new IE("http://www.google.com"))
{
    ie.TextField(Find.ByName("q")).TypeText("WatiN");
    ie.Button(Find.ByName("btnG")).Click();

    Assert.IsTrue(ie.ContainsText("WatiN"));
}
Chris S
Thanks. WebRequest is the solution I went with originally, and it's working flawlessly on a local box. My concern with these methods is timeouts - if the server is being hammered beyond reason and the ASPX page being called locally times out, there could be an issue. Obviously I could have some sort of timer to retry, but if the user wants it *now*, that's not an option.
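One thing I may try is deriving from WebClient just to stretch the timeout on the underlying request, roughly:

// WebClient doesn't expose a timeout directly, so set one on the underlying
// WebRequest it creates. The property name and value here are arbitrary.
public class TimeoutWebClient : System.Net.WebClient
{
    public int TimeoutMilliseconds { get; set; }

    protected override System.Net.WebRequest GetWebRequest(System.Uri address)
    {
        System.Net.WebRequest request = base.GetWebRequest(address);
        if (request != null && TimeoutMilliseconds > 0)
        {
            request.Timeout = TimeoutMilliseconds;
        }
        return request;
    }
}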
diadem