I'm building a website to display products and categories of products. The data comes from an external service in the form of a 500 KB XML file. The website is ASP.NET, C#.

The XML is structured as a list of categories. Inside each category might be some products and/or some more categories.

Obviously, we cannot call this external service to get the large XML file on every page request, so we call it once every few hours and cache the result. I need to do things like this:

  • Display a product category menu on the left hand side of the page
  • Display all products in a selected category
  • Display extended information on a single product

My questions are as follows:

Firstly, on a page that displays all products in 'DVDs' category, let's say I do the following (on page load):

XDocument allCategories = (XDocument)Cache["CategoriesXml"]; // Cache returns object, so a cast is needed
// loop through the XML and find the DVD category
// Get all products under it, then display them

By bringing the categories XML into a local variable (remember, it's 500 KB), am I putting a drain on the server? I'm going to have to do this each time a page is loaded, and there might be thousands of people looking at different pages at the same time. If a thousand people load the same page in the space of a few seconds, will I have a thousand instances of this XML file hanging around in memory? Or will the garbage collector manage all of that for me?

Is it better to loop directly over the cached item, or is that less performant (and/or bad practice)?

Secondly, I said I cache the whole XML file. I get products or categories from that XML by looping through it (I use LINQ to XML). Would it be better to create a Category type and a Product type, put them into arrays, and cache that? Then loop through Category objects and arrays rather than an XDocument? Which would be more performant?

Thirdly, what would you say is the best practice for architecting this system? Let's say I have a Data Access Layer, a Business Objects Layer, and the Web Application. Where should I put the reference to the external service that retrieves the XML? In which layer should I cache it? Does this application even have a Data Access Layer, in the sense that the DAL part is done by some other system? Currently my DAL is only there for accessing our database, and it feels wrong to put a web service reference in there - but maybe it's not? Is it bad practice to use caching in the Business layer (i.e. does it interfere with unit tests, etc.)? I have considered a middle layer between the web and the business layer, solely for caching - is this a good or bad idea?

I have already finished the bulk of this website, actually - I'm simply looking back at it now and wondering if I have done it the best way, so want to compare your suggestions with what I have actually done, and hopefully I will be able to go back and improve it.

Thanks!

+2  A: 

I've done something similar to this and I would put your caching logic in your data layer. This will remove all caching logic from your business and presentation layers.

I'm not familiar with ASP.NET, but my guess is yes, every page request will result in the XML file being loaded separately.

And I would not store the XML in memory. You should store the data you need in memory. There are probably ASP.NET libraries for using in-memory caches like Memcached, where you can store serialized versions of objects.
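As a rough illustration of "store the data you need" rather than the raw XML, here is a minimal sketch that converts the feed into plain objects once, at cache-refresh time. The question does not show the feed's schema, so the element and attribute names (`category`, `product`, `name`, `id`) and the `Category`, `Product` and `CatalogParser` types are all assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical types: the real feed's element and attribute names are not
// given in the question, so "category", "product", "name", "id" are assumed.
public sealed class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public sealed class Category
{
    public string Name { get; set; }
    public List<Product> Products { get; set; }
    public List<Category> Children { get; set; }
}

public static class CatalogParser
{
    // Converts the whole feed into objects; intended to run once per refresh.
    public static List<Category> Parse(XDocument doc)
    {
        return doc.Root.Elements("category").Select(e => ToCategory(e)).ToList();
    }

    // Recursively maps one <category> element and its nested categories.
    private static Category ToCategory(XElement e)
    {
        return new Category
        {
            Name = (string)e.Attribute("name"),
            Products = e.Elements("product")
                .Select(p => new Product
                {
                    Id = (string)p.Attribute("id"),
                    Name = (string)p.Attribute("name")
                })
                .ToList(),
            Children = e.Elements("category").Select(c => ToCategory(c)).ToList()
        };
    }

    // Depth-first search for a category by name; returns null if not found.
    public static Category Find(IEnumerable<Category> categories, string name)
    {
        foreach (var c in categories)
        {
            if (c.Name == name) return c;
            var hit = Find(c.Children, name);
            if (hit != null) return hit;
        }
        return null;
    }
}
```

The resulting list could then be placed in whatever cache you use (for example `HttpRuntime.Cache.Insert` with an absolute expiration of a few hours), and a page would call something like `CatalogParser.Find(categories, "DVDs")` instead of walking the XDocument. The cached objects should be treated as read-only, since every request would share the same instances.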

Ryan Doherty
+1. You can even add another level of abstraction and make the caching a passthrough to the cache, which I have seen done. The benefit is that the caching passthrough can be written to be used with multiple DALs (for multiple applications).
joseph.ferris
@Joseph: what do you mean by "make the caching a passthrough to the cache" - can you explain this with an example?
Alex York
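
One possible reading of the "caching passthrough" idea in the comments above is a wrapper that exposes the same interface as the real data source and puts a cache in front of it. The commenter did not confirm this, so treat it as a sketch under that assumption. `ICatalogSource`, `CachingCatalogSource` and `CountingSource` are hypothetical names invented for the example, and the simple in-process field stands in for whatever cache is actually used.

```csharp
using System;

// Hypothetical abstraction over "whatever fetches the catalog XML".
public interface ICatalogSource
{
    string GetCatalogXml();
}

// Passthrough: implements the same interface as the real source and only
// forwards to it when nothing is cached or the cached copy has expired.
public sealed class CachingCatalogSource : ICatalogSource
{
    private readonly ICatalogSource _inner;
    private readonly TimeSpan _timeToLive;
    private readonly Func<DateTime> _clock; // injected so tests can control time
    private readonly object _gate = new object();
    private string _cached;
    private DateTime _expiresAt = DateTime.MinValue;

    public CachingCatalogSource(ICatalogSource inner, TimeSpan timeToLive, Func<DateTime> clock)
    {
        _inner = inner;
        _timeToLive = timeToLive;
        _clock = clock;
    }

    public string GetCatalogXml()
    {
        lock (_gate) // only one caller refreshes at a time
        {
            DateTime now = _clock();
            if (_cached == null || now >= _expiresAt)
            {
                _cached = _inner.GetCatalogXml();
                _expiresAt = now + _timeToLive;
            }
            return _cached;
        }
    }
}

// Stand-in for the real web service call; counts how often it is hit.
public sealed class CountingSource : ICatalogSource
{
    public int Calls { get; private set; }

    public string GetCatalogXml()
    {
        Calls++;
        return "<catalog/>";
    }
}
```

Because the wrapper only depends on the interface, the business layer can be unit tested against a stub such as `CountingSource`, and the same wrapper could sit in front of more than one data access layer.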