I'm building a website to display products and categories of products. The data comes from an external service in the form of a 500 KB XML file. The website is built with ASP.NET and C#.
The XML is structured as a list of categories. Inside each category might be some products and/or some more categories.
Obviously, we cannot call this external service to fetch the large XML file on every page request, so we call it once every few hours and cache the result. I need to do things like this:
- Display a product category menu on the left hand side of the page
- Display all products in a selected category
- Display extended information on a single product
My questions are as follows:
Firstly, on a page that displays all products in the 'DVDs' category, let's say I do the following (on page load):
// Cache returns object, so a cast is needed; the result is null if the item is missing
XDocument allCategories = (XDocument)Cache["CategoriesXml"];
// Loop through the XML and find the DVD category
// Get all products under it, then display them
By bringing the categories XML into a local variable (remember, it's 500 KB), am I putting a drain on the server? Remember, I'm going to have to do this each time a page is loaded. There might be thousands of people looking at different pages at the same time. If a thousand people load the same page in the space of a few seconds, will I have a thousand instances of this XML file hanging around in memory? Or will the garbage collector manage all of that for me?
Is it better to loop directly over the cached item, or is that less performant (and/or bad practice)?
Secondly, I said I cache the whole XML file. I get products or categories from that XML by looping through it (I use LINQ to XML). Would it be better to create a Category type and a Product type, put them into arrays, and cache those? Then I would loop through Category objects and arrays rather than an XDocument. Which would be more performant?
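To make the typed alternative concrete, here is a minimal sketch of what I have in mind. The element and attribute names (`category`, `product`, `name`, `id`) and the `CatalogParser` class are assumptions for illustration only; the real XML uses its own schema.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical typed model; property names are illustrative, not the real schema.
public sealed class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public sealed class Category
{
    public string Name { get; set; }
    public List<Product> Products { get; set; }
    public List<Category> SubCategories { get; set; }
}

public static class CatalogParser
{
    // Converts the whole document once, so page requests never touch the XML.
    public static List<Category> Parse(XDocument doc)
    {
        return doc.Root.Elements("category").Select(c => ParseCategory(c)).ToList();
    }

    private static Category ParseCategory(XElement e)
    {
        return new Category
        {
            Name = (string)e.Attribute("name"),
            Products = e.Elements("product")
                .Select(p => new Product
                {
                    Id = (string)p.Attribute("id"),
                    Name = (string)p.Attribute("name")
                })
                .ToList(),
            // Categories can nest, so recurse into child <category> elements.
            SubCategories = e.Elements("category").Select(c => ParseCategory(c)).ToList()
        };
    }
}
```

The idea would be to call `CatalogParser.Parse` once, right after fetching the XML, and cache the resulting `List<Category>` instead of the `XDocument`.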
Thirdly, what would you say is the best practice for architecting this system? Let's say I have a Data Access Layer, a Business Objects Layer, and the Web Application. Where should I put the reference to the external service that retrieves the XML? In which layer should I cache it? Does this application even have a Data Access Layer, given that the data access is really done by some other system? Currently my DAL is only there for accessing our database, and it feels wrong to put a web service reference in there - but maybe it's not? Is it bad practice to use caching in the Business layer (i.e. does it interfere with unit tests, etc.)? I have considered a middle layer between the web and the business layer, solely for caching - is this a good or bad idea?
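For reference, this is roughly the shape of the "caching-only middle layer" I am considering. It is a sketch under assumptions, not my actual code: `ICatalogSource`, `CachingCatalogSource` and `DelegateCatalogSource` are hypothetical names, and I have swapped the ASP.NET `Cache` for a plain field with a timestamp so the example stands alone.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical abstraction over "wherever the catalog XML comes from".
public interface ICatalogSource
{
    XDocument GetCatalog();
}

// Adapter that wraps any fetch function (the real service call, or a stub in a unit test).
public sealed class DelegateCatalogSource : ICatalogSource
{
    private readonly Func<XDocument> _fetch;
    public DelegateCatalogSource(Func<XDocument> fetch) { _fetch = fetch; }
    public XDocument GetCatalog() { return _fetch(); }
}

// Caching layer: wraps another source and reuses its result until it expires.
public sealed class CachingCatalogSource : ICatalogSource
{
    private readonly ICatalogSource _inner;
    private readonly TimeSpan _lifetime;
    private readonly object _gate = new object();
    private XDocument _cached;
    private DateTime _expiresUtc;

    public CachingCatalogSource(ICatalogSource inner, TimeSpan lifetime)
    {
        _inner = inner;
        _lifetime = lifetime;
    }

    public XDocument GetCatalog()
    {
        // The lock is held during a refresh, so other callers wait while the inner source is queried.
        lock (_gate)
        {
            if (_cached == null || DateTime.UtcNow >= _expiresUtc)
            {
                _cached = _inner.GetCatalog();
                _expiresUtc = DateTime.UtcNow + _lifetime;
            }
            // Every caller receives a reference to the same XDocument instance, not a copy.
            return _cached;
        }
    }
}
```

With this shape the business layer would depend only on `ICatalogSource`, and the web application would decide whether to hand it the cached or the uncached implementation.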
I have actually already finished the bulk of this website. I'm simply looking back at it now and wondering whether I have done it the best way, so I want to compare your suggestions with what I have actually done, and hopefully go back and improve it.
Thanks!