I'm building a small application, and to reduce hosting costs and dependencies, I am planning to store all persistent data in XML files rather than a SQL Server database.

In this case, the site audience is limited to friends and family; no more than a few concurrent users are ever expected, so the site does not need to scale. Is it feasible to literally open and close an XML file from disk on every transaction? At most a page might display data from a couple of XML files, and occasionally a user will perform an action requiring an update of one.

For example, roughly following the repository pattern for getting and saving "Things," some methods would look like:

    public IEnumerable<Thing> GetThings() {
        XElement xml = XElement.Load(_xmlRepositoryPath);
        var q = from s in xml.Descendants("Thing")
                select new Thing {
                    //set properties...
                };

        return q;
    }

    public void SaveThing(Thing t) {
        XElement xml = XElement.Load(_xmlRepositoryPath);
        //update xml...
        xml.Save(_xmlRepositoryPath);
    }

Any pitfalls or problems with this approach? I'd rather avoid the additional complexity of adding a caching or in-memory data layer. Extra credit: at what level of user load or transaction volume do you think this would need to be implemented differently?

+2  A: 

The main thing that a database will provide, which the file system won't, is atomicity. As soon as you have more than one person accessing your XML file, you need to implement a reader-writer lock to make sure that no one is reading while you're trying to update the file. It's a non-trivial problem, but one that's already solved by most database systems. If you're concerned with cost, there are any number of open-source solutions.
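To make that concrete, here's a rough sketch (not production code) of a reader-writer lock wrapped around the repository methods from the question, using .NET's ReaderWriterLockSlim from System.Threading. It assumes a single ASP.NET process; it does not protect against other processes touching the file.

    // Sketch only: one process-wide lock guarding the file.
    private static readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public IEnumerable<Thing> GetThings() {
        _lock.EnterReadLock();
        try {
            XElement xml = XElement.Load(_xmlRepositoryPath);
            // Materialize with ToList() before releasing the lock,
            // so callers never enumerate the query lazily outside of it.
            return (from s in xml.Descendants("Thing")
                    select new Thing {
                        //set properties...
                    }).ToList();
        }
        finally {
            _lock.ExitReadLock();
        }
    }

    public void SaveThing(Thing t) {
        _lock.EnterWriteLock();
        try {
            XElement xml = XElement.Load(_xmlRepositoryPath);
            //update xml...
            xml.Save(_xmlRepositoryPath);
        }
        finally {
            _lock.ExitWriteLock();
        }
    }

Note that ReaderWriterLockSlim allows many concurrent readers but only one writer at a time, which matches your read-heavy usage pattern.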

Whatever solution you decide on, make sure you encapsulate all the data access so that changing it later isn't hard.

David Kemp
A: 

In terms of calling Load: you can do it on every hit and the server won't even blink. We have sites that effectively do exactly that (load XML, render it to HTML using XSLT with parameters based on the URL, and deliver it to the browser; the XSLT load is explicit, the XML load implied by the call to render with the transform), and we just don't see issues with them. You'd need concurrent users in the hundreds before reading the data this way starts to be a problem.
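For reference, that load-and-transform step looks roughly like the following, assuming a standard ASP.NET page and the XslCompiledTransform API (the file paths here are placeholders, not from the real sites):

    // Load the XSLT explicitly; the XML load is implied by Transform.
    // Paths are hypothetical examples under App_Data.
    var xslt = new XslCompiledTransform();
    xslt.Load(Server.MapPath("~/App_Data/things.xslt"));

    using (var writer = XmlWriter.Create(Response.Output)) {
        xslt.Transform(Server.MapPath("~/App_Data/things.xml"), writer);
    }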

In terms of doing the file write (save): I don't know for certain, but I wouldn't expect it to be a huge issue either. Concurrency (a problem regardless) would concern me far more than server load. At your usage levels, creative use of a lock might be sufficient; for anything serious, this is where using XML as a database becomes challenging.
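As a minimal sketch of that kind of locking on the save path (assuming a single process; the .tmp and .bak names are just illustrative): serialize writers with a plain CLR lock, and save to a temporary file first so a crash mid-write can't corrupt the real one.

    private static readonly object _writeLock = new object();

    public void SaveThing(Thing t) {
        lock (_writeLock) {
            XElement xml = XElement.Load(_xmlRepositoryPath);
            //update xml...
            string temp = _xmlRepositoryPath + ".tmp";
            xml.Save(temp);
            // Swap the new file in; the previous version is kept as a backup.
            // File.Replace requires that the destination file already exists.
            File.Replace(temp, _xmlRepositoryPath, _xmlRepositoryPath + ".bak");
        }
    }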

As an aside, this is an area where ASP.NET clearly rocks: the performance of the server-side code is, in the general case, excellent (too good, probably).

Murph