ansaurus

Question

When querying with LINQ-to-XML, is it better/more efficient to leave element values as strings or convert them to the correct type?

Answer 1

+1 A:

It's difficult to assess the performance issues here without measuring, but I think there are two scenarios.

If you need to use most (or all) of the values in an expression sooner or later, then it is probably best to pay the CPU cost of converting to native types up front and discard the XML string data early.
If you are only going to touch (evaluate or use) a few of the values, then it will most likely be cheaper in CPU time to convert string data to native types lazily, at or near the point of consumption.
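
A minimal sketch of the two scenarios, written as C# 9 top-level statements. The `<items>` document and its `id` and `price` element names are hypothetical and chosen only for illustration. It makes no claim about which approach is faster for real data, which would still need measuring.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample data; the element names are assumptions.
var doc = XDocument.Parse(
    "<items>" +
    "<item><id>1</id><price>9.50</price></item>" +
    "<item><id>2</id><price>12.00</price></item>" +
    "<item><id>10</id><price>3.25</price></item>" +
    "</items>");

// Eager: convert every value once, up front. After this the XML can be
// discarded and consumers only ever see native types.
var eager = doc.Root.Elements("item")
    .Select(e => (Id: (int)e.Element("id"), Price: (decimal)e.Element("price")))
    .ToList();
decimal total = eager.Sum(x => x.Price);

// Lazy: keep the elements and convert only the values actually touched.
// Ids are parsed while searching; price is parsed only for the match.
XElement match = doc.Root.Elements("item")
    .First(e => (int)e.Element("id") == 10);
decimal lazyPrice = (decimal)match.Element("price");

Console.WriteLine(total);
Console.WriteLine(lazyPrice);
```

The eager list pays for every conversion whether or not the value is used, while the lazy query re-parses the ids each time it is run.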

Now, these are just the CPU-time considerations. Memory matters too: the data will likely take up considerably less memory once converted to native value types, since you can then discard the string (XML) data early.

In short, it is rare for questions like this to have black or white answers: it will depend on your scenario, the complexity of the data, how much data there is, and when it will be used (touched or evaluated).

Update

In his comment on my original answer, Dan asked for a general rule of thumb for cases where there is no time or reason to do detailed measurements.

My suggestion is to prefer converting to native types at XML parsing time, rather than keeping the string data around and parsing it lazily. Here is my reasoning:

The code will already be burning some CPU, I/O, and memory resources at parsing time.
The code is likely to be simpler if the conversions are done at load time (rather than at another time), as this can all be coded in a simple procedural way.
This is likely to be more memory efficient as well.
When the data needs to be used, it is already in a native format. Comparisons and computation with native types will usually be much more efficient than working with string data at consumption time, and the consuming code is likely to be simpler as well.

Again, I'm suggesting this as a rule of thumb :) There will be scenarios where another approach is more optimal from a performance standpoint, or will make the code 'better' in some way (more cohesive, modular, easier to maintain, etc).

This is one of those cases where you will most likely need to measure the results to be sure you are doing the right thing.

Foredecker 2009-07-25 21:52:01

That's a great and thoughtful answer, but I guess I was looking for a nice rule of thumb since I don't have time to do a lot of performance testing on my queries. Thanks.

DanM 2009-07-25 23:51:25

Answer 2

+1 A:

I agree with your second edit. If performance is an issue, you will gain much more by using a more queryable data structure (or just cache a dictionary by ID from your XML for repeated lookups) than by changing how you compare/parse values.
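
A rough sketch of the dictionary idea, written as C# 9 top-level statements. The `<items>` document and its `id` and `name` elements are hypothetical, and the sketch assumes ids are unique, since `ToDictionary` throws on duplicate keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample data; the element names are assumptions.
var doc = XDocument.Parse(
    "<items>" +
    "<item><id>1</id><name>alpha</name></item>" +
    "<item><id>2</id><name>beta</name></item>" +
    "</items>");

// Build the index once: a single pass over the XML, with each id
// parsed to int exactly once. Throws if two items share an id.
Dictionary<int, XElement> byId = doc.Root.Elements("item")
    .ToDictionary(e => (int)e.Element("id"));

// Repeated lookups are now hash lookups rather than scans over the elements.
string name = byId.TryGetValue(2, out XElement found)
    ? (string)found.Element("name")
    : null;

Console.WriteLine(name);
```

If the elements themselves are not needed later, the dictionary values could just as well be already-converted objects, so that the XML can be dropped after loading.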

That said, my preference would be to use the explicit conversion operators defined on XElement. Also, if your ID element could ever be missing (better safe than sorry), you can cast to int? instead, which yields null rather than throwing.
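
For illustration, a small sketch of those conversions, written as C# 9 top-level statements. The `<item>` element and its child names (`id`, `active`, `parentId`) are hypothetical.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical sample element; note that there is no <parentId> child.
var item = XElement.Parse("<item><id>42</id><active>true</active></item>");

// Explicit conversions parse the element's text value.
int id = (int)item.Element("id");
bool active = (bool)item.Element("active");

// Element("parentId") returns null here; the nullable conversion
// turns that into a null int? instead of throwing.
int? parentId = (int?)item.Element("parentId");

// A fallback value can be supplied with the null-coalescing operator.
int parentOrDefault = (int?)item.Element("parentId") ?? -1;

Console.WriteLine($"{id} {active} {parentId.HasValue} {parentOrDefault}");
```

The nullable conversion only covers an absent element. An element that is present but has empty or non-numeric text still throws a FormatException, so that case needs separate handling.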

dahlbyk 2009-07-26 03:30:00
