In general
In general terms I tend to agree with the other answers here, but I'd like to add a few remarks. Performance is usually limited by the slowest component involved: the network, the database connection, the file system, or even main memory when I/O is part of the picture. If we take that as a given, one possible conclusion is that the smaller the data, the bigger the performance improvement.
Other factors
But there's another factor: attributes and elements are implemented differently. Attributes are implemented as something like key/value pairs with a uniqueness constraint and take roughly chars * 2 + sizeof(int). Elements require a much larger in-memory structure; for the sake of brevity, I use one simple factor that is a rough average across several implementations: 3.5 * chars. I use chars here because storing the file as UTF-8 or as UTF-16 makes a difference in storage, but not in memory.
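To make the two rules of thumb concrete, here is a minimal sketch in Python. It assumes both factors are in bytes and that sizeof(int) is 4; the function names are hypothetical, and the numbers are only the rough averages quoted above, not measurements of any particular parser.

```python
# Rule-of-thumb size estimates taken from the factors quoted in the text.
# Assumptions (not from any specific parser): results are in bytes and
# sizeof(int) is 4. Function names are made up for illustration.

INT_SIZE = 4  # assumed sizeof(int)


def estimate_attribute_bytes(chars: int) -> int:
    """Attribute estimate: chars * 2 + sizeof(int)."""
    return chars * 2 + INT_SIZE


def estimate_element_bytes(chars: int) -> float:
    """Element estimate: 3.5 * chars."""
    return 3.5 * chars


if __name__ == "__main__":
    for chars in (10, 100, 1000):
        print(chars, estimate_attribute_bytes(chars), estimate_element_bytes(chars))
```

With these factors the attribute estimate is the smaller of the two for anything longer than a couple of characters, which is all the comparison is meant to show.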
The previous paragraph implies that attributes are faster. It isn't that simple, though: attributes are not implemented as normal nodes, and searching their data is generally slower than searching data in element nodes. This is hard to state in general terms; you need to profile each particular situation to find out.
LINQ
Then there's LINQ (LINQ to XML). If you use LINQ, reading and writing are done with streaming XML, which is relatively fast. The in-memory representation is usually much smaller and much faster than with XmlDocument parsing.
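The difference between building a full tree and processing a stream can be sketched as follows. This is a Python stand-in using the standard library's xml.etree.ElementTree, not LINQ to XML or XmlDocument; the sample document and function names are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document: 1000 <item> elements.
SAMPLE = (
    "<items>"
    + "".join(f'<item id="{i}">value {i}</item>' for i in range(1000))
    + "</items>"
)


def count_items_dom(xml_text: str) -> int:
    """Parse the whole document into a tree first, then query it."""
    root = ET.fromstring(xml_text)
    return len(root.findall("item"))


def count_items_streaming(xml_text: str) -> int:
    """Handle each element as soon as it has been parsed."""
    count = 0
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            count += 1
            elem.clear()  # drop the element's text, attributes and children
    return count


if __name__ == "__main__":
    print(count_items_dom(SAMPLE), count_items_streaming(SAMPLE))
```

Both functions return the same count; the streaming version simply never needs the full content of every element in memory at once.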
Names
The size of the names of the fields, such as elements and attributes, does not matter in memory. Internally they are keyed and given a unique ID. The contents of the elements and the attributes, however, do add to the overall memory footprint.
If the names are very large compared to their content, minifying them makes your XML less readable, but it also requires less I/O or network bandwidth. As such, in some cases, using short names may improve performance.
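A quick way to see how much of a file is names rather than content is to serialize the same data twice. The sketch below is Python with made-up tag names; it only compares serialized size, which is what affects I/O and bandwidth, and says nothing about in-memory cost.

```python
import xml.etree.ElementTree as ET


def build_xml(root, record, field, values):
    """Serialize values as <root><record><field>v</field></record>...</root>."""
    body = "".join(f"<{record}><{field}>{v}</{field}></{record}>" for v in values)
    return f"<{root}>{body}</{root}>"


def field_values(xml_text, field):
    """Read back the text of every element named `field`."""
    return [e.text for e in ET.fromstring(xml_text).iter(field)]


# Hypothetical data and tag names, chosen only to exaggerate the effect.
VALUES = [str(i) for i in range(1000)]
verbose = build_xml(
    "customerOrderCollection", "customerOrderRecord", "orderSequenceNumber", VALUES
)
compact = build_xml("c", "r", "n", VALUES)

verbose_bytes = len(verbose.encode("utf-8"))
compact_bytes = len(compact.encode("utf-8"))
print(f"verbose: {verbose_bytes} bytes, compact: {compact_bytes} bytes")
```

Both documents carry identical values; whether the size difference matters depends on whether I/O is actually your bottleneck.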
UTF-8 or UTF-16
Finally, a note on how you store it. Common sense says to store it as UTF-8, but that requires the parser to read each character and transform it to UTF-16 in memory, which costs time. Sometimes a larger UTF-16 file can outperform a smaller UTF-8 file because the processor overhead of the conversion is too big. Again, measuring performance in several scenarios can help. Oh, and if you use a lot of (very) high characters, UTF-16 should be the preferred choice, because UTF-8 may use 3 or 4 bytes per character (the current standard caps it at 4).
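The size side of this trade-off is easy to check. The following Python sketch compares encoded sizes for two made-up text samples; it uses "utf-16-le" so that no byte order mark is counted, and it measures bytes only, not the decoding cost discussed above.

```python
def encoded_sizes(text: str) -> tuple:
    """Return (UTF-8 byte count, UTF-16 byte count) for the given text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Hypothetical samples: plain ASCII text versus text made of higher characters.
ASCII_TEXT = "hello world" * 100
CJK_TEXT = "日本語" * 100

ascii_utf8, ascii_utf16 = encoded_sizes(ASCII_TEXT)
cjk_utf8, cjk_utf16 = encoded_sizes(CJK_TEXT)

print(f"ASCII: utf-8={ascii_utf8} utf-16={ascii_utf16}")
print(f"CJK:   utf-8={cjk_utf8} utf-16={cjk_utf16}")
```

For ASCII-heavy content UTF-8 is half the size; for the CJK sample each character takes 3 bytes in UTF-8 but 2 in UTF-16, so the relation flips. Real XML mixes ASCII markup with content, so the actual ratio depends on your data.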
Summary
To sum it up, if speed is imperative and you cannot resort to a binary format:
- Prefer attributes over elements, but only if DOM use is anticipated and searching / keying is not too important;
- Prefer UTF-8 over UTF-16 only when the files are very large and you use few (very) high characters (measure to find out);
- Prefer streaming over DOM for all your uses (LINQ typically uses streaming);
- Don't bother using small names unless your I/O is really a bottleneck and the overhead:data ratio is very large;
- Define a few typical usage scenarios and measure.
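As a starting point for that last item, a measurement harness can be as small as the sketch below. It is Python with the standard library's ElementTree and timeit, used here as a stand-in for whatever parser you actually run; the document shapes and function names are assumptions, and the timings it prints say nothing about other parsers or platforms.

```python
import timeit
import xml.etree.ElementTree as ET

N = 2000  # number of rows in each hypothetical test document

# The same data, once stored in attributes and once in child elements.
ATTR_DOC = (
    "<rows>" + "".join(f'<row id="{i}" name="n{i}"/>' for i in range(N)) + "</rows>"
)
ELEM_DOC = (
    "<rows>"
    + "".join(f"<row><id>{i}</id><name>n{i}</name></row>" for i in range(N))
    + "</rows>"
)


def read_attributes(doc):
    """Parse the document and read the 'name' attribute of every row."""
    return [row.get("name") for row in ET.fromstring(doc)]


def read_elements(doc):
    """Parse the document and read the <name> child of every row."""
    return [row.findtext("name") for row in ET.fromstring(doc)]


def measure(repeats=5):
    """Return the best wall-clock time in seconds for each scenario."""
    return {
        "attributes": min(
            timeit.repeat(lambda: read_attributes(ATTR_DOC), number=1, repeat=repeats)
        ),
        "elements": min(
            timeit.repeat(lambda: read_elements(ELEM_DOC), number=1, repeat=repeats)
        ),
    }


if __name__ == "__main__":
    for scenario, seconds in measure().items():
        print(f"{scenario}: {seconds:.6f} s")
```

Swap in your own documents and access patterns; the point is to compare scenarios that resemble your real workload rather than to trust a general rule.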
PS: the above is what comes to mind when thinking about XML. There may, of course, be many other factors that improve or degrade performance, the largest perhaps being your own skill in writing the best procedures for your CRUD operations.