ansaurus

Question

What is the fastest way to remove nodes from a large XML file using .NET?

Answer 1

A:

You could use Perl or a shell script to strip out the unwanted items, provided you can write a quick regular expression that matches them. That would avoid loading the whole file into memory and writing it back out.

Matt 2010-01-11 18:04:39

In general, regular expressions cannot be used to match XML (or HTML), because XML and HTML are not regular languages.

John Saunders 2010-03-09 09:22:33

Answer 2

+1 A:

You might be able to save a step by implementing a subclass of XmlReader whose Read method skips over the item elements you're not interested in. Right now, you seem to have two steps: reading and filtering the document with an XmlReader, and then using an XmlWriter to write the result to something that you presumably read back afterwards. Subclassing XmlReader eliminates that second step: you use the subclassed XmlReader as the input to your XSLT transform or XmlDocument or whatever, and it never builds an intermediate representation of the filtered XML document.

Robert Rossney 2010-01-11 18:11:22
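
To make the subclassing idea concrete, here is a minimal sketch rather than a definitive implementation. It derives from XmlTextReader instead of the abstract XmlReader to keep the code short, and the class name SkippingXmlReader and the shouldSkip predicate are hypothetical names introduced for illustration. It only covers the case where the decision can be made at the element's start tag (its name or attributes).

```csharp
using System;
using System.Xml;

// Hypothetical reader that hides unwanted elements from whatever consumes it.
public sealed class SkippingXmlReader : XmlTextReader
{
    private readonly Func<XmlReader, bool> _shouldSkip;

    // shouldSkip is called on start tags only; it may inspect the name and
    // attributes (e.g. via GetAttribute) but must not advance the reader.
    public SkippingXmlReader(string path, Func<XmlReader, bool> shouldSkip)
        : base(path)
    {
        _shouldSkip = shouldSkip;
    }

    public override bool Read()
    {
        if (!base.Read())
            return false;

        // Skip() consumes the whole element and leaves the reader on the
        // node that follows it, so the loop also handles adjacent matches.
        while (NodeType == XmlNodeType.Element && _shouldSkip(this))
            base.Skip();

        return !EOF;
    }
}
```

A consumer that pulls nodes through Read(), such as XmlDocument.Load(XmlReader), should then see only the remaining nodes. For example, new SkippingXmlReader("input.xml", r => r.LocalName == "item" && r.GetAttribute("obsolete") == "true") would hide every item element carrying that (assumed) attribute.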

This may work, but once I read forward, if my item is good, I'll need to move my "cursor" back to the start of the item. How do I do that?

Pasha 2010-01-11 18:35:20

Well, there are (at least) two ways. You can have your XmlReader check its Stream's CanSeek property at creation and throw an exception if it can't seek; then you know you can save the position in the Stream when you start parsing an element, and if the element's good you can parse it again. The better way is to build some kind of intermediate representation for each node - the XmlNodeType, Name, Value, etc. - and save it in a list. Then either throw the list away or update the XmlReader's properties from the next item in the list when Read is called.

Robert Rossney 2010-01-13 02:04:58
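
Since XmlReader is forward-only, buffering is the usual way around "moving the cursor back". The sketch below swaps the hand-built node list described above for an XElement per item, created with XNode.ReadFrom, which is a different but related technique. The names ItemFilter, KeptItems, WriteFiltered, and the rootName parameter are assumptions made for illustration, and the output root is rebuilt by name only (its original attributes are not copied).

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

public static class ItemFilter
{
    // Streams the elements named itemName one at a time and yields only
    // those that keep() accepts. One item is buffered at a time.
    public static IEnumerable<XElement> KeptItems(
        string path, string itemName, Func<XElement, bool> keep)
    {
        using (XmlReader reader = XmlReader.Create(path))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element
                    && reader.LocalName == itemName)
                {
                    // ReadFrom buffers the whole element and advances the
                    // reader past its end tag, so no extra Read() is needed.
                    var item = (XElement)XNode.ReadFrom(reader);
                    if (keep(item))
                        yield return item;
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    // Writes the kept items under a fresh root element (rootName is assumed).
    public static void WriteFiltered(
        string inputPath, string outputPath, string rootName,
        string itemName, Func<XElement, bool> keep)
    {
        using (XmlWriter writer = XmlWriter.Create(outputPath))
        {
            writer.WriteStartElement(rootName);
            foreach (XElement item in KeptItems(inputPath, itemName, keep))
                item.WriteTo(writer);
            writer.WriteEndElement();
        }
    }
}
```

Because the whole item is available as an XElement before the decision is made, the predicate can look at child elements as well as attributes. This assumes itemName is not also the name of the root element; otherwise the entire document would be buffered.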

Answer 3

A:

See if you can use XPath queries to determine what you do and don't want to read with the XmlDocument object. Look into the following methods of that class: SelectSingleNode(), which returns an XmlNode object, and SelectNodes(), which returns an XmlNodeList object. See if that helps.

kd 2010-01-11 18:22:18
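
As a rough illustration of the XPath approach, the sketch below removes every node matched by an expression. The class and method names are hypothetical. Note that XmlDocument loads the entire file into memory, so this is likely a poor fit for very large documents.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class XPathRemover
{
    // Removes all nodes matching xpath and returns how many were removed.
    public static int RemoveMatching(
        string inputPath, string outputPath, string xpath)
    {
        var doc = new XmlDocument();
        doc.Load(inputPath);

        // Snapshot the matches first so that removing nodes cannot
        // disturb the iteration over the query result.
        List<XmlNode> matches = doc.SelectNodes(xpath).Cast<XmlNode>().ToList();

        int removed = 0;
        foreach (XmlNode node in matches)
        {
            // Nodes without a parent (e.g. attributes) are left alone.
            if (node.ParentNode != null)
            {
                node.ParentNode.RemoveChild(node);
                removed++;
            }
        }

        doc.Save(outputPath);
        return removed;
    }
}
```

A call such as XPathRemover.RemoveMatching("input.xml", "output.xml", "//item[@obsolete='true']") would drop every item element with that (assumed) attribute value.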

Answer 4

A:

This URL has the answer you are looking for:

http://stackoverflow.com/questions/62423/how-to-update-large-xml-file

vtd-xml-author 2010-01-11 19:45:15

Note that Mr. Zhang is the author of VTD-XML.

John Saunders 2010-03-09 09:23:01
