I have two documents with a simple schema that I need to compare:
current doc:
<Sections>
<Section Number="1"/>
<Section Number="2"/>
<Section Number="4"/>
<Section Number="5"/>
</Sections>
previous doc:
<Sections>
<Section Number="1"/>
<Section Number="2"/>
</Sections>
The result of the comparison should be a list of sections that have been added to the current doc, i.e. sections in the current doc that are not in the previous doc. In this example, sections 4 and 5 are new.
The current and previous docs can each have upwards of 20,000 records. The following approach produces the results I need, but it seems like the wrong approach because it passes over the data sets multiple times and takes a while to run.
Get a list of the sections:
List<XElement> currentList = currentDoc.Descendants("Section").ToList();
Get the attribute values from the previous doc:
List<string> previousList = //get the attribute values...
//get the new sections...
var newSections = (from nodes in currentList
                   let att = nodes.Attribute("Number").Value
                   where !previousList.Contains(att)
                   select nodes);
What is a better approach that would involve fewer passes/conversions of the datasets?
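One direction that might reduce the work is sketched below. It assumes the cost comes mainly from `List<string>.Contains`, which scans the list linearly for every current section. The sketch swaps in a `HashSet<string>` for the lookup, so each document is enumerated once. The `SectionDiff` class and `FindNewSections` method are hypothetical names used only for illustration, and the sketch assumes `Number` values are compared as exact strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original code.
public static class SectionDiff
{
    public static List<XElement> FindNewSections(XDocument currentDoc, XDocument previousDoc)
    {
        // One pass over the previous doc: collect the Number values into a set.
        // The (string) cast yields null when the attribute is missing instead of throwing.
        var previousNumbers = new HashSet<string>(
            previousDoc.Descendants("Section")
                       .Select(s => (string)s.Attribute("Number")));

        // One pass over the current doc: keep sections whose Number is not in the set.
        // HashSet<T>.Contains is a hash lookup rather than a linear scan.
        return currentDoc.Descendants("Section")
                         .Where(s => !previousNumbers.Contains((string)s.Attribute("Number")))
                         .ToList();
    }
}
```

With the two example documents above, `SectionDiff.FindNewSections(currentDoc, previousDoc)` would return the `Section` elements numbered 4 and 5. A section with no `Number` attribute is treated as having a null key, so it counts as new only if the previous doc has no such section.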