views:

259

answers:

1

Do anyone come across a performance issue when deleting a first row in a 20,000+ rows Excel file using OpenXML SDK v2.0?

I am using the delete row coding suggested in the Open XML SDK document. It takes me several minutes just to delete the first row using Open XML SDK, but it only takes just a second in Excel applicaton.

I eventually found out that the bottle-neck is actually on the bubble-up approach in dealing with row deletion. There are many rows updating after the deleted row. So in my case, there are around 20,000 rows to be updated, shifting up the data row by row.

I wonder if there is any faster way to do the row deletion.

Do anybody have an idea?

+2  A: 

Well, the bad news here is: yep, that's the way it is.

You may get slightly better performance moving outside of the SDK itself into System.IO.Packaging and just creating an IEnumerable/List in like Linq-to-XML of all the rows, copy that to a new IEnumerable/List without the first row, rewrite the r attribute of <row r="?"/> to be it's place in the index, and the write that back inside <sheetData/> over existing children.

You'd need to kind of do the same for any strings in the sharedStrings.xml file - i.e. removing the <ssi>.<si> elements that were in the row that was deleted, but in this case they are now implicitly indexed, so you'd be able to get away with just outright removing them.

Otaku