This isn't an answer, more an addition to this question - I have just come across the same problem and I can give figures as edg asks for in the comment.
My test has xml which results in 244 records being inserted - so 244 nodes.
The code that I am rewriting takes on average 0.4 seconds to run.(10 tests run, spread from .56 secs to .344 secs) Performance is not the main reason the code is being rewritten, but the new code needs to perform as well or better. This old code loops the xml nodes, calling a sp to insert once per loop
The new code is pretty much just a single sp; pass the xml in; shred it.
Tests with the new code switched in show the new sp takes on average 3.7 seconds - almost 10 times slower.
My query is in the form posted in this question;
INSERT INTO some_table (column1, column2, column3)
SELECT
Rows.n.value('(@column1)[1]', 'varchar(20)'),
Rows.n.value('(@column2)[1]', 'nvarchar(100)'),
Rows.n.value('(@column3)[1]', 'int'),
FROM @xml.nodes('//Rows') Rows(n)
The execution plan appears to show that for each column, sql server is doing a separate "Table Valued Function [XMLReader]" returning all 244 rows, joining all back up with Nested Loops(Inner Join). So In my case where I am shredding from/ inserting into about 30 columns, this appears to happen separately 30 times.
I am going to have to dump this code, I don't think any optimisation is going to get over this method being inherently slow. I am going to try the sp_xml_preparedocument/OPENXML method and see if the performance is better for that. If anyone comes across this question from a web search (as I did) I would highly advise you to do some performance testing before using this type of shredding in SQL Server