I made a huge error in a gigantic XML file.

<item1>
    <item2>
        <item1>
            //.. tons of stuff...
        </item1>
    </item2>
</item1>

I need to replace the outer item1 with something else, but a plain find and replace doesn't work because it also matches the inner item1. I've tried searching on several pieces of information at once, but every find-and-replace tool I've tried works on a single line at a time, which makes that impossible, and all of the data is indented with tabs.

Any ideas?

+2  A: 

Can you use the tabs to your advantage? If it is as regular as your example then you can probably search and replace on \t\t\t<item> (or whatever syntax you need to search with tabs) with whatever else you need.

Jeff Foster
You are replacing the inner ones, but it is an idea. Also, it could be using spaces.
Martinho Fernandes
I'm using Visual Studio to edit the XML. What could I do the regular expression replacement in?
Stacey
Press Ctrl+F, then check the "Use regular expressions" checkbox. There are arrows next to the textboxes that, when pressed, show a menu of common VS regex constructs.
Martinho Fernandes
Success! I used RegexBuddy to make this replacement. It worked beautifully. Thank you very, very much.
Stacey
I tried using Visual Studio. It didn't accept any of the expressions.
Stacey
+2  A: 

If you can match a regular expression then you could match:

<item1>\n[any whitespace]<item2>

and change it to:

<item3>\n[any whitespace]<item2>

and likewise for the closing tags, where the outer </item1> comes directly after </item2>:

</item2>\n[any whitespace]</item1>

and change it to:

</item2>\n[any whitespace]</item3>

I haven't specified the [any whitespace] expression as I know it's different for different editors.
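If you would rather not depend on a particular editor's regex dialect, the same two replacements can be sketched in Python. This is only a rough sketch: the function name rename_outer and the replacement name item3 are made up for illustration, and it assumes the tags carry no attributes and that nothing but a line break and whitespace sits between the outer and inner tags. Here \s* plays the role of [any whitespace].

```python
import re

def rename_outer(xml_text, old="item1", inner="item2", new="item3"):
    """Rename <old> tags that directly wrap <inner> (hypothetical helper).

    Assumes attribute-free tags and only a newline plus whitespace
    between the outer tag and the inner tag.
    """
    old_re, inner_re = re.escape(old), re.escape(inner)
    # Opening side: <item1>, line break, any whitespace, <item2>
    xml_text = re.sub(
        rf"<{old_re}>(\r?\n\s*<{inner_re}>)",
        rf"<{new}>\g<1>",
        xml_text,
    )
    # Closing side: </item2>, line break, any whitespace, </item1>
    xml_text = re.sub(
        rf"(</{inner_re}>\r?\n\s*)</{old_re}>",
        rf"\g<1></{new}>",
        xml_text,
    )
    return xml_text
```

The inner item1 tags are left alone because they are never immediately followed by <item2> or immediately preceded by </item2>.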

ChrisF
I'm inexperienced with RegEx. What would I execute this kind of thing in?
Stacey
@Stacey - it depends on your editor. Visual Studio can do it - it's an option on the search and replace dialog, as can Notepad++. What editor are you using?
ChrisF
A: 

Use a regular expression search or other advanced search/replace method to replace the inner <item1> tag with something else temporarily (by specifying the tab characters before it as well). Then replace the remaining item1 tags, which will now be the outer ones, before changing your temporary ones back again.
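As a rough sketch of those three steps in Python (not something tested on the real file): the placeholder name, the function name and the default indentation of two tabs are all assumptions, so adjust inner_indent to whatever actually precedes the inner tags in your file. It also assumes the tags have no attributes.

```python
import re

# Hypothetical temporary name; it must not already occur in the file.
PLACEHOLDER = "__inner_item1__"

def rename_outer_via_placeholder(xml_text, inner_indent="\t\t",
                                 old="item1", new="item3"):
    if PLACEHOLDER in xml_text:
        raise ValueError("placeholder already present in the input")
    indent = re.escape(inner_indent)
    name = re.escape(old)
    # Step 1: hide the inner tags, identified by their exact indentation.
    hidden = re.sub(
        rf"^({indent}</?){name}>",
        rf"\g<1>{PLACEHOLDER}>",
        xml_text,
        flags=re.MULTILINE,
    )
    # Step 2: every remaining tag with the old name is an outer one.
    renamed = re.sub(rf"(</?){name}>", rf"\g<1>{new}>", hidden)
    # Step 3: change the temporary name back to the original.
    return renamed.replace(PLACEHOLDER, old)
```

The check at the top guards against the temporary name colliding with real content, which is the main risk with this approach.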

David M
A: 

If the XML is all formatted like that, you should be able to use regular expressions. If it isn't, you might run it through an XML formatter first to get it into that layout.

Otherwise you could read the XML with an XML-parser in a language you know, change it there and write it back to disk.
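For instance, a minimal sketch with Python's standard xml.etree.ElementTree might look like the following. The function name and the new element name item3 are placeholders, and it assumes the elements to rename are exactly the item1 elements that have an item2 child containing another item1.

```python
import xml.etree.ElementTree as ET

def rename_outer_elements(root, old="item1", inner="item2", new="item3"):
    """Rename every <old> element that has an <inner>/<old> descendant path.

    Returns the number of elements renamed (hypothetical helper).
    """
    count = 0
    # Materialise the list first so renaming does not disturb the traversal.
    for elem in list(root.iter(old)):
        if elem.find(f"{inner}/{old}") is not None:
            elem.tag = new
            count += 1
    return count
```

To use it on a file you would load it with ET.parse, call rename_outer_elements on tree.getroot(), and save with tree.write. Be aware that ElementTree's default parser discards comments and will not preserve the original formatting byte for byte, so keep a backup of the file.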

Peter Lang
A: 

xmlstarlet can help.

Ignacio Vazquez-Abrams
+1  A: 

Regex may have worked in this instance, but regex is generally NOT the best means of modifying XML.

XML is not a regular language. You should use XML tools to parse and manipulate XML data, or you will likely run into problems at some point.

One safer, more robust option is to transform the XML with an XSLT identity transform plus a template for the particular "item1" element:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

    <xsl:template match="/">
        <xsl:apply-templates />
    </xsl:template>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="item1[item2/item1]" >
        <!--Replace this literal element "NEW_ITEM_ELEMENT" with whatever name you need to change "item1" elements to: -->
        <NEW_ITEM_ELEMENT>
            <!--Select @* as well, so any attributes on the original element are carried over: -->
            <xsl:apply-templates select="@*|node()"/>
        </NEW_ITEM_ELEMENT>
    </xsl:template>

</xsl:stylesheet> 
Mads Hansen