Hey guys!

I have this huge XML file (13 MB) and it has some malformed values. Here is a sample of the XML:

<propertylist>
        <adprop index="0" proptype="type" value="Ft"/>
        <adprop index="0" proptype="category" value="Bs"/>
        <adprop index="0" proptype="subcategory" value="Bsm"/>
        <adprop index="0" proptype="description" value="MOONEN CUSTOM 58"/> 
</propertylist>

Now this is OK. But I have many other nodes whose values are not encapsulated in CDATA and need to be. The node that gives me problems is this one:

<adprop index="0" proptype="description" value=""/> 

I created this regular expression:

<adprop index="0" proptype="description" value="(.+)"\/>

to catch that node and replace it with this:

<adprop index="0" proptype="description" value="<![CDATA[\1]]>"\/>

I ran this in Notepad++ and it works.

The only problem is when the value spans multiple lines, like:

  <adprop index="0" proptype="description" value="cutter that has demonstrated her offshore capabiliti from there to the Canaries with her current owner. 

Spacious homely interior with over 2m headroom and heaps of" />

It fails with this one, and there are plenty like this one.

Can anyone help me adjust the regular expression so that it catches the value when it spans multiple lines?

Thanks

A: 

Try adding \r or \n to your regular expression to include newlines, since the dot character matches "any character except newlines." I am not sure what regular expression syntax Notepad++ accepts, but it should be listed in its help. (The editor I use, UltraEdit, allows newlines in its regex engine.)

JYelton
I tried using \n but it still doesn't match. I placed the string in a regex editor and it only matches when it's a single line :(
AntonioCS
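
For anyone who wants to try the multi-line case outside Notepad++, here is a minimal sketch in Python. It assumes the file can be read into a single string, and the function name wrap_descriptions and the sample text are made up for illustration. Two things differ from the pattern in the question: the re.DOTALL flag lets the dot match line breaks, and \s* before /> allows for the space that appears in the multi-line sample (" />). The group is also made non-greedy so that one match does not run on into the next node. A negated class such as [^"]+ in place of .+? should behave similarly in most engines, but that is not verified against Notepad++ here.

```python
import re

# Non-greedy group plus re.DOTALL so "." can cross line breaks.
# \s* tolerates optional whitespace between the closing quote and "/>".
PATTERN = re.compile(
    r'<adprop index="0" proptype="description" value="(.+?)"\s*/>',
    re.DOTALL,
)

# Same replacement as in the question, with \1 as the captured value.
REPLACEMENT = r'<adprop index="0" proptype="description" value="<![CDATA[\1]]>"/>'


def wrap_descriptions(xml_text: str) -> str:
    """Wrap every description value in CDATA, including multi-line values."""
    return PATTERN.sub(REPLACEMENT, xml_text)


# Hypothetical sample with a value that spans several lines.
sample = '''<adprop index="0" proptype="description" value="cutter with offshore experience.

Spacious homely interior" />'''

result = wrap_descriptions(sample)
print(result)
```

Reading the whole 13 MB file into memory and passing it through wrap_descriptions once should be fine at that size.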