ansaurus

Question

How do I use ColdFusion to replace text in HTML without replacing HTML tags?

Answer 1

+1 A:

What you have to do is use a lookahead to make sure that your text isn't contained within a tag. Granted, this could probably be written better, but it will get you the results you want. It will even handle tags that have attributes.

<cfset html = "<span class='me'>Text goes here, for example it also contains **span** </span>" />
<cfset wordToReplace = "span" />
<cfset html = ReReplace(html ,"(?!/?<)(#wordToReplace#)(?![^.*>]*>)","replaced","ALL")>

rip747 2010-07-09 13:53:40

Your `(?!/?<)` is back-to-front (will match `/<`) and the `.*` inside `[^.*>]` are literal characters. But even corrected, for a trivial example of how this doesn't work... consider what `<img title="spanish > forever" src="span.png" alt="spanish flag" />` will become.
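To see the problem concretely, here is a minimal sketch that runs the pattern exactly as posted (not a corrected version) against that `<img>` tag. It uses Java's `java.util.regex` as a stand-in for `ReReplace`, on the assumption that both engines treat this particular pattern the same way; the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of ColdFusion or of the original answer.
final class RegexTagPitfall {
    // The pattern from the answer above, with "span" as the word to replace.
    static final Pattern ANSWER_PATTERN =
            Pattern.compile("(?!/?<)(span)(?![^.*>]*>)");

    // Mirrors ReReplace(html, pattern, "replaced", "ALL").
    static String replaceWord(String html) {
        return ANSWER_PATTERN.matcher(html).replaceAll("replaced");
    }

    public static void main(String[] args) {
        // "span" inside the src attribute is followed by '.', which the
        // character class [^.*>] excludes, so the tag guard never fires.
        String img = "<img title=\"spanish > forever\" src=\"span.png\" alt=\"spanish flag\" />";
        System.out.println(replaceWord(img));

        // Plain text followed by a closing tag looks "inside a tag" to the
        // second lookahead, so this word is left alone.
        System.out.println(replaceWord("<p>a span of text</p>"));
    }
}
```

With this engine the first call rewrites the `src` attribute to `replaced.png`, and the second call returns its input unchanged, so the pattern both touches markup and misses ordinary text.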

Peter Boughton 2010-07-09 17:34:15

Answer 2

+3 A:

"I need a Regular Expression which does not highlight tags, but only text within the tags."

You won't find one. Not one that is fully reliable against all legal/wild HTML.

The simple reason is that Regular Expressions match Regular languages, and HTML is not even remotely a Regular language.

Even if you're very careful, you run the risk of replacing stuff you didn't want to, and not replacing stuff you did want to, simply due to how complicated HTML syntax can be.

The correct way to parse HTML is using a purpose-built HTML DOM parser.

Annoyingly, CF doesn't have one built in, though if your HTML is XHTML, you can use XmlParse and XmlSearch to run an XPath search for only the text (not tags) that matches your text. Something like `//*[contains(text(), 'span')]` should do (more details here).

If you've not got XHTML then you'll need to look at using an HTML DOM parser for Java. Google turns up plenty (I've not tried any yet, so can't give any specific recommendations).
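As a rough illustration of the parse-then-search idea, here is a sketch using the XML and XPath classes that ship with the JDK, standing in for XmlParse and XmlSearch. It assumes the input is well-formed XHTML with no external DTD to fetch. It also selects the text nodes themselves (`//text()`) rather than the elements matched by the XPath above, so that only text is rewritten. `XhtmlTextReplace` and `replaceInText` are names invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; assumes well-formed XHTML input.
final class XhtmlTextReplace {
    static String replaceInText(String xhtml, String word, String replacement) {
        try {
            // Parse the markup into a DOM tree (the XmlParse step).
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));

            // Select every text node (the XmlSearch step). Tag names and
            // attribute values are not text nodes, so they are never visited.
            NodeList texts = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//text()", doc, XPathConstants.NODESET);
            for (int i = 0; i < texts.getLength(); i++) {
                Node text = texts.item(i);
                text.setNodeValue(text.getNodeValue().replace(word, replacement));
            }

            // Serialize the tree back to a string without an XML declaration.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse or serialize the document", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<span class=\"me\">Text with span <img src=\"span.png\"/></span>";
        System.out.println(replaceInText(xhtml, "span", "replaced"));
    }
}
```

The serializer may normalise quoting or attribute order, so the output is equivalent markup rather than a byte-for-byte copy of the input with one word changed.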

Peter Boughton 2010-07-09 17:25:44

+1 - the quoted part of the question amounts to, "How can I make the wrong tool for the job do the job?"

Joel Mueller 2010-07-09 22:05:59
