I'm trying to change instances of the following line:

URL: http://www.google.com/?s= test

to

URL: <a href="http://www.google.com/?s=%20test">http://www.google.com/?s= test</a>

Note that the URL in the href is URL-encoded, while the link text keeps the URL as entered.

I've managed to parse the URL part using a very simple regex:

<cfset getFacts.fact_details = REReplace(getFacts.fact_details,
"URL:[ ]*([^#chr(13)##chr(10)#]+)",
"URL: <a href='\1' target='_blank'>\1</a>", "ALL")><!--- URL to newline into link --->

which just grabs everything after the "URL:" up to the next newline.

How can I incorporate URLEncodedFormat into this, or do it all in regex?

+4  A: 

You will need to do this in separate steps, since you can't use function calls in a RegEx.

First, get the URL location using REFind. You already have the regex for that.

Now, use mid() to grab just the URL. Store this in a variable for manipulation. Remove the URL: part, and then perform your URLEncodedFormat() call. I'd store this in a separate var, so you can display the URL as originally entered. Use these two vars to create your replacement (link) string.

Now, you can create your result by using left() and right() to extract what comes before and after your URL and inserting the replacement string between them.

Kind of a PITA, but there it is.
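
The steps above could be sketched roughly like this for a single match. This is only an illustration: the sample text and the variable names (text, match, rawUrl, hrefUrl, link, head, tail) are made up, and the encoding step is a stand-in that escapes spaces only, since URLEncodedFormat() applied to a whole URL also escapes the ":" and "/" characters.

```cfml
<!--- Sample input (hypothetical); in the question this is getFacts.fact_details --->
<cfset text = "Text1: Some text#chr(10)#URL: http://www.google.com/?s= test#chr(10)#Text2: More text">

<!--- Step 1: locate the match. With returnsubexpressions = true, REFind returns
      a struct of pos/len arrays: element 1 is the whole match, element 2 is group 1. --->
<cfset match = REFind("URL:[ ]*([^#chr(13)##chr(10)#]+)", text, 1, true)>

<cfif match.pos[1] gt 0>
  <!--- Step 2: Mid() on group 1 grabs just the URL, without the "URL:" label --->
  <cfset rawUrl = Mid(text, match.pos[2], match.len[2])>

  <!--- Step 3: build the encoded copy for the href and keep rawUrl for display.
        Assumption: only spaces are escaped here; swap in your own encoding step. --->
  <cfset hrefUrl = Replace(rawUrl, " ", "%20", "ALL")>
  <cfset link = 'URL: <a href="#hrefUrl#" target="_blank">#rawUrl#</a>'>

  <!--- Step 4: splice the link between what comes before and after the match.
        Left() and Right() expect a positive count, hence the guards. --->
  <cfset headLen = match.pos[1] - 1>
  <cfset tailLen = Len(text) - headLen - match.len[1]>
  <cfset head = "">
  <cfset tail = "">
  <cfif headLen gt 0><cfset head = Left(text, headLen)></cfif>
  <cfif tailLen gt 0><cfset tail = Right(text, tailLen)></cfif>
  <cfset text = head & link & tail>
</cfif>

<cfoutput>#HTMLEditFormat(text)#</cfoutput>
```

To handle every "URL:" line you would repeat this in a loop, passing the position just past the previous match as the third argument to REFind.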

Ben Doom
Thanks! I was looking for a quick and easy solution; I guess there is none :<
davidosomething
Lots of people ask variations on this because of the function call limitation. The problem is, CF doesn't handle RegEx natively. It hands off to the Java RegEx engine. So, you end up having to do the intermediate steps yourself.
Ben Doom
A: 

Why use regex at all? There are nice list functions that are perfectly up to the job.

<cfoutput>
  <cfset BrokenUrl = "http://www.google.com/?s= test&f=%20foo%20&g&g/=/">
  <cfset FixedUrl  = FixUnencodedUrl(BrokenUrl)>
  #HTMLEditFormat(FixedUrl)#
  <!--- prints: http://www.google.com/?s=%20test&amp;f=%20foo%20&amp;g=&amp;g%2F=%2F --->
</cfoutput>

<cffunction name="FixUnencodedUrl" returntype="string" access="public">
  <cfargument name="UrlStr" type="string" required="yes">

  <cfset var UrlPath  = ListFirst(UrlStr, "?")>
  <cfset var UrlQuery = ListRest(UrlStr, "?")>
  <cfset var NewQuery = "">
  <cfset var part     = "">
  <cfset var name     = "">
  <cfset var value    = "">

  <cfloop list="#UrlQuery#" index="part" delimiters="&">
    <cfset name  = ListFirst(part, "=")>
    <cfset value = ListRest(part, "=")>
    <!--- only encode if not already encoded --->
    <cfif name eq URLDecode(name)>
      <cfset name = URLEncodedFormat(name)>
    </cfif>
    <cfif value eq URLDecode(value)>
      <cfset value = URLEncodedFormat(value)>
    </cfif>
    <!--- build new, encoded query string --->
    <cfset NewQuery = ListAppend(NewQuery, "#name#=#value#", "&")>
  </cfloop>

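  <!--- only add "?" when there is a query string, so query-less URLs come back unchanged --->
  <cfif Len(NewQuery)>
    <cfreturn UrlPath & "?" & NewQuery>
  </cfif>
  <cfreturn UrlPath>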
</cffunction>
Tomalak
I'm pulling text from an nvarchar field in a database that contains something like ---- Text1: Some text some text\nText2: Some text some textURL: http://goodurl.com/\nText3: Some text some text\nURL: http://badurl.com/?s= testing\nText4: Some text some text ---- so you see why this wouldn't work without some even more intense parsing.
davidosomething
@davidosomething: Okay, I see. You could still use the above function along with your regex for finding URLs. It won't double-encode good URLs, so you can feed non-broken URL matches to it, too.
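
A rough sketch of that combination, assuming FixUnencodedUrl() from this answer is defined on the same page. LinkifyUrls and its local variable names are hypothetical; the pattern is the one from the question. It walks the text match by match and rebuilds it with each URL wrapped in a link.

```cfml
<!--- Hypothetical helper; relies on FixUnencodedUrl() being available --->
<cffunction name="LinkifyUrls" returntype="string" access="public" output="false">
  <cfargument name="Text" type="string" required="yes">

  <cfset var pattern = "URL:[ ]*([^#chr(13)##chr(10)#]+)">
  <cfset var result  = "">
  <cfset var start   = 1>
  <cfset var match   = "">
  <cfset var rawUrl  = "">

  <cfloop condition="start lte Len(arguments.Text)">
    <!--- pos[1]/len[1] describe the whole match, pos[2]/len[2] the captured URL --->
    <cfset match = REFind(pattern, arguments.Text, start, true)>
    <cfif match.pos[1] eq 0><cfbreak></cfif>

    <!--- copy the text between the previous match and this one --->
    <cfif match.pos[1] gt start>
      <cfset result = result & Mid(arguments.Text, start, match.pos[1] - start)>
    </cfif>

    <!--- encoded URL goes into the href, the URL as entered stays visible --->
    <cfset rawUrl = Mid(arguments.Text, match.pos[2], match.len[2])>
    <cfset result = result & 'URL: <a href="#HTMLEditFormat(FixUnencodedUrl(rawUrl))#" target="_blank">#HTMLEditFormat(rawUrl)#</a>'>

    <!--- continue searching just past this match --->
    <cfset start = match.pos[1] + match.len[1]>
  </cfloop>

  <!--- copy whatever follows the last match --->
  <cfif start lte Len(arguments.Text)>
    <cfset result = result & Mid(arguments.Text, start, Len(arguments.Text) - start + 1)>
  </cfif>

  <cfreturn result>
</cffunction>
```

Usage would then be something like <cfset getFacts.fact_details = LinkifyUrls(getFacts.fact_details)>. Lines without a "URL:" label pass through untouched.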
Tomalak