I need to find the best way to use a regex to find URLs inside a block of HTML and add a new attribute Name="true" inside the tag.

Below is an example of the HTML (which can contain embedded JS). I just need to add the new attribute Name="true" to any URL and NOT affect any embedded JS.

Example HTML:

<HTML>
    <a href="abc.aspx">
    <a href="abc.aspx">
    <a href="abc.aspx">

    <script type="javascript">
    function{
    if("somefile.aspx")
    {
    do something...
    }
    }
    </script>
</HTML>

Expected HTML:

<HTML>
    <a href="abc.aspx" Name="true">
    <a href="abc.aspx" Name="true">
    <a href="abc.aspx" Name="true">

    <script type="javascript">
    function{
    if("somefile.aspx")
    {
    do something...
    }
    }
    </script>
</HTML>
+1  A: 

Replace /href="([_allowed_characters_in_URL_]+)"/ by 'href="$1" Name="true"'.
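
A minimal C# sketch of this replacement, since the question is about .NET. The class and method names (`HrefTagger`, `AddNameAttribute`) are made up for illustration, and the character class `[A-Za-z0-9_.-]` is only a conservative guess at the allowed URL characters; widen it if your URLs contain `/`, `?`, `=`, `&` and so on. Because the pattern requires the literal `href=` prefix, the quoted "somefile.aspx" inside the script block should be left alone.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefTagger
{
    // Assumed character class for URLs; adjust it to match your real links.
    private static readonly Regex HrefPattern = new Regex(
        "href=\"([A-Za-z0-9_.-]+)\"",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    public static string AddNameAttribute(string html)
    {
        // $1 in the replacement string is substituted with the captured URL.
        return HrefPattern.Replace(html, "href=\"$1\" Name=\"true\"");
    }

    public static void Main()
    {
        string html =
            "<a href=\"abc.aspx\">\n" +
            "<script type=\"javascript\">if(\"somefile.aspx\"){}</script>";

        // Only the href attribute gains Name="true"; the script is untouched.
        Console.WriteLine(AddNameAttribute(html));
    }
}
```

Note that this matches `href` in any tag, not only `<a>`, and that running it twice on the same HTML would add the attribute twice.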

Edgar Bonet
So in .NET: `Regex rgx = new Regex("href=([.aspx]+)", RegexOptions.Compiled | RegexOptions.IgnoreCase)`. How will I get the value of href="$1"?
Murtaza RC
You should allow more than `[.aspx]` in the URL; I would accept at least `[A-Za-z0-9_.-]`. You do not have to 'get the value of href="$1"'. You just use `href="$1" Name="true"` literally as the replacement string. The `Replace` method will take care of replacing `$1` with the actual URL.
Edgar Bonet