The Uri class defaults to RFC 2396. For OpenID and OAuth, I need Uri escaping consistent with RFC 3986.

From the System.Uri class documentation:

By default, any reserved characters in the URI are escaped in accordance with RFC 2396. This behavior changes if International Resource Identifiers or International Domain Name parsing is enabled in which case reserved characters in the URI are escaped in accordance with RFC 3986 and RFC 3987.

The documentation also states that enabling this IRI mode, and thus the RFC 3986 behavior, requires adding a uri section element to machine.config and adding this to your app/web.config file:

<configuration>
  <uri>
    <idn enabled="All" />
    <iriParsing enabled="true" />
  </uri>
</configuration>

But whether this is present in the .config file or not, I'm getting the same (non-3986) escaping behavior for a .NET 3.5 SP1 app. What else do I need to do to get Uri.EscapeDataString to use the RFC 3986 rules? (specifically, to escape the reserved characters as defined in that RFC)
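
The behavior in question can be checked with a sketch like the following. The class name EscapeRepro and the sample string are illustrative assumptions, not taken from the actual app. The five characters used are the ones that RFC 2396 lists as unreserved marks but that RFC 3986 reserves as sub-delims, so an RFC 3986-style escaper would be expected to percent-encode all of them. What Uri.EscapeDataString actually returns depends on the framework version and configuration, so the sketch prints it rather than assuming it.

```csharp
using System;

internal static class EscapeRepro {
    // Characters that are unreserved in RFC 2396 but reserved (sub-delims) in RFC 3986.
    internal const string Sample = "!*'()";

    // The percent-encoded form an RFC 3986-style escaper would be expected to produce.
    internal const string ExpectedRfc3986 = "%21%2A%27%28%29";

    private static void Main() {
        // The actual result varies by framework version and configuration.
        string actual = Uri.EscapeDataString(Sample);

        Console.WriteLine("actual:   " + actual);
        Console.WriteLine("expected: " + ExpectedRfc3986);
        Console.WriteLine(actual == ExpectedRfc3986
            ? "RFC 3986 escaping is in effect"
            : "RFC 3986 escaping is NOT in effect");
    }
}
```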

A: 

What version of the framework are you using? It looks like a lot of these changes were made in the (from MSDN) ".NET Framework 3.5, 3.0 SP1, and 2.0 SP1" timeframe.

Marc Gravell
I've added that I'm using .NET 3.5 SP1 to my question. I note with some amusement that the MSDN article you link to is grossly inconsistent with itself, having invalid XML, Uri and uri used interchangeably when case sensitivity matters, and <idn enabled=true> when the value is supposed to be "all" instead of "true", as the doc itself demonstrates later. :)
Andrew Arnott
A: 

Having not been able to get Uri.EscapeDataString to take on RFC 3986 behavior, I wrote my own RFC 3986-compliant escaping method. It calls Uri.EscapeDataString and then 'upgrades' the result to RFC 3986 compliance by percent-encoding the characters that RFC 2396 leaves alone.

// Requires: using System; using System.Text;

/// <summary>
/// The set of characters that are unreserved in RFC 2396 but are NOT unreserved in RFC 3986.
/// </summary>
private static readonly string[] UriRfc3986CharsToEscape = new[] { "!", "*", "'", "(", ")" };

/// <summary>
/// Escapes a string according to the URI data string rules given in RFC 3986.
/// </summary>
/// <param name="value">The value to escape.</param>
/// <returns>The escaped value.</returns>
/// <remarks>
/// The <see cref="Uri.EscapeDataString"/> method is <i>supposed</i> to take on
/// RFC 3986 behavior if certain elements are present in a .config file.  Even if this
/// actually worked (which in my experiments it <i>doesn't</i>), we can't rely on every
/// host actually having this configuration element present.
/// </remarks>
internal static string EscapeUriDataStringRfc3986(string value) {
    // Start with RFC 2396 escaping by calling the .NET method to do the work.
    // This MAY sometimes exhibit RFC 3986 behavior (according to the documentation).
    // If it does, the escaping we do that follows it will be a no-op since the
    // characters we search for to replace can't possibly exist in the string.
    StringBuilder escaped = new StringBuilder(Uri.EscapeDataString(value));

    // Upgrade the escaping to RFC 3986, if necessary.
    for (int i = 0; i < UriRfc3986CharsToEscape.Length; i++) {
        escaped.Replace(UriRfc3986CharsToEscape[i], Uri.HexEscape(UriRfc3986CharsToEscape[i][0]));
    }

    // Return the fully-RFC3986-escaped string.
    return escaped.ToString();
}
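
As a usage sketch, the method can be dropped into a small console program like the one below. The class name Rfc3986Demo and the sample input are assumptions made for illustration. The method body is the same as above, repeated only so the example compiles on its own.

```csharp
using System;
using System.Text;

internal static class Rfc3986Demo {
    // Unreserved in RFC 2396, but reserved in RFC 3986.
    private static readonly string[] UriRfc3986CharsToEscape = new[] { "!", "*", "'", "(", ")" };

    internal static string EscapeUriDataStringRfc3986(string value) {
        // Let the framework do the RFC 2396 (or possibly RFC 3986) escaping first.
        StringBuilder escaped = new StringBuilder(Uri.EscapeDataString(value));

        // Percent-encode any of the five characters that survived the first pass.
        for (int i = 0; i < UriRfc3986CharsToEscape.Length; i++) {
            escaped.Replace(UriRfc3986CharsToEscape[i], Uri.HexEscape(UriRfc3986CharsToEscape[i][0]));
        }

        return escaped.ToString();
    }

    private static void Main() {
        // Hypothetical sample input containing all five affected characters.
        Console.WriteLine(EscapeUriDataStringRfc3986("it's (a) test!*"));
        // prints: it%27s%20%28a%29%20test%21%2A
    }
}
```

Because the second pass only touches characters that are still unescaped, the result should be the same whether or not the framework already applied RFC 3986 rules in the first pass.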
Andrew Arnott