tags:

views:

207

answers:

3

Hi All,

I am a c# dev working on some code for a website in vb.net. We use a lot of caching on a 32bit iss 6 win 2003 box and in some cases run into OutOfMemoryException exceptions. This is the code I trace it back to and would like to know if anyone else has has this...

Public Sub CreateQueryStringNodes()
    'Check for nonstandard characters'
    Dim key As String
    Dim keyReplaceSpaces As String
    Dim r As New Regex("^[-a-zA-Z0-9_]+$", RegexOptions.Compiled)

    For Each key In HttpContext.Current.Request.Form
        If Not IsNothing(key) Then
            keyReplaceSpaces = key.Replace(" ", "_")
            If r.IsMatch(keyReplaceSpaces) Then
                CreateNode(keyReplaceSpaces, HttpContext.Current.Request(key))
            End If
        End If
    Next

    For Each key In HttpContext.Current.Request.QueryString
        If Not IsNothing(key) Then
            keyReplaceSpaces = key.Replace(" ", "_")

            If r.IsMatch(keyReplaceSpaces) Then
                CreateNode(keyReplaceSpaces, HttpContext.Current.Request(key).Replace("--", "-"))
            End If
        End If
    Next
End Sub

.NET Framework Version:2.0.50727.3053; ASP.NET Version:2.0.50727.3053

error:

Exception of type 'System.OutOfMemoryException' was thrown. at Go60505(RegexRunner ) at System.Text.RegularExpressions.CompiledRegexRunner.Go() at System.Text.RegularExpressions.RegexRunner.Scan(Regex regex, String text, Int32 textbeg, Int32 textend, Int32 textstart, Int32 prevlen, Boolean quick) at System.Text.RegularExpressions.Regex.Run(Boolean quick, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat) at System.Text.RegularExpressions.Regex.IsMatch(String input) at Xcite.Core.XML.Write.CreateQueryStringNodes() at Xcite.Core.XML.Write..ctor(String IncludeSessionAndPostedData) at mysite._Default.Page_Load(Object sender, EventArgs e) at System.Web.UI.Control.OnLoad(EventArgs e) at System.Web.UI.Control.LoadRecursive() at

thanks

+1  A: 

There's a couple of articles out there that pretty much say avoid using Compiled, it doesn't really mean what people think it means sometimes. If I understand correctly, using Compiled actually permanently eats up memory for the duration of the application. Since you're on the web, the application's lifetime might be fairly long. This was supposed to be fixed/addressed in the 2.0 Framework but it looks like it hasn't.

Chris Haas
The regex could be made static: `Private Shared r As New Regex("^[-a-zA-Z0-9_]+$", RegexOptions.Compiled)` to avoid emitting code each time `CreateQueryStringNodes` is called.
0xA3
@0xA3, could point, too. I just ran a loop of 10,000 cycles against a Dim/Static RegEx declaration, the Dim took 14 seconds to complete the Static took 6 _milliseconds_ to complete. I wish more people knew when to use static variables.
Chris Haas
could -> good above
Chris Haas
You Quote Coding Horror...I believe he was talking about .Net 1.0 and 1.1. Version 2.0 and above does things differently. To quote MSDN (<a href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx">Regex Class</a>)In the .NET Framework versions 1.0 and 1.1, all compiled regular expressions, whether they were used in instance or static method calls, were cached. Starting with the .NET Framework 2.0, only regular expressions used in static method calls are cached.
OmegaMan
@OmegaMan, that's actually another different good point, too. Instance methods don't get the benefit of the RegEx cache, only Static methods do (which is counter-intuitive to me). Below's an article that explains that they fixed the memory problem but still they recommend avoiding Compile: "Therefore, the general approach to compiled Regexes should be to only use this mode for a finite set of expressions that you know will be used repeatedly. Even more specifically, avoid this mode except for one or two key Regexes..." http://msdn.microsoft.com/en-us/magazine/cc163670.aspx#S7
Chris Haas
I agree with the article and your comments as well, up to a point, but since the OP has a call to the regex processing in two separate tight loops, the memory footprint and upfront loading are worth the cost since we don't know how often this method is actually called.... IMHO: I generally go with the static method and would rewrite the code as such, getting the benefit of the caching mechanism.
OmegaMan
hi thanks for all the input, the static object performed slightly better than the pre compiled dll, turns out that the strings are not that big so this is not a bottleneck for us. I will change to static onject thought. ps by adding Trace="true" to the aspx page this disables response.output caching!!! so if you see a user that says they are running iis 6 and kernel/output caching isnt working it might be because they have tracing enabled.thanks again...
Deanvr
A: 

So once the code replaces all the spaces with '_', it then matches them? Is this some kind of validation for alpanumeric with a dash or underscore?

How much data is being processed by the regex parser?

Note your pattern (as it is now) can be changed to

^[\w-_]+$

(Note your use of Regex Compiled is perfect for this situation, unlike what the other post says).

To find out about compilation check out To truly understand what is going on see: Base Class Library Performance Tips and Tricks at the section Regular Expression Compilation. HTH

OmegaMan
thanks as well OM
Deanvr
A: 

thanks! let me me check a few things in the office tomorrow, I might try a combination of both solutions - I will write some tests and let you know. the regex parser is acting on a long string, the create node is populating a list object that is later used like xml for xslt transformation for html output, we are using response output caching on the final output. I dont think the static variable will make a difference as this is a asp.net app the CreateQueryStringNodes is only called once through the page life cycle for each page. I might put it into app cache and use it as a singleton but the work to fetch it might be less efficient - let me test it...

thanks again.

dvr