views:

95

answers:

1

I'm currently coding a file reader for fixed-width tables in VB, and the compiled application seems to be sucking down memory like there's no tomorrow. I'm working with a series of ~50 megabyte files, but after running through several, the process starts taking up about 200+ megabytes of RAM, which is way more than it should.

I've done some poking around, and I think the issue is the call to NewRow(), but don't take my word for it.

Does anyone have some tips for optimizing this? If the problem's with the NewRow() call, is there a way of clearing this out?

Code follows below:

Function LoadFixedWidthFileToDataTable(ByVal filepath As String, ByRef Colnames() As String, ByRef colwidth() As Integer) As DataTable

    Dim filetable As New DataTable
    For Each name As String In Colnames
        filetable.Columns.Add(name)
    Next

    Dim loadedfile As StreamReader
    Try
        loadedfile = New StreamReader(filepath)
    Catch io As IOException
        MsgBox(io.Message)

        Return Nothing
        Exit Function
    End Try

    Dim line As String = loadedfile.ReadLine
    Dim filerow As DataRow = filetable.NewRow
    Dim i As Integer = 0

    While Not loadedfile.EndOfStream
        line = loadedfile.ReadLine
        filerow = filetable.NewRow
        i = 0


        For Each colsize As Integer In colwidth
            Try
                filerow(i) = line.Substring(0, colsize)
                line = line.Remove(0, colsize)
            Catch ex As ArgumentOutOfRangeException ''If the line doesn't match array params
                Exit For
            End Try
            i = i + 1
        Next
        filetable.Rows.Add(filerow)
    End While

    loadedfile.Close()
    Return filetable
End Function
+1  A: 

Mikurski,

I see one problem right off the bat. You are Dim'ing the Stream reader. You should enclose this in a using block, or be sure you do a .Dispose at the end of every code path. Anything that implements IDisposable interface should be disposed or used in a using block as follows:

 Using fs As New FileStream("C:\testing.txt", FileMode.Open)
  Using sr As StreamReader = New StreamReader(fs)
   Dim message As String = sr.ReadLine()
   MessageBox.Show(message)
  End Using
 End Using

I hope this helps,

Thanks!

EDIT:

Here are the items that implement IDisposable that your code isn't disposing:

  • DataTable
  • StreamReader

Hope this helps,

Thanks again!

Here is a function that is less memory intensive:

Function LoadFixedWidthFileToDataTable(ByVal filepath As String, ByRef Colnames() As String, ByRef colwidth() As Integer) As DataTable
 Dim filetable As New DataTable
 Try
  For Each name As String In Colnames
   filetable.Columns.Add(name)
  Next

  Try
   Using loadedfile As StreamReader = New StreamReader(filepath)
    Dim line As String = loadedfile.ReadLine
    Dim filerow As DataRow = filetable.NewRow
    Dim i As Integer = 0

    While Not loadedfile.EndOfStream
     line = loadedfile.ReadLine
     filerow = filetable.NewRow
     i = 0

     For Each colsize As Integer In colwidth
      Try
       filerow(i) = line.Substring(0, colsize)
       line = line.Remove(0, colsize)
      Catch ex As ArgumentOutOfRangeException
       Exit For
      End Try
      i = i + 1
     Next

     filetable.Rows.Add(filerow)
    End While

    loadedfile.Close()

   End Using
  Catch io As IOException
   MsgBox(io.Message)

   Return Nothing
  End Try
  Return filetable
 Catch ex As Exception
  filetable.Dispose()
 End Try

 Return Nothing
End Function

I notice that you're returning the thing that you should be disposing. If you're doing this, you cannot dispose of it here. Instead, the place this function is called from must dispose of it. If you wrap a using block around this you will have issues as the returned object will be disposed before you have a chance to act upon it from the calling code.

Hope this helps,

Thanks again!

Scott
Thanks! I'm guessing that the outside call should therefore look likeUsing ThisTable as Datatable = LoadFixedwidthFiletoDatatable()...stuff...End Usingshould take care of the disposal, or should I be calling ThisTable.Dispose(), just to be safe?
mikurski
Calling Dispose is more explicit, but not needed if you're code is using the "Using" block. When your code exits the "Using" block it calls the dispose function for you. So, for example, if your code started a Using block, had an error, and "jumped" out of the using block to the exception handler, it would then call the Dispose method. However, "jumping" out of your using block via Function call is OK, cus you will be returned back to your calling code. Even then, if an exception happens in the function, you will be fine cus the code will walk the stack and exit the using block correctly.
Scott
Just checked a test run through perfmon; works great! Thanks again!
mikurski
No problem. Thanks for the accepted answer. :-)
Scott