So, I have this code that grabs a bunch of data from the database and performs a bunch of calculations on it. However, that is not what is causing the halt. The halt comes when I take all that "final" data that has been prepared and go to write it into a text file.

Each line in the text file is created from data that has had calculations performed on it. Each line "corresponds" to a record from the database. A user will usually have to perform this file export on about 30-40 thousand records at a time. Now, even though I am using a StringBuilder, this still halts because of the sheer number of records.

What I currently do in the code is create one StringBuilder, append all the data for a row (after the calculations) to it, convert it to a String, and add that String to a List(Of String) to be returned from the Function. I then clear the StringBuilder and do the same for the next row, repeating until every row has been built, converted, and inserted into the List.

Then another class receives this List(Of String), takes each String, and eventually appends it to a file.

Does anyone have an idea of how to improve this code and make it so it doesn't halt the server every time? The reason this wasn't caught until now is that when I tested it locally, I was only testing on 1-50 records at a time.

Thanks ahead of time!

Edit:

' ExportFile runs all the calculations and builds the entire export
' as one huge string (r.ResponseString), which is then converted to
' bytes and returned as a FileContentResult.
r = Class.ExportFile(ID)
data = Encoding.ASCII.GetBytes(r.ResponseString)

Return File(data, "text/plain", r.DefaultFileName)

-Scott

+2  A: 

It certainly sounds to me that you would be better off writing much of the data straight out into a file - by all means write a line at a time to a StringBuilder, but then write the resultant string out to the file that you're ultimately exporting to. This'll keep memory usage down, and if you're using a StreamWriter the writes will be buffered and performance should not suffer noticeably.
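
For illustration, a minimal sketch of that approach, where QueryTheDatabase and BuildLine are hypothetical stand-ins for the existing data access and per-record calculation code:

using System.IO;
using System.Text;

void WriteExport(string path)
{
    // StreamWriter buffers its writes, so each line goes out to disk
    // without the whole export ever being held in memory.
    using (var writer = new StreamWriter(path))
    {
        foreach (var row in QueryTheDatabase())
        {
            var sb = new StringBuilder();
            BuildLine(sb, row);              // per-record calculations
            writer.WriteLine(sb.ToString());
        }
    }
}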

Will A
I actually would prefer to do something as simple as this, because it would not require too much change to the code. However, we use a pretty separated ASP.NET MVC architecture, and the logic has been kind of split between the classes I use to return the String that will be written out to the file and the classes that actually write it (they are used in multiple places).
Scott
@Scott: Then pass a stream into your class where the data is to be written. You can then swap out a FileStream or a MemoryStream or anything else and the code that does the writing doesn't have to change.
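
For example, a rough sketch of that idea (WriteLines and its parameters are invented names):

using System.Collections.Generic;
using System.IO;

// The writer class only knows about a Stream; the caller decides whether
// that is a FileStream, a MemoryStream, or the HTTP response stream.
void WriteLines(Stream output, IEnumerable<string> lines)
{
    var writer = new StreamWriter(output);
    foreach (var line in lines)
        writer.WriteLine(line);
    writer.Flush();   // leave the underlying stream open for the caller
}
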
Billy ONeal
Hm, that makes no sense. Your code could return (!) an IEnumerable<string> to the outer class, which could then use it to get the result line by line and write it out. Separation honoured, and you're still not holding the complete string in memory.
TomTom
@Scott: Or you could pass an `IEnumerable<string>` to the class that does the writing if it should always write lines of strings and use an iterator block to generate each line as they're written.
Michael Petito
+3  A: 

Return an IEnumerable instead of a big string or a list of strings and write a line at a time to the file. That way you can evaluate the data lazily, so you don't need to keep huge strings around. e.g.

IEnumerable<string> Process() {
    var rows = QueryTheDatabase();
    foreach (var row in rows) {
        // process and build the string for one row
        yield return ProcessARecord(row);
    }
}

The caller just does a foreach over the returned IEnumerable and writes each line to a file.
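
For example, assuming the writer uses a StreamWriter and the output path is just a placeholder:

using (var writer = new StreamWriter(@"C:\exports\output.txt"))
{
    // Each line is generated lazily by the iterator and written immediately.
    foreach (var line in Process())
        writer.WriteLine(line);
}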

nos
Well, we have the string being returned to another method, where it calls -- data = Encoding.ASCII.GetBytes(OURSTRING) -- and then later converts it to the file by doing -- File(data, "text/plain", filename) --. How would this need to change?
Scott
Do you return one *gigantic* string that's the concatenation of all the data, or do you return a list of strings, one string for each of your 40k records? In the latter case you don't need to do anything in this regard. And what are you doing with the result - just writing it to a file?
nos
One method creates a list of strings. Then this method passes that list to another method, which appends each of those strings to a StringBuilder. Then it converts the whole thing to a string and passes it to what you see above.
Scott
It's using the MVC FileContentResult from the controller to finally return the file, if that helps. Someone mentioned kicking off the process in a separate .exe? I have no clue how to do that.
Scott
I added an edit showing how our code works. All the calculations and appending happen in the ExportFile call, which creates a result containing a huge string as well as the filename.
Scott
You might be better off writing the file to the filesystem in this case instead of serving it all from memory, and returning a redirect to that physical file.
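
A rough sketch of that in an MVC controller (the action name, path, and exporter are all made up):

public ActionResult Export(int id)
{
    // Hypothetical: stream the export line by line to a file on disk
    // instead of building one huge string in memory.
    var path = Server.MapPath("~/App_Data/export-" + id + ".txt");
    using (var writer = new System.IO.StreamWriter(path))
    {
        foreach (var line in exporter.GetLines(id))   // hypothetical line source
            writer.WriteLine(line);
    }
    // Serve the file from disk (or redirect to a URL that maps to it).
    return File(path, "text/plain", "export.txt");
}
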
nos
A: 

You're probably hitting memory fragmentation issues. Playing with REALLY large dynamic data structures in a 32-bit world tends to cause this. Personally I have only hit this with a 300 MB string, but I normally avoid getting into that situation, so that doesn't say too much.

As others have said, break it up so you don't have such a large string.

Loren Pechtel
A: 

Use an IEnumerable/iterator block rather than a List. The conversion is pretty simple: change the return type of the function from List<string> to IEnumerable<string>, and then wherever you currently call your list's .Add() method, change the line to use yield return instead. Then also change your other class to accept an IEnumerable<string> rather than a List<string>. You may need to do a similar conversion to other methods as well, so that the entire chain from the database is processed this way.

What this will do is make it so that you only keep one record in memory at a time. Each record will be converted to a string and processed by your output method on the fly.
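
A before/after sketch of that conversion, with invented method names:

using System.Collections.Generic;

// Before: the whole export is accumulated in memory.
List<string> GetLines()
{
    var results = new List<string>();
    foreach (var row in QueryTheDatabase())
        results.Add(ProcessARecord(row));
    return results;
}

// After: an iterator block yields one line at a time, so only the
// current record's string exists while the caller writes it out.
IEnumerable<string> GetLines()
{
    foreach (var row in QueryTheDatabase())
        yield return ProcessARecord(row);
}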

Joel Coehoorn