Update: I probably confused memory usage issues with the UI sharing the same thread as the processing (as pointed out by MusiGenesis below). Regarding the memory usage, however, I am still not able to find VB.NET-specific syntax, although people have pointed out some great .NET and C# information below (which I could adapt to VB.NET if I were more versed in those technologies).

I am creating a VB.Net application.

  • The application basically parses data files located on the client machine into DataSets/DataTables.
  • Then, using a DataView, it breaks the DataTables down into manageable chunks, writes them to XML and sends the XML data to a web service.

The general concepts are working fine; however, I am having issues where the Mem Usage in Task Manager keeps growing as the program is used to load more and more files.

On startup, before doing anything, the VB application uses 27,000 K. Once a file is parsed, the memory usage increases a lot, even after I dispose of the file handle as well as the data. Even when I strip everything else out of the code, the memory shown in Mem Usage still seems to remain captured. There is no rhyme or reason to how the Mem Usage grows (i.e. sometimes it grows by 20 MB when reading a 7 MB file, while other times it does not increase at all when reading a 3 MB file). Sometimes it appears to release some memory when the parsing is complete, and other times it just holds on to it.

I have looked at .NET Memory Profiler and have not really been able to make heads or tails of it.
I have read a lot on the internet about memory management in .NET in general (Dispose, "Nothing", DataSets, etc.), but have not really found anything regarding VB.NET specifically.

My general question is: are there any good tutorials/books/blogs/etc. that give a more in-depth treatment of managing memory in a VB.NET application (i.e. how/when to dispose/close, etc.), or does anyone have some specific tips from their experience?

+1  A: 

Memory management in VB.NET is actually handled by the .NET Framework, so in general it's the same in VB.NET as in C#. However, understanding how it works allows you to make better programming decisions: when to declare variables, when objects are disposed, and so on. In that context, I think your question could be framed as "are there any good sources for telling me how to code efficiently and for a smaller memory footprint" OR "can someone tell me why this weird stuff is happening". Both questions can be answered by giving a fuller understanding of how .NET manages memory, scope, etc. There are tons of resources that answer this.

That said, this first link has a lot of other links that would be useful to you:

http://geekswithblogs.net/sdorman/archive/2008/09/14/.net-memory-management-ndash-resources.aspx

And this second one is more to the point:

http://www.c-sharpcorner.com/UploadFile/tkagarwal/MemoryManagementInNet11232005064832AM/MemoryManagementInNet.aspx

David Stratton
+1  A: 

This is not an answer to your general question, but you can send a DataTable directly to a web service without the intermediate step of first writing it to XML. Actually, you can't send a DataTable, but you can send a DataSet (because DataSet is serializable while DataTable isn't), so you can send a DataTable directly by first wrapping it in a DataSet and then sending the DataSet. The SOAP protocol converts the DataSet to XML anyway, so you're not really gaining anything by converting the DataTable to XML yourself.
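
As a minimal VB.NET sketch of the wrapping step described above (not code from the original poster): the module, table and column names are made up for illustration, and the web service proxy call is hypothetical, so it appears only as a comment.

```vbnet
Imports System
Imports System.Data

Module WrapTableExample
    ' Wraps a single DataTable in a new DataSet so that it can be passed
    ' to a web method that accepts a DataSet.
    Function WrapInDataSet(ByVal table As DataTable) As DataSet
        Dim ds As New DataSet("Payload")   ' "Payload" is an arbitrary name
        If table.DataSet IsNot Nothing Then
            ' A DataTable can belong to only one DataSet at a time,
            ' so send a copy if it is already attached elsewhere.
            ds.Tables.Add(table.Copy())
        Else
            ds.Tables.Add(table)
        End If
        Return ds
    End Function

    Sub Main()
        ' Illustrative table; the real one would come from the file parser.
        Dim records As New DataTable("Records")
        records.Columns.Add("Id", GetType(Integer))
        records.Columns.Add("Name", GetType(String))
        records.Rows.Add(New Object() {1, "first"})

        Using ds As DataSet = WrapInDataSet(records)
            ' Hypothetical proxy call, assuming a web method that takes a DataSet:
            ' proxy.UploadRecords(ds)
            Console.WriteLine("Tables in DataSet: " & ds.Tables.Count.ToString())
        End Using
    End Sub
End Module
```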

I'm guessing from your question that your DataTables are too large to send all at once, or else you're breaking them into smaller chunks so that your client application can indicate progress to the user. This can also be done without writing the contents to XML yourself.

Regarding your general question, it is not surprising that sometimes your memory consumption grows 20mb when reading (and sending) a 7mb file. The XML used to describe a DataTable and its contents (whether you're doing it yourself or it's being serialized automagically when you send it to a web service directly) is very verbose.

Your most efficient approach to this problem would be to send the client's data files directly to the web service (either as a single byte[] array or as a series of byte[] arrays), and then process these files entirely on the server. This approach will minimize the length of time required to send each file to the server (because sending 7mb takes less time than sending 20mb or even more).

MusiGenesis
I am using dataset.writeXML to serialize the data to XML. My real problem is that even if I comment out the writeXML part of the code, the data is not released after simply parsing the binary data file. I do not have an issue with the data transfer time (not mission-critical information), and I do not want to introduce cron jobs on the server, as this data on the client machine changes over time and it is beneficial at the time of loading that the client machine keeps a log of how many records were sent, as well as some data massaging that is done on the client based on credentials, etc.
Paul
Edit: I meant to say ..."the memory is not released" (not the data)
Paul
Edit: Above, basically what I am saying is that I have to remove some distinct records from the files before sending them to the server, so I need to parse regardless, and the size of the XML is not really the issue (as this should just be a temporary bump in memory). I just want to figure out how to release the memory after the XML file has been sent to the server and the program is just waiting to parse the next file. I am not experienced enough in Windows apps to figure out or trace where this memory is being held.
Paul
@Paul: to clarify your question, are you actually experiencing shortage of memory issues, or are you just worried about this because of what you're seeing in Mem Usage? You can call Dispose on the DataSet, but I'm not sure it's even necessary if you're allowing the DataSet to go out of scope after you're done with it. Also, you could call "GC.Collect();" - this might result in an immediate drop in Mem Usage. In general, however, I wouldn't worry about this at all unless you start to run into apparent, demonstrable problems.
MusiGenesis
@Paul: sorry about the lecture in my answer - your reasons for doing things client-side are perfectly valid. However, I would just send the DataSet directly to the web service instead of first writing to an XML file, unless you already have extensive code in the web service that expects XML.
MusiGenesis
@Paul: as long as you're calling .Close() and .Dispose on your FileStream object (I assume that's what you mean by "File handle"), you shouldn't be having any running-out-of-memory problems.
MusiGenesis
I am not using the application in production; however, the issue I am having is that if I leave the app running (i.e. going through a list of 10 files every hour), the memory usage goes up further and further. I have not had a crash, but while the app is processing I have trouble moving it out of the way to use other applications. When I have similarly sized files opened in Notepad++, TextPad or MS Word, moving those applications does not cause latency or dragging issues. I do not know how to categorize this behavior, but I am assuming it is memory.
Paul
edit: and by the way, I appreciate any help, in any format (lecture format is fine with me)....thanks for even trying to help me.
Paul
@Paul: the problem you're having with not being able to move the app around while it's processing is because the code is executing on the same thread as the UI. The best way around this is to do the processing with a BackgroundWorker: http://msdn.microsoft.com/en-us/library/system.componentmodel.backgroundworker.aspx
MusiGenesis
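
The BackgroundWorker suggestion above could look roughly like this in VB.NET. This is only a sketch under assumptions, not the original poster's code: the form, the button, the file names and the ParseAndSend routine are all hypothetical stand-ins.

```vbnet
Imports System
Imports System.ComponentModel
Imports System.Windows.Forms

Public Class MainForm
    Inherits Form

    Private WithEvents worker As New BackgroundWorker()
    Private WithEvents startButton As New Button()

    Public Sub New()
        startButton.Text = "Process files"
        startButton.Dock = DockStyle.Top
        Controls.Add(startButton)
        worker.WorkerReportsProgress = True
    End Sub

    Private Sub startButton_Click(ByVal sender As Object, ByVal e As EventArgs) _
        Handles startButton.Click
        If Not worker.IsBusy Then
            startButton.Enabled = False
            ' Hypothetical file list; passed to DoWork via e.Argument.
            worker.RunWorkerAsync(New String() {"file1.dat", "file2.dat"})
        End If
    End Sub

    ' Runs on a background thread, so the UI stays responsive.
    ' Do not touch controls from here.
    Private Sub worker_DoWork(ByVal sender As Object, ByVal e As DoWorkEventArgs) _
        Handles worker.DoWork
        Dim files As String() = DirectCast(e.Argument, String())
        For i As Integer = 0 To files.Length - 1
            ParseAndSend(files(i))
            worker.ReportProgress((i + 1) * 100 \ files.Length)
        Next
    End Sub

    ' Raised on the UI thread; safe to update controls.
    Private Sub worker_ProgressChanged(ByVal sender As Object, ByVal e As ProgressChangedEventArgs) _
        Handles worker.ProgressChanged
        Text = e.ProgressPercentage.ToString() & "% done"
    End Sub

    ' Raised on the UI thread when DoWork finishes or throws.
    Private Sub worker_RunWorkerCompleted(ByVal sender As Object, ByVal e As RunWorkerCompletedEventArgs) _
        Handles worker.RunWorkerCompleted
        startButton.Enabled = True
        If e.Error IsNot Nothing Then
            MessageBox.Show(e.Error.Message)
        End If
    End Sub

    ' Hypothetical placeholder for the real parse-and-upload routine.
    Private Sub ParseAndSend(ByVal filePath As String)
        System.Threading.Thread.Sleep(500)
    End Sub

    <STAThread()> _
    Public Shared Sub Main()
        Application.Run(New MainForm())
    End Sub
End Class
```

The window can be dragged freely while DoWork runs, because only the ProgressChanged and RunWorkerCompleted handlers execute on the UI thread.
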
@Paul: the memory use going up and up without ever going down does suggest that maybe you have a problem with something not being released or disposed, but I'm not sure. I'd go with whatever Scott Dorman says about it.
MusiGenesis
@MusiGenesis: This is very helpful information, thank you.
Paul
A: 

The best book to get on the subject that I've read is Jeff Richter's book, CLR via C#:

http://www.amazon.com/CLR-via-Second-Pro-Developer/dp/0735621632/ref=sr%5F1%5F1?ie=UTF8&qid=1252853101&sr=8-1-spell

If you want a VB.NET version, they have that for the first edition of the book, but I don't think there was enough interest to translate the book into VB.NET for the second version. If you want to really learn .NET, you should get comfortable with C#. In both languages, memory is managed by the CLR.

Dave Markle
+3  A: 

First, you need to realize that Task Manager is showing you the amount of memory the operating system has allocated to your application. This is not necessarily the amount of memory actually being used. When a .NET application first starts, the operating system allocates memory for it, just as it does for any process. The .NET runtime then further divides that memory and manages how it is used. The runtime can be thought of as "greedy" in that once allocated memory by the operating system it won't give it back unless specifically asked to by the operating system. The result is that the memory usage in Task Manager is not accurate.

To get a true picture of your memory usage, you need to use Performance Monitor and add the appropriate counters.
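
In Performance Monitor, the ".NET CLR Memory" category (for example its "# Bytes in all Heaps" counter) shows the managed heap, which can be compared against the process-level figures. The same distinction can also be seen from code. The following VB.NET sketch is an illustration only, not something from this answer; the module name, the labels and the 20 MB buffer are arbitrary choices.

```vbnet
Imports System
Imports System.Diagnostics

Module MemorySnapshot
    ' Prints the managed heap size next to the process working set,
    ' which is roughly what Task Manager reports as Mem Usage.
    Sub PrintMemory(ByVal label As String)
        Dim managedBytes As Long = GC.GetTotalMemory(False)
        Dim workingSetBytes As Long
        Using proc As Process = Process.GetCurrentProcess()
            workingSetBytes = proc.WorkingSet64
        End Using
        Console.WriteLine("{0}: managed heap = {1:N0} KB, working set = {2:N0} KB", _
                          label, managedBytes \ 1024, workingSetBytes \ 1024)
    End Sub

    Sub Main()
        PrintMemory("start")

        ' Allocate roughly 20 MB of managed memory.
        Dim buffer(20 * 1024 * 1024 - 1) As Byte
        buffer(0) = 1
        PrintMemory("after allocating")

        ' Drop the reference and force a collection, for demonstration only.
        buffer = Nothing
        GC.Collect()
        GC.WaitForPendingFinalizers()
        PrintMemory("after collection")
    End Sub
End Module
```

The two numbers are not expected to move in step: the managed heap figure can drop after a collection while the working set stays higher, which is the behavior described above.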

As far as IDisposable and the dispose pattern, you probably won't find much that talks about this in language specific terms since it is something provided by the .NET Framework itself and is language agnostic. The pattern is the same no matter what language you use, only the syntax is different.

There are several references available that will give you information on how memory management works. I have two blog posts, one which talks about Using Garbage Collection in .NET and one which lists the various resources I used to create two presentations on memory management in .NET.

The best "rule of thumb" is that if a class implements IDisposable, it does so for a reason and you should ensure that you are calling Dispose() when you are done using the instance. This is most easily accomplished with the using statement.
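
In VB.NET the using statement is the Using ... End Using block. The sketch below is a minimal illustration rather than the original poster's code: the file paths, table layout and the "parsing" step are placeholders.

```vbnet
Imports System
Imports System.Data
Imports System.IO

Module UsingExample
    ' Parses a file into a DataSet and writes it out as XML.
    ' Both disposable objects are released when their Using block ends,
    ' even if an exception is thrown inside it.
    Sub ProcessFile(ByVal inputPath As String, ByVal xmlPath As String)
        Using stream As New FileStream(inputPath, FileMode.Open, FileAccess.Read)
            Using data As New DataSet("Parsed")
                Dim table As DataTable = data.Tables.Add("Records")
                table.Columns.Add("Length", GetType(Long))
                ' Placeholder for the real parsing logic.
                table.Rows.Add(New Object() {stream.Length})
                data.WriteXml(xmlPath)
            End Using   ' data.Dispose() is called here
        End Using       ' stream.Dispose() is called here, closing the file
    End Sub

    Sub Main()
        ' Temporary files stand in for the real client data files.
        Dim inputPath As String = Path.GetTempFileName()
        Dim xmlPath As String = Path.GetTempFileName()
        File.WriteAllBytes(inputPath, New Byte() {1, 2, 3})

        ProcessFile(inputPath, xmlPath)
        Console.WriteLine(File.ReadAllText(xmlPath))

        File.Delete(inputPath)
        File.Delete(xmlPath)
    End Sub
End Module
```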

Scott Dorman
@Scott, I am reading through these Docs, and appreciate the pointers. I will try to use "Using" again (I had tried this in the past). Please see my comments below to MusiGenesis regarding further clarification of my issue (however I will also update my original post to clarify)
Paul
@Paul: Without knowing more details of what you are doing in code it's difficult to help narrow down where you might be encountering an issue. Make sure you are disposing of any disposable objects as soon as you are done with them. Also, look for things like string concatenation inside of loops. You don't specify what version of the .NET Framework you are using but if it's 3.0 or later you might want to look at using the binary serializer to send the datasets across the wire to the webservice, if possible.
Scott Dorman
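
To illustrate the string concatenation point in the comment above, here is a small VB.NET sketch (names are illustrative, not from the thread). Concatenating with &= in a loop creates a new String on every pass, while StringBuilder appends into a growing buffer.

```vbnet
Imports System
Imports System.Text

Module ConcatExample
    ' Each &= allocates a new String holding everything built so far,
    ' leaving the previous one as garbage for the collector.
    Function JoinWithConcat(ByVal lines As String()) As String
        Dim result As String = ""
        For Each line As String In lines
            result &= line & Environment.NewLine
        Next
        Return result
    End Function

    ' StringBuilder appends into an internal buffer and builds the
    ' final String once, at ToString().
    Function JoinWithBuilder(ByVal lines As String()) As String
        Dim builder As New StringBuilder()
        For Each line As String In lines
            builder.AppendLine(line)
        Next
        Return builder.ToString()
    End Function

    Sub Main()
        Dim lines As String() = {"alpha", "beta", "gamma"}
        ' Both functions produce the same text; only the allocations differ.
        Console.WriteLine(JoinWithConcat(lines) = JoinWithBuilder(lines))
    End Sub
End Module
```
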
+2  A: 

If I were you, I would first of all use a profiler to see exactly what the application is doing. There are several: JetBrains, RedGate, YourKit. From there you can see exactly where the memory is not being released.

Then you can see exactly where you need to concentrate to correct the issue.

Conrad