views:

1313

answers:

5

Hello everyone,

I am using VSTS 2008 + C# + .NET 3.5 to run this console application on x64 Server 2003 Enterprise with 12 GB of physical memory.

Here is my code. When executing the statement bformatter.Serialize(stream, table), an out-of-memory exception is thrown. I monitored memory usage through the Performance tab of Task Manager, and only about 2 GB of physical memory is in use when the exception is thrown, so it should not be out of memory. :-(

Any ideas what is wrong? Any limitation of .Net serialization?

    using System;
    using System.Data;
    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;

    static DataTable MakeParentTable()
    {
        // Create a new DataTable.
        System.Data.DataTable table = new DataTable("ParentTable");
        // Declare variables for DataColumn and DataRow objects.
        DataColumn column;
        DataRow row;

        // Create new DataColumn, set DataType, 
        // ColumnName and add to DataTable.    
        column = new DataColumn();
        column.DataType = System.Type.GetType("System.Int32");
        column.ColumnName = "id";
        column.ReadOnly = true;
        column.Unique = true;
        // Add the Column to the DataColumnCollection.
        table.Columns.Add(column);

        // Create second column.
        column = new DataColumn();
        column.DataType = System.Type.GetType("System.String");
        column.ColumnName = "ParentItem";
        column.AutoIncrement = false;
        column.Caption = "ParentItem";
        column.ReadOnly = false;
        column.Unique = false;
        // Add the column to the table.
        table.Columns.Add(column);

        // Make the ID column the primary key column.
        DataColumn[] PrimaryKeyColumns = new DataColumn[1];
        PrimaryKeyColumns[0] = table.Columns["id"];
        table.PrimaryKey = PrimaryKeyColumns;

        // Create 5,000,001 new DataRow objects and add
        // them to the DataTable.
        for (int i = 0; i <= 5000000; i++)
        {
            row = table.NewRow();
            row["id"] = i;
            row["ParentItem"] = "ParentItem " + i;
            table.Rows.Add(row);
        }

        return table;
    }

    static void Main(string[] args)
    {
        DataTable table = MakeParentTable();
        Stream stream = new MemoryStream();
        BinaryFormatter bformatter = new BinaryFormatter();
        bformatter.Serialize(stream, table);   // out of memory exception here
        Console.WriteLine(table.Rows.Count);

        return;
    }

thanks in advance, George

+3  A: 

Re the out-of-memory / 2GB; individual .NET objects (such as the byte[] behind a MemoryStream) are limited to 2GB. Perhaps try writing to a FileStream instead?

(edit: nope: tried that, still errors)

I also wonder if you may get better results (in this case) using table.WriteXml(stream), perhaps with compression such as GZIP if space is at a premium.
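The WriteXml-plus-compression suggestion can be sketched like this (a minimal example; the file name and small row count are illustrative, not from the question):

```csharp
using System;
using System.Data;
using System.IO;
using System.IO.Compression;

class WriteXmlDemo
{
    static void Main()
    {
        DataTable table = new DataTable("ParentTable");
        table.Columns.Add("id", typeof(int));
        table.Columns.Add("ParentItem", typeof(string));
        for (int i = 0; i < 1000; i++)               // small row count for illustration
            table.Rows.Add(i, "ParentItem " + i);

        // Stream straight to disk through GZip: no single giant in-memory buffer.
        using (FileStream fs = new FileStream("table.xml.gz", FileMode.Create))
        using (GZipStream gz = new GZipStream(fs, CompressionMode.Compress))
        {
            table.WriteXml(gz, XmlWriteMode.WriteSchema);
        }
    }
}
```

Because the data flows through the GZipStream to the file as it is produced, the peak managed allocation stays small regardless of table size.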

Marc Gravell
Hi Marc, thanks for your valuable reply. For the 2 GB size limitation on byte[], could you recommend some documents about this topic? I never knew this and want to learn more of the background.
George2
"edit: nope: tried that, still errors" -- what do you mean by "tried that, still errors"? Which method still has errors?
George2
Marc, I have tried WriteXml and it works, but the resulting file size is only 520 MB, far less than 2 GB. How do you prove my original code hits the 2 GB limitation?
George2
Re last... I don't; I merely point out that even with a stonking amount of memory, some things remain capped in size. Arrays (which underpin other things like lists and MemoryStream) and strings being the most obvious. If you are hitting some *other* limitation of `BinaryFormatter`, then to be honest you're scuppered no matter whether it is system memory, list size, or simply karma. So if WriteXml works, stick with that ;-p
Marc Gravell
Re still errors: I mean that switching to a FileStream and using your code "as is" still has the problem; so it isn't necessarily *purely* limited to MemoryStream limitations; but note that since BinaryFormatter performs object tracking internally there are a lot of other things going on anyway.
Marc Gravell
Re 2GB; this might be in the CLI spec (ECMA335) - I can't be sure; but it is discussed more here: http://blogs.msdn.com/joshwil/archive/2005/08/10/450202.aspx
Marc Gravell
As for the file size compared to memory: part of the answer is that strings in memory in .NET are UTF-16, but (most likely) UTF-8 when written out. This means that in your test the string data takes half as many bytes on disk as in memory.
Mikael Svenson
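Mikael's point is easy to verify: .NET strings are UTF-16 in memory (2 bytes per character for ASCII-range text), while XML output is typically UTF-8 (1 byte per character for the same text). A quick check (editor's sketch, sample string is illustrative):

```csharp
using System;
using System.Text;

class EncodingSizes
{
    static void Main()
    {
        string s = "ParentItem 4999999";                 // ASCII-range characters only
        int inMemory = Encoding.Unicode.GetByteCount(s); // UTF-16: 2 bytes per char
        int onDisk   = Encoding.UTF8.GetByteCount(s);    // UTF-8:  1 byte per char here
        Console.WriteLine("{0} vs {1} bytes", inMemory, onDisk); // prints "36 vs 18 bytes"
    }
}
```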
Thanks Marc and Mikael, your replies are so great! I have a related question here, appreciate if you could help. http://stackoverflow.com/questions/1297797/windows-32-bit-virtual-memory-page-mapping-issue
George2
+1  A: 

1) The OS is x64, but is the app x64 (or AnyCPU)? If not, it is capped at 2GB.

2) Does this happen 'early on', or after the app has been running for some time (i.e. n serializations later)? Could it maybe be a result of large object heap fragmentation...?

KristoferA - Huagati.com
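On .NET 3.5 there is no Environment.Is64BitProcess, but pointer size tells you whether the process itself (as opposed to the OS) is actually running 64-bit. A quick diagnostic sketch (editor's addition, not from the answer):

```csharp
using System;

class BitnessCheck
{
    static void Main()
    {
        bool is64BitProcess = IntPtr.Size == 8; // 4 in a 32-bit process, 8 in a 64-bit one
        Console.WriteLine(is64BitProcess
            ? "64-bit process: address space is not capped at 2GB."
            : "32-bit process: user address space is capped (2GB by default).");
    }
}
```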
Thanks KristoferA, for your concerns: 1. I build as Any CPU, so it should be able to consume more than 2 GB of physical memory, correct? 2. The out-of-memory exception occurs after about 1 minute, after method MakeParentTable returns successfully. So I think it should not be a result of large object heap fragmentation? Any comments?
George2
+2  A: 

Have a look here: Out of Memory Does Not Refer to Physical Memory.

Tim S. Van Haren
Thanks! I read this great document. I am wondering what you think the root cause of the out-of-memory in my issue is: memory block fragmentation? Must the serialization stream use contiguous memory?
George2
The serialization stream has to be contiguous memory if its backing field is a byte[]. Maybe you could implement your own stream class that splits the data into chunks no larger than a couple hundred megabytes each.
Cecil Has a Name
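Cecil's chunking idea can be sketched as a write-only Stream that buffers into fixed-size arrays, so no single object approaches the 2GB cap. The class name and chunk size are illustrative, and a real implementation would also need reading and seeking for BinaryFormatter to be able to use it:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Write-only sketch: data is buffered in fixed-size chunks, so no single
// array ever approaches the 2GB per-object limit.
class ChunkedWriteStream : Stream
{
    const int ChunkSize = 64 * 1024 * 1024; // 64MB per chunk, well under the 2GB cap
    readonly List<byte[]> chunks = new List<byte[]>();
    long length;

    public override void Write(byte[] buffer, int offset, int count)
    {
        while (count > 0)
        {
            int posInChunk = (int)(length % ChunkSize);
            if (posInChunk == 0) chunks.Add(new byte[ChunkSize]); // start a new chunk
            int n = Math.Min(count, ChunkSize - posInChunk);      // fill current chunk
            Buffer.BlockCopy(buffer, offset, chunks[chunks.Count - 1], posInChunk, n);
            offset += n; count -= n; length += n;
        }
    }

    public override long Length { get { return length; } }
    public override bool CanRead  { get { return false; } }
    public override bool CanSeek  { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override void Flush() { }
    public override int Read(byte[] b, int o, int c) { throw new NotSupportedException(); }
    public override long Seek(long o, SeekOrigin s) { throw new NotSupportedException(); }
    public override void SetLength(long v) { throw new NotSupportedException(); }
    public override long Position
    {
        get { return length; }
        set { throw new NotSupportedException(); }
    }
}
```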
Thanks tsvanharen, your reply is so great! I have a related question here, appreciate if you could help. http://stackoverflow.com/questions/1297797/windows-32-bit-virtual-memory-page-mapping-issue
George2
+1  A: 

Interestingly, it actually goes up to 3.7GB before giving a memory error here (Windows 7 x64). Apparently, it would need about double that amount to complete.

Given that the application uses 1.65GB after creating the table, it seems likely that it's hitting the 2GB byte[] (or any single object) limit Marc Gravell is speaking of (1.65GB + 2GB ~= 3.7GB)

Based on this blog, I suppose you could allocate your memory using the WINAPI, and write your own MemoryStream implementation using that. That is, if you really wanted to do this. Or write one using more than one array of course :)

Thorarin
Hi Thorarin, 1. for the 2 GB size limitation on byte[], I want to learn more; do you have any related documents? 2. Whose footprints do the numbers 1.65GB and 2GB refer to in your calculation?
George2
1.65GB for the DataTable itself. All I could find on the 2GB limit (other than that it exists) is http://blogs.msdn.com/joshwil/archive/2005/08/10/450202.aspx
Thorarin
Thanks Thorarin, 1. I understand the 2GB limitation, but how do you prove the 2GB is actually reached? :-) 2. For the 1.65GB, how did you calculate such a precise number? :-)
George2
Thanks Thorarin, your reply is so great! I have a related question here, appreciate if you could help. http://stackoverflow.com/questions/1297797/windows-32-bit-virtual-memory-page-mapping-issue
George2
+3  A: 

As already discussed, this is a fundamental issue with trying to get contiguous blocks of memory in the gigabyte size range.

You will be limited by (in increasing order of difficulty):

  1. The amount of addressable memory
  2. The CLR's limitation that no single object may consume more than 2GB of space.
  3. Finding a contiguous block within the available memory.

You can find that you run out of space before hitting the CLR limit in point 2, because the backing buffer in the stream is expanded in a 'doubling' fashion, which swiftly results in the buffer being allocated on the Large Object Heap. This heap is not compacted in the same way the other heaps are (1), and as a result the process of growing the buffer toward the theoretical maximum in point 2 fragments the LOH, so that you fail to find a sufficiently large contiguous block before that limit is reached.

Thus, if you are close to the limit, a mitigation is to set the initial capacity of the stream via one of the constructors so that it definitely has sufficient space from the start.
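For example (the size here is illustrative; you would substitute your own estimate of the final serialized size):

```csharp
using System;
using System.IO;

class PresizedStream
{
    static void Main()
    {
        // Pre-sizing allocates the backing byte[] once, avoiding the
        // doubling-growth reallocations that fragment the Large Object Heap.
        const int estimatedBytes = 16 * 1024 * 1024; // hypothetical size estimate
        using (MemoryStream stream = new MemoryStream(estimatedBytes))
        {
            // new BinaryFormatter().Serialize(stream, table); // as in the question
            Console.WriteLine(stream.Capacity);              // prints "16777216"
        }
    }
}
```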

Given that you are writing to the memory stream as part of a serialization process it would make sense to actually use streams as intended and use only the data required.

  • If you are serializing to some file based location then stream it into that directly.
  • If this is data going into a SQL Server database, consider using:
  • If you are serializing this in memory for use in say a comparison then consider streaming the data being compared as well and diffing as you go along.
  • If you are persisting an object in memory to recreate it later, then it really should be going to a file or a memory-mapped file. In both cases the operating system is then free to structure it as best it can (in disk caches, or pages mapped in and out of main memory), and it is likely to do a better job of this than most people can do themselves.
  • If you are doing this so that the data can be compressed then consider using streaming compression. Any block based compression stream can be fairly easily converted into a streaming mode with the addition of padding. If your compression API doesn't support this natively consider using one that does or writing the wrapper to do it.
  • If you are doing this to write to a byte buffer which is then pinned and passed to an unmanaged function, then use UnmanagedMemoryStream instead; this stands a slightly better chance of being able to allocate a buffer of this sort of size, but is still not guaranteed to do so.
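The last bullet can be sketched like this (requires compiling with /unsafe; the buffer size is illustrative, and allocation can still fail if the address space is fragmented):

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

class UnmanagedBufferDemo
{
    unsafe static void Main()
    {
        long size = 64L * 1024 * 1024; // illustrative; the buffer lives outside the GC heaps
        IntPtr mem = Marshal.AllocHGlobal(new IntPtr(size));
        try
        {
            using (var stream = new UnmanagedMemoryStream((byte*)mem, 0, size, FileAccess.ReadWrite))
            {
                stream.WriteByte(42);                  // write into the unmanaged buffer
                Console.WriteLine(stream.Length);      // prints "1"
            }
        }
        finally
        {
            Marshal.FreeHGlobal(mem);                  // unmanaged memory is not garbage collected
        }
    }
}
```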

Perhaps if you tell us what you are serializing an object of this size for we might be able to tell you better ways to do it.


  1. This is an implementation detail you should not rely on
ShuggyCoUk
Thanks ShuggyCoUk, your reply is so great! I have a related question here, appreciate if you could help. http://stackoverflow.com/questions/1297797/windows-32-bit-virtual-memory-page-mapping-issue
George2