+2  Q: 

C# MemoryStream

I have an extremely large 2D byte array in memory.

byte[][] MyBA = new byte[int.MaxValue][]; // each row is a new byte[10]

Is there any way (probably unsafe) that I can fool C# into thinking this is one huge contiguous byte array? I want to do this so that I can pass it to a MemoryStream and then a BinaryReader.

MyReader = new BinaryReader(MemoryStream(*MyBA)) //Syntax obviously made-up here

Moon

A: 

You can create a MemoryStream and then write the array into it row by row using the Write method.

EDIT: The limit of a MemoryStream is essentially the amount of memory available to your application. There may be a lower limit than that, but if you need more memory you should consider modifying your overall architecture: for example, process your data in chunks, or implement a swapping mechanism to a file.
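For example, a minimal sketch (assuming MyBA is the jagged byte[][] from the question):

```csharp
var ms = new MemoryStream();
foreach (byte[] row in MyBA)
{
    ms.Write(row, 0, row.Length); // append each row to the stream
}
ms.Seek(0, SeekOrigin.Begin);     // rewind before reading
var reader = new BinaryReader(ms);
```

Note that a single MemoryStream is still limited to int.MaxValue bytes in total.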

schoetbi
Yes - I do that. But I believe that the MemoryStream still has a maximum limit to how much you can write to it, which is the same as the maximum size of a byte array...
ManInMoon
+1  A: 

I understand your problem, but why are you trying to allocate such a big array? I have no doubt that even if you solve this technical issue, the architecture is wrong and the program you are trying to develop won't work in the end.

Gilad
Gilad, I need to allocate a big array to be able to process it in-memory. Speed is the critical issue for me. Plus I have many threads all processing the same data at once. I understand your concern, but the architecture already works with data up to the arbitrary limit that is the maximum size of a byte array. If I could remove this limit (or fool C#) then I see no reason why it should not continue to work.
ManInMoon
+1  A: 

Agreed. In any case, you are bound by the size limit of the array itself.

If you really need to operate on huge arrays through a stream, write your own custom memory stream class.

Dmitry Karpezo
Yes - I have considered that, but writing a custom stream class brings me a separate issue: reading across byte-array boundaries. I have another question open on that one.
ManInMoon
+4  A: 

I do not believe .NET provides this, but it should be fairly easy to write your own implementation of System.IO.Stream that seamlessly switches between backing arrays. Here are the (untested) basics:

public class MultiArrayMemoryStream : System.IO.Stream
{
    byte[][] _arrays;
    long _position;
    int _arrayNumber;
    int _posInArray;

    public MultiArrayMemoryStream(byte[][] arrays)
    {
        _arrays = arrays;
        _position = 0;
        _arrayNumber = 0;
        _posInArray = 0;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = 0;
        while (read < count)
        {
            // No more backing arrays: return what we have so far.
            if (_arrayNumber >= _arrays.Length)
            {
                return read;
            }
            int remainingInArray = _arrays[_arrayNumber].Length - _posInArray;
            if (count - read <= remainingInArray)
            {
                // The current array can satisfy the rest of the request.
                Buffer.BlockCopy(_arrays[_arrayNumber], _posInArray, buffer, offset + read, count - read);
                _posInArray += count - read;
                _position += count - read;
                read = count;
            }
            else
            {
                // Copy the remainder of the current array and move to the next one.
                Buffer.BlockCopy(_arrays[_arrayNumber], _posInArray, buffer, offset + read, remainingInArray);
                read += remainingInArray;
                _position += remainingInArray;
                _arrayNumber++;
                _posInArray = 0;
            }
        }
        return read;
    }

    public override long Length
    {
        get
        {
            long res = 0;
            for (int i = 0; i < _arrays.Length; i++)
            {
                res += _arrays[i].Length;
            }
            return res;
        }
    }

    public override long Position
    {
        get { return _position; }
        set { throw new NotSupportedException(); }
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }

    public override void Flush() { }

    // Note: Stream.Seek returns long, not void; this stream is not seekable anyway.
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }

    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
}
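Usage would then be straightforward (a sketch, assuming MyBA is the byte[][] from the question):

```csharp
using (var stream = new MultiArrayMemoryStream(MyBA))
using (var reader = new System.IO.BinaryReader(stream))
{
    // reader now reads transparently across the array boundaries
    byte[] chunk = reader.ReadBytes(1024);
}
```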

Another way to work around the size limitation of 2^31 bytes is UnmanagedMemoryStream, which implements System.IO.Stream on top of an unmanaged memory buffer (which can be as large as the OS supports). Something like this might work (untested):

var fileStream = new FileStream("data", 
  FileMode.Open, 
  FileAccess.Read, 
  FileShare.Read, 
  16 * 1024, 
  FileOptions.SequentialScan);
long length = fileStream.Length;
IntPtr buffer = Marshal.AllocHGlobal(new IntPtr(length)); // the IntPtr overload allows allocations larger than int.MaxValue
// Note: the byte* cast requires compiling with /unsafe.
var memoryStream = new UnmanagedMemoryStream((byte*) buffer.ToPointer(), length, length, FileAccess.ReadWrite);
fileStream.CopyTo(memoryStream);
memoryStream.Seek(0, SeekOrigin.Begin);
// work with the UnmanagedMemoryStream
Marshal.FreeHGlobal(buffer);
Rasmus Faber
Rasmus - I hadn't heard of that - I will look it up - thank you
ManInMoon
Rasmus - that looks interesting.
ManInMoon
Could you guide me as to how best to load a bytestream from disk into UnmanagedMemoryStream in the example?
ManInMoon
@ManInMoon: Try this.
Rasmus Faber
Thanks Rasmus - I am working through what you have given me here.
ManInMoon
Rasmus - I can't make this work. There's practically no documentation or examples of using SafeBuffer, and I get a "cannot create an instance of ... SafeBuffer" error.
ManInMoon
Any further clues? Appreciate your help with this.
ManInMoon
@ManInMoon: I did warn that it was untested ;-) Apparently SafeBuffer is abstract, so you cannot use that directly. Instead just allocate the memory directly as seen in the edited answer.
Rasmus Faber
Thanks Rasmus, but here we have a similar problem: AllocHGlobal expects an integer. So the boundary is there again.
ManInMoon
@ManInMoon: AllocHGlobal has an overload which accepts an IntPtr. This can be used to allocate more memory than can be held in an integer. As I write in the updated example above: `Marshal.AllocHGlobal(new IntPtr(fileStream.Length))`.
Rasmus Faber
A: 

I think you can use a linear structure instead of a 2D structure, using the following approach.

Instead of byte[int.MaxValue][10], you can use a single byte[int.MaxValue*10]. With 10 columns, you would address the item at [i,j] as i*10+j (in general, i*numberOfColumns + j for zero-based indices).

Of course you could use the other convention.
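A sketch of the row-major addressing (the row count here is illustrative; as the comments point out, the total length must still fit within the single-array size limit, so int.MaxValue*10 elements would not actually be allocatable):

```csharp
const int Columns = 10;
int rows = 1000; // illustrative; rows * Columns must stay below the array size limit
byte[] flat = new byte[rows * Columns];

// Row-major mapping from 2D coordinates to the flat index.
byte Get(int i, int j) => flat[i * Columns + j];
void Set(int i, int j, byte value) => flat[i * Columns + j] = value;
```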

DaeMoohn
The reason why he's using the 2D structure is that he's going past the size limit of a single byte array.
Dave
good point, int.MaxValue!
DaeMoohn
A: 

If I understand your question correctly, you've got a massive file that you want to read into memory and then process. But you can't do this because the amount of data in the file exceeds the capacity of any single-dimensional array.

You mentioned that speed is important, and that you have multiple threads running in parallel to process the data as quickly as possible. If you're going to have to partition the data for each thread anyway, why not base the number of threads on the number of byte[int.MaxValue] buffers required to cover everything?

Dave
Sorry, I should have made that clear. Each of my threads runs over the whole data set.
ManInMoon
I see. So you're doing something like applying multiple filters on the data, and not using threads to process one set of data more quickly? Just trying to see if there's another approach for you to use to get around this memory limitation.
Dave
A: 

If you are using Framework 4.0, you have the option of working with a MemoryMappedFile. Memory-mapped files can be backed by a physical file or by the Windows swap file. Memory-mapped files act like an in-memory stream, transparently swapping data to/from the backing storage if and when required.

If you are not using Framework 4.0, you can still use this option, but you will need to either write your own wrapper or find an existing one. I expect there are plenty on CodeProject.
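A minimal sketch of the Framework 4.0 API (the file name "data" is illustrative):

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;

using (var mmf = MemoryMappedFile.CreateFromFile("data", FileMode.Open))
using (var stream = mmf.CreateViewStream())
using (var reader = new BinaryReader(stream))
{
    // Process serially; the OS pages data in and out on demand,
    // so the whole file never has to fit in a managed array.
    byte[] chunk = reader.ReadBytes(4096);
}
```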

Chris Taylor
Thanks Chris. I tried that option but MemoryMappedFiles are very slow.
ManInMoon
@ManInMoon, that is unfortunate. The performance hit is probably because of the transition from user space to kernel space, since memory-mapped files are kernel objects.
Chris Taylor
@ManInMoon, what is the source of the data? Is it being read from a file into memory?
Chris Taylor
Yes - as a byte stream that I then process serially.
ManInMoon