I am using Delphi 2009 with Unicode strings.

I'm trying to decode the bytes of a very large file into a Unicode string:

var
  Buffer: TBytes;
  Value: string;

Value := Encoding.GetString(Buffer);

This works fine for a Buffer of 40 MB that gets doubled in size and returns Value as an 80 MB Unicode string.

When I try this with a 300 MB Buffer, it gives me an EOutOfMemory exception.

Well, that wasn't totally unexpected. But I decided to trace it through anyway.

It goes into the DynArraySetLength procedure in the System unit. In that procedure, it goes to the heap and calls ReallocMem. To my surprise, it successfully allocates 665,124,864 bytes!!!

But then towards the end of DynArraySetLength, it calls FillChar:

  // Set the new memory to all zero bits
  FillChar((PAnsiChar(p) + elSize * oldLength)^, elSize * (newLength - oldLength), 0);

You can see by the comment what that is supposed to do. There is not much to that routine, but that is the routine that causes the EOutOfMemory exception. Here is FillChar from the System unit:

procedure _FillChar(var Dest; count: Integer; Value: Char);
{$IFDEF PUREPASCAL}
var
  I: Integer;
  P: PAnsiChar;
begin
  P := PAnsiChar(@Dest);
  for I := count-1 downto 0 do
    P[I] := Value;
end;
{$ELSE}
asm                                  // Size = 153 Bytes
        CMP   EDX, 32
        MOV   CH, CL                 // Copy Value into both Bytes of CX
        JL    @@Small
        MOV   [EAX  ], CX            // Fill First 8 Bytes
        MOV   [EAX+2], CX
        MOV   [EAX+4], CX
        MOV   [EAX+6], CX
        SUB   EDX, 16
        FLD   QWORD PTR [EAX]
        FST   QWORD PTR [EAX+EDX]    // Fill Last 16 Bytes
        FST   QWORD PTR [EAX+EDX+8]
        MOV   ECX, EAX
        AND   ECX, 7                 // 8-Byte Align Writes
        SUB   ECX, 8
        SUB   EAX, ECX
        ADD   EDX, ECX
        ADD   EAX, EDX
        NEG   EDX
@@Loop:
        FST   QWORD PTR [EAX+EDX]    // Fill 16 Bytes per Loop
        FST   QWORD PTR [EAX+EDX+8]
        ADD   EDX, 16
        JL    @@Loop
        FFREE ST(0)
        FINCSTP
        RET
        NOP
        NOP
        NOP
@@Small:
        TEST  EDX, EDX
        JLE   @@Done
        MOV   [EAX+EDX-1], CL        // Fill Last Byte
        AND   EDX, -2                // No. of Words to Fill
        NEG   EDX
        LEA   EDX, [@@SmallFill + 60 + EDX * 2]
        JMP   EDX
        NOP                          // Align Jump Destinations
        NOP
@@SmallFill:
        MOV   [EAX+28], CX
        MOV   [EAX+26], CX
        MOV   [EAX+24], CX
        MOV   [EAX+22], CX
        MOV   [EAX+20], CX
        MOV   [EAX+18], CX
        MOV   [EAX+16], CX
        MOV   [EAX+14], CX
        MOV   [EAX+12], CX
        MOV   [EAX+10], CX
        MOV   [EAX+ 8], CX
        MOV   [EAX+ 6], CX
        MOV   [EAX+ 4], CX
        MOV   [EAX+ 2], CX
        MOV   [EAX   ], CX
        RET                          // DO NOT REMOVE - This is for Alignment
@@Done:
end;
{$ENDIF}

So my memory was allocated, but it crashed trying to fill it with zeros. This doesn't make sense to me. As far as I'm concerned, the memory doesn't even need to be filled with zeros - and that is probably a time waster anyhow - since the Encoding statement is about to fill it anyway.

Can I somehow prevent Delphi from doing the memory fill?

Or is there some other way I can get Delphi to allocate this memory successfully for me?

My real goal is to do that Encoding statement for my very large file, so any solution that will allow this would be much appreciated.


Conclusion: See my comments on the answers.

This is a warning to be careful in debugging assembler code. Make sure you break on all the "RET" lines, since I missed the one in the middle of the FillChar routine and erroneously concluded that FillChar caused the problem. Thanks Mason, for pointing this out.

I will have to break the input into Chunks to handle the very large file.

+5  A: 

Read a chunk from the file, encode and write to another file, repeat.
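
One catch with chunking is that a multi-byte character can be split across a chunk boundary. The following is a minimal sketch of one way to deal with that, not a definitive implementation. It assumes the input is UTF-8 (the question does not say which encoding is used) and writes UTF-16LE output. The names ChunkedDecode, CompleteUtf8Length, ConvertUtf8FileToUtf16 and ChunkSize are hypothetical, and BOM handling is left out. The idea is to hold back any incomplete trailing sequence and prepend it to the next chunk.

```delphi
program ChunkedDecode;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: returns how many leading bytes of Buf[0..Count-1]
// end on a complete UTF-8 sequence. The remaining 0..3 bytes are the start
// of a character that continues in the next chunk.
function CompleteUtf8Length(const Buf: TBytes; Count: Integer): Integer;
var
  I, Need: Integer;
begin
  Result := Count;
  I := Count - 1;
  // Step back over at most three continuation bytes (10xxxxxx).
  while (I >= 0) and (Count - I <= 3) and ((Buf[I] and $C0) = $80) do
    Dec(I);
  if I < 0 then
    Exit;
  // Buf[I] should now be a lead byte; work out the sequence length.
  if (Buf[I] and $80) = 0 then Need := 1
  else if (Buf[I] and $E0) = $C0 then Need := 2
  else if (Buf[I] and $F0) = $E0 then Need := 3
  else if (Buf[I] and $F8) = $F0 then Need := 4
  else Exit; // not a valid lead byte: leave it for the decoder to handle
  if Count - I < Need then
    Result := I; // sequence is cut off: hold it back for the next chunk
end;

procedure ConvertUtf8FileToUtf16(const SrcName, DstName: string);
const
  ChunkSize = 1024 * 1024; // assumed value; tune or shrink it for testing
var
  Src, Dst: TFileStream;
  Buf: TBytes;
  Pending, BytesRead, Total, Usable: Integer;
  S: string;
begin
  Src := TFileStream.Create(SrcName, fmOpenRead or fmShareDenyWrite);
  try
    Dst := TFileStream.Create(DstName, fmCreate);
    try
      SetLength(Buf, ChunkSize + 4); // room for up to 3 carried-over bytes
      Pending := 0;
      repeat
        // Append the new data after any bytes held back last time.
        BytesRead := Src.Read(Buf[Pending], ChunkSize);
        Total := Pending + BytesRead;
        if BytesRead = 0 then
          Usable := Total // end of file: pass whatever is left to the decoder
        else
          Usable := CompleteUtf8Length(Buf, Total);
        if Usable > 0 then
        begin
          S := TEncoding.UTF8.GetString(Buf, 0, Usable);
          Dst.WriteBuffer(PChar(S)^, Length(S) * SizeOf(Char));
        end;
        // Move the incomplete tail to the front of the buffer.
        Pending := Total - Usable;
        if Pending > 0 then
          Move(Buf[Usable], Buf[0], Pending);
      until BytesRead = 0;
    finally
      Dst.Free;
    end;
  finally
    Src.Free;
  end;
end;

begin
  if ParamCount <> 2 then
    Writeln('Usage: ChunkedDecode <utf8-input> <utf16-output>')
  else
    ConvertUtf8FileToUtf16(ParamStr(1), ParamStr(2));
end.
```

Setting ChunkSize to a very small value while testing forces boundaries to fall inside multi-byte characters, which exercises the carry-over path. How the decoder treats a truncated sequence at the very end of a damaged file depends on the RTL version, so that case should be checked separately.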

Romain Hippeau
@Romain: I originally had code to do that. But it was tricky at the boundary where you break it up because you might split a multi-byte input character. Also, the Encoding routine is so darned fast, it's a shame not to do it all at once.
lkessler
@lkessler - sometimes you have to go with either a time or space compromise. Performance should not be that bad if you read in 4 KB at a time or more.
Romain Hippeau
...or even 40 MB at a time, since you seem to be able to handle that.
Mason Wheeler
The thing to do is to make sure it works with chunks of 100 bytes at a time, which makes debugging easy and you test the boundary conditions, and then set it to something really large (perhaps dynamically) for the production code.
mj2008
I wouldn't read "chunks", I would use a Stream. A fast Unicode stream with a readline should be much faster than 300 MB of VM.
Warren P
@Warren P: A Stream with a readline is a chunk.
Romain Hippeau
+6  A: 

FillChar isn't allocating any memory, so that's not your problem. Try tracing into it and placing breakpoints at the RET statements, and you'll see that the FillChar finishes. Whatever the problem is, it's probably in a later step.

Mason Wheeler
@Mason: Thanks for this. Yes, you are correct. The RET statement in the middle of the FillChar routine is where it leaves from, so the breakpoint I had at the end of the routine didn't catch it. It does then get to MemoryManager.GetMem and signals the OutOfMemory error. I'll have to split the Encoding into chunks like @Romain says. You helped me out, but Romain answered my question, so I'll have to give him the accepted answer.
lkessler
+1 for helping him out
Romain Hippeau
+1  A: 

A wild guess: Could the problem be memory being overcommitted, so that when FillChar actually accesses the memory the OS can't find a page to give you? I don't know if Windows will even overcommit memory, but I do know that some OSes do, and you don't find out about it until you actually try to make use of the memory.

If this is the case it could cause the blowup in FillChar.

Loren Pechtel
@Loren: Thanks for the response, but FillChar wasn't the problem after all, as @Mason was correct in pointing out.
lkessler
+1  A: 

Programs are great at looping. They loop tirelessly without complaining.

Allocating a huge amount of memory takes time. There will be many calls to the heap manager. Your OS won't even know ahead of time whether it has the amount of contiguous memory that you need. Your OS says, yeah, I have 1 GB free. But as soon as you go to use it, your OS says, wait, you want all of it in one chunk? Let me make sure I have enough all in one place. If it doesn't, you get the error.

If it does have the memory, well, there's still a lot of work for the heap manager in preparing the memory and marking it as used.

So, obviously, it makes some sense to allocate less memory and simply loop through it. This saves the computer from doing a lot of work that it will only have to undo when it's done. Why not have it do just a little bit of work in setting aside your memory, then just keep re-using it?

Stack memory is allocated much faster than heap memory. If you keep your buffer small (well under 1 MB, the default maximum stack size), you can declare it as a fixed-size local array so that it lives on the stack rather than the heap, which will make your loops even faster. In addition, local variables that get allocated in registers are very fast.

Factors such as hard drive cluster and cache sizes and CPU cache sizes offer hints about the best chunk size. The key is to find a good number. I like to use 64 KB chunks.

Marcus Adams
@Marcus: That's a good comment. I'll try using both 40 MB and 1 MB as blocking sizes and test to see whether more stack allocations are faster than fewer heap allocations.
lkessler
The idea is to keep the memory allocated while you use it, but allocated on the stack. If you call a function repeatedly, which allocates the memory on the stack then frees it, you're still doing extra work. Loop with a for or while loop inside a function to reuse the memory.
Marcus Adams
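
The buffer reuse described above can be sketched as follows. This is only an illustration under assumptions: the 64 KB size comes from the answer, while the program name StackBufferDemo and the function CountLineFeeds are hypothetical, and the per-chunk work is a stand-in for real processing. A fixed-size local array lives on the stack, is set up once per call, and is reused on every pass through the loop.

```delphi
program StackBufferDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Counts line feed bytes in a file, reading it 64 KB at a time.
function CountLineFeeds(const FileName: string): Int64;
var
  Buf: array[0..64 * 1024 - 1] of Byte; // fixed-size local: stack, not heap
  Stream: TFileStream;
  N, I: Integer;
begin
  Result := 0;
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    repeat
      // The same buffer is refilled on each iteration; nothing is
      // allocated or freed inside the loop.
      N := Stream.Read(Buf, SizeOf(Buf));
      for I := 0 to N - 1 do
        if Buf[I] = 10 then
          Inc(Result);
    until N = 0;
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount = 1 then
    Writeln(CountLineFeeds(ParamStr(1)))
  else
    Writeln('Usage: StackBufferDemo <file>');
end.
```

Whether this beats a single large heap buffer in practice depends on the workload, so it is worth timing both.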