I'm trying to run AnsiStrings.StringReplace on a RawByteString holding a blob of data, some of which needs to be replaced. It would work, except that inside StringReplace it converts my string to a PAnsiChar, and so the search ends up bailing out as soon as it hits the first #0 byte inside the blob.

I'm looking for a routine that works just like StringReplace, but is safe to use on blobs that may contain null bytes. Does anyone know of one?

A: 

Hmm. It shouldn't be too hard to write your own. Scan the buffer until a byte matches the first byte of the pattern, then check whether the following bytes match the rest of it. If they do, you've found it, so replace it, then keep scanning or quit, depending on whether you want every occurrence replaced. It's obviously simpler if the pattern and the replacement are the same size, because the replacement can be done in place. If they aren't, you can set up a second buffer and copy bytes from the original buffer into it, writing the replacement wherever a match occurs.

Chris Thornton
Sure. I'd just prefer to see a tested third-party solution that's got the edge cases already worked out, if one exists.
Mason Wheeler
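For reference, the scan described in the answer above could look roughly like this. It is only a sketch, not a tested library routine: `FindBytes` is a made-up name, and it assumes the blob is held in a `TBytes` (from SysUtils) rather than in a RawByteString. Every comparison is bounded by the array lengths, so a #0 byte is treated like any other value.

```pascal
program FindBytesDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical helper: returns the zero-based index of the first occurrence
// of Pattern in Data at or after StartAt, or -1 if there is none.
function FindBytes(const Data, Pattern: TBytes; StartAt: Integer = 0): Integer;
var
  i, j: Integer;
begin
  Result := -1;
  if (Length(Pattern) = 0) or (StartAt < 0) then
    Exit;
  // The last index at which a whole pattern still fits
  // is Length(Data) - Length(Pattern).
  for i := StartAt to Length(Data) - Length(Pattern) do
  begin
    // Cheap test on the first byte before comparing the rest
    if Data[i] <> Pattern[0] then
      Continue;
    j := 1;
    while (j < Length(Pattern)) and (Data[i + j] = Pattern[j]) do
      Inc(j);
    if j = Length(Pattern) then
    begin
      Result := i;
      Exit;
    end;
  end;
end;

var
  Data, Pattern: TBytes;
begin
  Data := TBytes.Create($11, $00, $55, $11, $00, $55, $11);
  Pattern := TBytes.Create($55, $11);
  Writeln(FindBytes(Data, Pattern));     // prints 2
  Writeln(FindBytes(Data, Pattern, 3));  // prints 5
end.
```

Calling it again with `StartAt` set just past the previous hit gives the "keep going" behaviour; stopping after the first hit gives the "quit" behaviour.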
+1  A: 

I have not performed extensive testing, but I think that this code works.

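type
  TDynByteArray = packed array of byte;

// Replaces every occurrence of Find in the block at BufStart with Replace.
// The block must have been allocated with GetMem, because it may be
// reallocated here; BufStart and BufLen are updated accordingly.
procedure BufReplace(var BufStart: PByte; var BufLen: cardinal; const Find: TDynByteArray; const Replace: TDynByteArray);
var
  pos: PByte;
  BufEnd: PByte;
  i: Integer;
  Offset, TailLen: cardinal;
  Match: boolean;
begin
  {$POINTERMATH ON}
  if Find = nil then Exit;
  pos := BufStart;
  BufEnd := BufStart + BufLen;
  while pos < BufEnd do
  begin
    Match := false;
    // "<=" so that a match ending exactly at the end of the buffer is found
    if (pos^ = Find[0]) and (pos + length(Find) <= BufEnd) then
    begin
      Match := true;
      for i := 1 to high(Find) do
        if PByte(pos + i)^ <> Find[i] then
        begin
          Match := false;
          break;
        end;
    end;
    if Match then
    begin
      // Remember where we are as an offset: ReallocMem may move the block
      Offset := cardinal(pos - BufStart);
      TailLen := BufLen - Offset - cardinal(length(Find));
      if length(Replace) < length(Find) then
      begin
        // Shrinking: close the gap first, then release the unused tail
        Move((pos + length(Find))^, (pos + length(Replace))^, TailLen);
        dec(BufLen, length(Find) - length(Replace));
        ReallocMem(BufStart, BufLen);
      end
      else if length(Replace) > length(Find) then
      begin
        // Growing: enlarge the block first, then open the gap
        inc(BufLen, length(Replace) - length(Find));
        ReallocMem(BufStart, BufLen);
        pos := BufStart + Offset;
        Move((pos + length(Find))^, (pos + length(Replace))^, TailLen);
      end;
      // Rebuild both pointers from the (possibly moved) block
      pos := BufStart + Offset;
      BufEnd := BufStart + BufLen;
      if length(Replace) > 0 then
        Move(Replace[0], pos^, length(Replace));
      inc(pos, length(Replace));
    end
    else
      inc(pos);
  end;
end;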

To test it:

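procedure TestIt;
var
  buf: PByte;
  len: cardinal;
  a, b: TDynByteArray;
begin
  // 16 bytes of $11 with a single $55 at index 3
  len := 16;
  GetMem(buf, len);
  FillChar(buf^, 16, $11);
  PByte(buf + 3)^ := $55;

  // Find $55 $11 and replace it with the single byte $77
  SetLength(a, 2);
  a[0] := $55;
  a[1] := $11;
  SetLength(b, 1);
  b[0] := $77;

  BufReplace(buf, len, a, b);
  // Now len = 15 and the byte at index 3 is $77
  FreeMem(buf);
end;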
Andreas Rejbrand
Well, one possible (and rather important) optimization in the case `length(Replace) > length(Find)` is to remove the need to reallocate memory at each occurrence of Find. Instead, one should allocate a large block to begin with, keep track of the actual end of the data, and truncate the block once at the end. (What if the initial block is not big enough? Then grow it in large chunks on demand.)
Andreas Rejbrand
In addition, if `length(Replace) <> length(Find)` and `BufLen` is large (but not too large), then it might be better not to do an in-place replace, but to create a new buffer.
Andreas Rejbrand
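A rough sketch of the "create a new buffer" idea from the comments above, for what it's worth. It is not the growing-block scheme described there: instead it counts the matches in a first pass so the result can be allocated exactly once. `ReplaceBytes` and `MatchesAt` are made-up names, and the blob is assumed to live in a `TBytes` instead of a GetMem block.

```pascal
program ReplaceBytesDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical helper: True if Pattern occurs in Data starting at index At.
function MatchesAt(const Data, Pattern: TBytes; At: Integer): Boolean;
begin
  Result := (Length(Pattern) > 0) and
    (At + Length(Pattern) <= Length(Data)) and
    CompareMem(@Data[At], @Pattern[0], Length(Pattern));
end;

// Hypothetical routine: returns a copy of Source in which every occurrence
// of Find is replaced by Replace. Source itself is left untouched.
function ReplaceBytes(const Source, Find, Replace: TBytes): TBytes;
var
  i, Count, OutPos: Integer;
begin
  if Length(Find) = 0 then
  begin
    Result := Copy(Source);
    Exit;
  end;
  // Pass 1: count non-overlapping matches so the result is sized once.
  Count := 0;
  i := 0;
  while i < Length(Source) do
  begin
    if MatchesAt(Source, Find, i) then
    begin
      Inc(Count);
      Inc(i, Length(Find));
    end
    else
      Inc(i);
  end;
  SetLength(Result, Length(Source) + Count * (Length(Replace) - Length(Find)));
  // Pass 2: copy bytes across, writing Replace wherever Find matches.
  i := 0;
  OutPos := 0;
  while i < Length(Source) do
  begin
    if MatchesAt(Source, Find, i) then
    begin
      if Length(Replace) > 0 then
        Move(Replace[0], Result[OutPos], Length(Replace));
      Inc(OutPos, Length(Replace));
      Inc(i, Length(Find));
    end
    else
    begin
      Result[OutPos] := Source[i];
      Inc(OutPos);
      Inc(i);
    end;
  end;
end;

var
  Dst: TBytes;
  b: Byte;
begin
  Dst := ReplaceBytes(TBytes.Create($11, $00, $55, $11, $00, $55, $11),
    TBytes.Create($55, $11), TBytes.Create($77));
  for b in Dst do
    Write(IntToHex(b, 2), ' ');  // prints 11 00 77 00 77
  Writeln;
end.
```

The trade-off is that the source is scanned twice, in exchange for a single allocation and no moving of the tail after each match.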
+1  A: 

I'd guess the "offending" call in StringReplace is AnsiPos, which in turn calls AnsiStrPos.

So, short of an already working solution, I'd copy/paste the StringReplace code and swap AnsiPos for something else (e.g. AnsiStrings.PosEx).

function RawByteStringReplace(const S, OldPattern, NewPattern: AnsiString;
  Flags: TReplaceFlags): AnsiString;
var
  Rest: AnsiString;
  Offset: Integer;
begin
  // The uppercase part of StringReplace was removed, so rfIgnoreCase is
  // ignored here and matching is always case-sensitive.
  Rest := S;
  Result := '';
  while Rest <> '' do
  begin
    Offset := AnsiStrings.PosEx(OldPattern, Rest);
    if Offset = 0 then
    begin
      // No further match: append whatever is left and stop
      Result := Result + Rest;
      Break;
    end;
    // Append the bytes before the match, then the replacement
    Result := Result + Copy(Rest, 1, Offset - 1) + NewPattern;
    // Continue with the part after the match
    Rest := Copy(Rest, Offset + Length(OldPattern), MaxInt);
    if not (rfReplaceAll in Flags) then
    begin
      Result := Result + Rest;
      Break;
    end;
  end;
end;
Ken Bourassa