Hi there.

I have this method,

var
  s: TStringList;
  fVar: string;
begin
  s := TStringList.Create;
  fVar := ZCompressStr('text');

  ShowMessage(IntToStr(Length(fVar) * SizeOf(Char)));
  // 24

  s.Text := fVar;

  ShowMessage(IntToStr(Length(s.Text) * SizeOf(Char)));
  // 18
end;

The ZCompressStr is from http://www.base2ti.com/zlib.htm with Line 121 changed from {$ifndef UNICODE} to {$ifdef UNICODE} to make it compile.

Anyway, I can call ZDecompressStr if I use the fVar variable; however, once I move it to a string list or a memo, it seems to lose those 6 bytes of data. If I call ZDecompressStr on s.Text instead, it fails with a buffer error.

+2  A: 

This may be the conversion: TStringList.Text is a property, not a variable. You are using it in a slightly dangerous way, since TStringList does some processing of Text internally.

smok1
I have been downvoted. It would be nice to know what was actually wrong with my answer.
smok1
+12  A: 

There is no reason you should have had to change line 121 of ZLibEx.pas; it is correct for all versions of Delphi, including Delphi 2009. The UNICODE symbol should only be defined for Delphi 2009, and when it is, the type definitions for RawByteString, UnicodeString, and UnicodeChar should all be skipped because they're already intrinsic types in the language.

ZCompressStr will generate a string that may contain non-printable characters, including null bytes. It stores its result in a RawByteString, which Delphi treats specially.

TStringList, like just about everything else in Delphi 2009, uses Unicode. Its Text property is of type UnicodeString. When you assign any non-UnicodeString value to a UnicodeString, you get a conversion, as from the MultiByteToWideChar API function. Even RawByteString is included in that rule. If you haven't assigned a code-page-specific string value to a RawByteString, then it will have code page 0, which is CP_ACP, the default code page for your system.
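To see that conversion in isolation, here is a minimal sketch. The helper name SurvivesUnicodeRoundTrip and the four sample bytes are made up for illustration (they are not real ZCompressStr output), and the result for non-ASCII bytes depends on your system code page, so treat it as an experiment rather than a guarantee.

```pascal
program RawToUnicode;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: converts the bytes to UnicodeString and back to an
// ANSI string, then reports whether the original bytes came back unchanged.
function SurvivesUnicodeRoundTrip(const Raw: RawByteString): Boolean;
var
  U: UnicodeString;
  Back: AnsiString;
begin
  U := UnicodeString(Raw);  // interpreted according to the ANSI code page
  Back := AnsiString(U);    // converted back to the ANSI code page
  Result := Back = Raw;
end;

var
  Raw: RawByteString;
begin
  // Arbitrary binary sample, including a null byte and bytes above $7F.
  SetLength(Raw, 4);
  Raw[1] := AnsiChar($78);
  Raw[2] := AnsiChar($9C);
  Raw[3] := AnsiChar($81);
  Raw[4] := AnsiChar($00);

  // Plain ASCII text converts cleanly; binary data may or may not.
  Writeln(SurvivesUnicodeRoundTrip('text'));
  Writeln(SurvivesUnicodeRoundTrip(Raw));
end.
```

Whether the second line reports a clean round trip depends on whether every byte happens to map to a character in the active code page, which is exactly why compressed data shouldn't go through this path.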

If the string doesn't really contain characters encoded according to the system code page, then any conversion is asking for trouble: garbage in, garbage out. In particular, there's no guarantee that you'll get the same number of characters.

As Smok1 mentioned, TStringList.Text is a property. It has a setter method that splits the given string into separate lines. When you read the property, it re-joins all those lines into a single string again. While setting the property, TStrings.SetTextStr (in Classes.pas, if you're curious) will split the string at any occurrence of #0, #10, or #13. That is, null characters, line feeds, and carriage returns. When re-joining all the lines, it will use its LineBreak property, which is initialized with the global sLineBreak constant. A line break is also put after the last string, so every line ends with LineBreak. Therefore, the conversion won't necessarily round-trip.
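The round-trip problem is easy to reproduce without any compression at all. The following is a small sketch using only Classes and SysUtils; the helper name RoundTripText is hypothetical and exists only for this demonstration.

```pascal
program TextRoundTrip;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: assigns S to TStringList.Text and reads it back.
function RoundTripText(const S: string): string;
var
  List: TStringList;
begin
  List := TStringList.Create;
  try
    List.Text := S;       // the setter splits S into separate lines
    Result := List.Text;  // the getter re-joins them using LineBreak
  finally
    List.Free;
  end;
end;

var
  Original, Returned: string;
begin
  Original := 'a'#10'b';  // two lines separated by a bare line feed
  Returned := RoundTripText(Original);

  Writeln(Length(Original));
  Writeln(Length(Returned));
  Writeln(Original = Returned);
end.
```

With the default LineBreak on Windows, the string should come back as 'a'#13#10'b'#13#10: the bare line feed has become a CR/LF pair and a trailing line break has been appended, so the lengths differ even though no "text" was changed. Compressed data full of #0, #10, and #13 bytes fares much worse.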

So, there are two things to learn from this:

  1. Don't treat compressed data as text.
  2. Don't use TStrings descendants to hold things that you don't want to treat as multiple strings.

Another good piece of advice: Don't use string as a generic data-storage type. Only use it for actual text. For storage of arbitrary binary data, prefer TBytes, or a TMemoryStream. Using your example, you could compress a string like this:

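var
  ss: TStream;
  ms: TMemoryStream;
begin
  // Source stream holding the text to compress
  ss := TStringStream.Create('text');
  try
    // Destination stream that receives the compressed bytes
    ms := TMemoryStream.Create;
    try
      ShowMessage(IntToStr(ss.Size)); // size before compression
      ZCompressStream(ss, ms);
      ShowMessage(IntToStr(ms.Size)); // size after compression
    finally
      ms.Free;
    end;
  finally
    ss.Free;
  end;
end;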
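If you also need the data back, the same stream-only approach works in the other direction. This is a rough sketch, assuming the ZLibEx unit exposes ZDecompressStream(inStream, outStream) alongside ZCompressStream; the helper name CompressRoundTrip and the file name in the commented-out line are made up for illustration.

```pascal
program StreamRoundTrip;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, ZLibEx;

// Hypothetical helper: compresses Text into a memory stream and then
// decompresses it again. The compressed bytes never pass through a
// string or a TStrings descendant.
function CompressRoundTrip(const Text: string): string;
var
  Source, Restored: TStringStream;
  Compressed: TMemoryStream;
begin
  Source := nil;
  Compressed := nil;
  Restored := nil;
  try
    Source := TStringStream.Create(Text);
    Compressed := TMemoryStream.Create;
    Restored := TStringStream.Create('');

    ZCompressStream(Source, Compressed);

    // The stream can be written out as-is, for example:
    // Compressed.SaveToFile('compressed.bin');  // hypothetical file name

    Compressed.Position := 0;  // rewind before reading the data back
    ZDecompressStream(Compressed, Restored);  // assumed ZLibEx routine
    Result := Restored.DataString;
  finally
    Restored.Free;    // Free is safe on nil references
    Compressed.Free;
    Source.Free;
  end;
end;

begin
  Writeln(CompressRoundTrip('text'));
end.
```

The point of the design is that only the original and restored text ever live in a string; everything in between stays in a TMemoryStream, which also gives you SaveToFile and LoadFromFile.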
Rob Kennedy
Good point. In fact, TStringList is quite clever: it doesn't matter whether DOS-style or Unix-style line endings are being used.
smok1
Thanks Rob, this is a fantastic answer. About your first paragraph: yes, it is strange, but have you downloaded and tried to use that ZLibEx from their website? It simply wouldn't compile until I changed that line; see also https://forums.embarcadero.com/thread.jspa?messageID=130234 The rest of the answer is absolutely spot on. I have been using TStringList because I want to SaveToFile (which TMemoryStream can do), but I am also going to want to POST this data via a form over the internet. I will write this bit up and modify my question a bit. Thanks!
Wizzard