I am using StringReplace to replace &gt; and &lt; with the characters themselves in a generated XML document, like this:

StringReplace(xml.Text,'&gt;','>',[rfReplaceAll]) ;
StringReplace(xml.Text,'&lt;','<',[rfReplaceAll]) ;

The problem is that it takes far too long to replace every occurrence of &gt;.

Can you propose a better way to make it faster?

+1  A: 

The problem is that you are iterating over the entire string twice (once to replace &gt; with > and once more to replace &lt; with <).

Instead, iterate once with a for loop: whenever you find a &, look ahead for gt; or lt;, write the replacement character immediately, and then skip the 3 characters that follow the & ((g|l)t;). That way the whole job is done in a single pass, in time proportional to the length of xml.Text.


Here is a simple C# example, since I do not know Delphi, but it should be enough to give you the general idea.

String s = "&lt;xml&gt;test&lt;/xml&gt;";
char[] input = s.ToCharArray();
char[] res = new char[s.Length];
int j = 0;
for (int i = 0, count = input.Length; i < count; ++i)
{
    if (input[i] == '&')
    {
        if (i < count - 3)
        {
            if (input[i + 1] == 'l' || input[i + 1] == 'g')
            {
                if (input[i + 2] == 't' && input[i + 3] == ';')
                {
                    res[j++] = input[i + 1] == 'l' ? '<' : '>';
                    i += 3;
                    continue;
                }
            }
        }
    }

    res[j++] = input[i];
}
Console.WriteLine(new string(res, 0, j));

This outputs:

<xml>test</xml>
smink
+1  A: 

Untested conversion of the C# code written by Jorge Ferreira.

function ReplaceLtGt(const s: string): string;
var
  inPtr, outPtr: integer;
begin
  SetLength(Result, Length(s));
  inPtr := 1;
  outPtr := 1;
  while inPtr <= Length(s) do begin
    if (s[inPtr] = '&') and ((inPtr + 3) <= Length(s)) and
       (s[inPtr+1] in ['l', 'g']) and (s[inPtr+2] = 't') and
       (s[inPtr+3] = ';') then
    begin
      if s[inPtr+1] = 'l' then
        Result[outPtr] :=  '<'
      else
        Result[outPtr] := '>';
      Inc(inPtr, 4); // skip all four characters of '&lt;' or '&gt;'
    end
    else begin
      Result[outPtr] := s[inPtr]; // copy from the input string, not from Result
      Inc(inPtr);
    end;
    Inc(outPtr);
  end;
  SetLength(Result, outPtr - 1);
end;
gabr
Is this safe for Unicode?
Ralph Rickenbach
No, since string here is ANSI, not Unicode. It would be if you used WideString instead of string, and it would be in Delphi 2009 or Delphi for .NET, since both use Unicode strings by default.
Lars Truijens
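The WideString suggestion in the comment above could be sketched roughly as follows. This is only an illustration under the assumption of a pre-2009 Delphi where string is ANSI; the function name ReplaceLtGtWide and the small console program around it are hypothetical and not taken from any library.

```pascal
program ReplaceLtGtWideDemo;

{$APPTYPE CONSOLE}

// Hypothetical WideString variant of the single-pass replacement:
// scans the input once and writes '<' or '>' wherever it finds '&lt;' or '&gt;'.
function ReplaceLtGtWide(const s: WideString): WideString;
var
  inPos, outPos, len: Integer;
begin
  len := Length(s);
  SetLength(Result, len); // the output is never longer than the input
  inPos := 1;
  outPos := 1;
  while inPos <= len do
  begin
    // Short-circuit evaluation keeps the look-ahead inside the string.
    if (s[inPos] = '&') and (inPos + 3 <= len) and
       ((s[inPos + 1] = 'l') or (s[inPos + 1] = 'g')) and
       (s[inPos + 2] = 't') and (s[inPos + 3] = ';') then
    begin
      if s[inPos + 1] = 'l' then
        Result[outPos] := '<'
      else
        Result[outPos] := '>';
      Inc(inPos, 4); // skip the whole four-character entity
    end
    else
    begin
      Result[outPos] := s[inPos];
      Inc(inPos);
    end;
    Inc(outPos);
  end;
  SetLength(Result, outPos - 1); // trim to the characters actually written
end;

begin
  WriteLn(ReplaceLtGtWide('&lt;xml&gt;test&lt;/xml&gt;')); // prints <xml>test</xml>
end.
```

In Delphi 2009 and later the same loop should work with plain string, since string is already Unicode there.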
+3  A: 

Try FastStrings.pas from Peter Morris.

Neftalí
FastStrings is great, but the unit no longer works in Delphi 2009, which could be a problem (if not now, then whenever you upgrade).
gabr
FastStrings is leaps and bounds faster than the normal Delphi StringReplace function. I really hope Peter releases a new version for Delphi 2009.
David Lambert
Be warned that FastStrings is not Unicode-aware, and most XML is Unicode.
Lars Truijens
A: 

Systools (Turbopower, now open source) has a ReplaceStringAllL function that does all of them in a string.

mj2008
+3  A: 

If you're using Delphi 2009, this operation is about 3 times faster with TStringBuilder than with StringReplace. It's Unicode safe, too.

I used the text from http://www.CodeGear.com with all occurrences of "<" and ">" changed to "&lt;" and "&gt;" as my starting point.

Including string assignments and creating/freeing objects, these took about 25ms and 75ms respectively on my system:

function TForm1.TestStringBuilder(const aString: string): string;
var
  sb: TStringBuilder;
begin
  StartTimer;
  sb := TStringBuilder.Create;
  sb.Append(aString);
  sb.Replace('&gt;', '>');
  sb.Replace('&lt;', '<');
  Result := sb.ToString();
  FreeAndNil(sb);
  StopTimer;
end;

function TForm1.TestStringReplace(const aString: string): string;
begin
  StartTimer;
  Result := aString;
  // StringReplace returns the new string, so assign it back
  Result := StringReplace(Result,'&gt;','>',[rfReplaceAll]);
  Result := StringReplace(Result,'&lt;','<',[rfReplaceAll]);
  StopTimer;
end;
Bruce McGee
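One detail worth keeping in mind when timing or using StringReplace: it is a function that returns the modified string and leaves its argument untouched, so the result has to be assigned somewhere. A minimal sketch, assuming a plain console program (the program name and the variable s are hypothetical):

```pascal
program StringReplaceResultDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  s: string;
begin
  s := '&lt;xml&gt;';

  // Return value discarded: s is left unchanged.
  StringReplace(s, '&lt;', '<', [rfReplaceAll]);
  WriteLn(s); // prints &lt;xml&gt;

  // Return value assigned back: the replacements take effect.
  s := StringReplace(s, '&lt;', '<', [rfReplaceAll]);
  s := StringReplace(s, '&gt;', '>', [rfReplaceAll]);
  WriteLn(s); // prints <xml>
end.
```

The same applies to a call such as StringReplace(xml.Text, ...): unless the result is written back to xml.Text, the original text stays as it was.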
+3  A: 

You should definitely look at the Fastcode project pages: http://fastcode.sourceforge.net/

They ran a challenge for a faster StringReplace (Ansi StringReplace challenge), and the 'winner' was 14 times faster than the Delphi RTL.

Several of the fastcode functions have been included within Delphi itself in recent versions (D2007 on, I think), so the performance improvement may vary dramatically depending on which Delphi version you are using.

As mentioned before, you should really be looking at a Unicode-based solution if you're serious about processing XML.

Roddy