Hello all,

I'm trying to find a Delphi function that will split an input string into an array of strings based on a delimiter. I've found a lot on Google, but all seem to have their own issues and I haven't been able to get any of them to work.

I just need to split a string like: "word:doc,txt,docx" into an array based on ':'. The result would be ['word', 'doc,txt,docx'].

Does anyone have a function that they know works?

Thank you

+1  A: 

Here is an implementation of an explode function which is available in many other programming languages as a standard function:

type 
  TStringDynArray = array of String;

function Explode(const Separator, S: string; Limit: Integer = 0): TStringDynArray; 
var 
  SepLen: Integer; 
  F, P: PChar; 
  ALen, Index: Integer; 
begin 
  SetLength(Result, 0); 
  if (S = '') or (Limit < 0) then Exit; 
  if Separator = '' then 
  begin 
    SetLength(Result, 1); 
    Result[0] := S; 
    Exit; 
  end; 
  SepLen := Length(Separator); 
  ALen := Limit; 
  SetLength(Result, ALen); 

  Index := 0; 
  P := PChar(S); 
  while P^ <> #0 do 
  begin 
    F := P; 
    P := AnsiStrPos(P, PChar(Separator)); 
    if (P = nil) or ((Limit > 0) and (Index = Limit - 1)) then P := StrEnd(F); 
    if Index >= ALen then 
    begin 
      Inc(ALen, 5); 
      SetLength(Result, ALen); 
    end; 
    SetString(Result[Index], F, P - F); 
    Inc(Index); 
    if P^ <> #0 then Inc(P, SepLen); 
  end; 
  if Index < ALen then SetLength(Result, Index); 
end; 

Sample usage:

var
  res: TStringDynArray;
begin
  res := Explode(':', yourString);
end;
Mef
There are some strange and potentially hugely inefficient choices in this code for managing/anticipating the length of the result. Growing the result array incrementally increases the chances of memory re-allocations and fragmentation. It would be more efficient to set an initial length as large as it might possibly be, i.e. assume that the input string consists of 50% separator strings = Length(S) div (2 * Length(Separator)), then set it to the actual number of items when done: one allocation followed potentially by a single truncation.
Deltics
Also you don't explain the purpose of the Limit parameter. I intuitively expected it to set a maximum number of substrings to be returned when in fact it appears to constrain the detection of substrings to the first "Limit" # of characters in the input string. This seems pointless since if you needed to do that you could simply operate Explode() over a Copy() of the required substring. Using Limit to set a maximum number of substrings would be far more useful.
Deltics
@Deltics: Nobody claimed that this is a highly optimized function, and nobody asked for one, so I somewhat don't understand your complaint. But maybe you are one of the guys who optimize everything, regardless if it is necessary or not...
Mef
@Mef: I'm the kind of guy that doesn't write needlessly inefficient code then worry about optimising later. This wasn't a case of analysing the code minutely and finding some minuscule optimisation potential, it was simply an obvious and easily addressed inefficiency: incremental growth of contiguous memory that can instead easily be pre-allocated and subsequently truncated.
Deltics
Also @Mef: And it wasn't a complaint, it was a comment, an observation. But more importantly your code also contained what I would consider a bug (see my alternative for an explanation).
Deltics
+9  A: 

You can use the TStrings.DelimitedText property to split a string.

Check this sample:

program Project28;

{$APPTYPE CONSOLE}

uses
  Classes,
  SysUtils;

procedure Split(Delimiter: Char; Str: string; ListOfStrings: TStrings) ;
begin
   ListOfStrings.Clear;
   ListOfStrings.Delimiter     := Delimiter;
   ListOfStrings.DelimitedText := Str;
end;


var
   OutPutList: TStringList;
begin
   OutPutList := TStringList.Create;
   try
     Split(':', 'word:doc,txt,docx', OutPutList) ;
     Writeln(OutPutList.Text);
     Readln;
   finally
     OutPutList.Free;
   end;
end.

UPDATE

you can see this link if you experience problems with the DelimitedText property.

RRUZ
Unfortunately there is a bug in many "older" Delphi versions (not sure in which release this was fixed) which has the effect that the space character is **always** used as a delimiter. So handle this with care!!
Mef
Yeah. You'll want to set StrictDelimiter to true, and if the StrictDelimiter property is not available in your version of Delphi, don't use this technique! But if it is, then this is very useful.
Mason Wheeler
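To make the StrictDelimiter suggestion concrete, here is a minimal sketch of the same Split routine with the property set. It assumes a Delphi version that has TStrings.StrictDelimiter; the names SplitStrict and Parts are made up for illustration. With the property set to True, only the chosen delimiter separates items, so the spaces in the sample input should stay inside the items.

```pascal
program SplitStrictDemo;

{$APPTYPE CONSOLE}

uses
  Classes,
  SysUtils;

// Hypothetical variant of Split that treats only Delimiter as a separator.
procedure SplitStrict(Delimiter: Char; const Str: string; ListOfStrings: TStrings);
begin
  ListOfStrings.Clear;
  ListOfStrings.Delimiter       := Delimiter;
  ListOfStrings.StrictDelimiter := True;  // set before assigning DelimitedText
  ListOfStrings.DelimitedText   := Str;
end;

var
  Parts: TStringList;
  I: Integer;
begin
  Parts := TStringList.Create;
  try
    // Input contains spaces that must not act as delimiters
    SplitStrict(':', 'my word:doc, txt, docx', Parts);
    for I := 0 to Parts.Count - 1 do
      Writeln(I, ': ', Parts[I]);
  finally
    Parts.Free;
  end;
end.
```

Quote handling via QuoteChar still applies, so input containing double quotes may need separate care.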
It wasn't a bug, it was an (annoying) design decision way back in D1 or D2. CommaText was supposed to enclose any fields with spaces with quotes. If the input has double quotes around any fields with spaces, the result is correct.
Gerry
One of my pet peeves is when people needlessly put type indicators in variable/parameter names. Pascal is strongly typed - it's redundant typing (of the finger exercise variety) and confusingly misleading when the type indicator is wrong, as in this case: ArrayOfStrings *isn't* an array (and as such doesn't even answer the question as posed).
Deltics
@Deltics, you are right, the name chosen for the variable was very bad; I've changed it. Thanks for your observation.
RRUZ
For everyone upvoting this answer, please note that it doesn't yield an array, as specified in the question. Incomplete requirements specification is a big problem in this industry, ignoring stated requirements and delivering something not asked for is another big problem. Approving of either simply encourages bad practice. ;)
Deltics
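If an actual array is required, the list can be copied into a dynamic array once it has been filled. The following is only a sketch under assumptions: the type name TStrArray and the function name SplitToArray are hypothetical, and StrictDelimiter must exist in the Delphi version used.

```pascal
program SplitToArrayDemo;

{$APPTYPE CONSOLE}

uses
  Classes,
  SysUtils;

type
  TStrArray = array of string;  // hypothetical name, not a library type

// Splits Str on Delimiter via a temporary TStringList and
// copies the items into a dynamic array.
function SplitToArray(Delimiter: Char; const Str: string): TStrArray;
var
  List: TStringList;
  I: Integer;
begin
  List := TStringList.Create;
  try
    List.Delimiter       := Delimiter;
    List.StrictDelimiter := True;  // assumes the property is available
    List.DelimitedText   := Str;

    SetLength(Result, List.Count);
    for I := 0 to List.Count - 1 do
      Result[I] := List[I];
  finally
    List.Free;
  end;
end;

var
  Parts: TStrArray;
  I: Integer;
begin
  Parts := SplitToArray(':', 'word:doc,txt,docx');
  for I := 0 to High(Parts) do
    Writeln(I, ': ', Parts[I]);
end.
```

The temporary list costs one extra copy of each item, which is unlikely to matter for short inputs like the one in the question.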
+1  A: 

Similar to the Explode() function offered by Mef, but with a couple of differences (one of which I consider a bug fix):

  type
    TArrayOfString = array of String;


  function SplitString(const aSeparator, aString: String; aMax: Integer = 0): TArrayOfString;
  var
    i, strt, cnt: Integer;
    sepLen: Integer;

    procedure AddString(aEnd: Integer = -1);
    var
      endPos: Integer;
    begin
      if (aEnd = -1) then
        endPos := i
      else
        endPos := aEnd + 1;

      if (strt < endPos) then
        result[cnt] := Copy(aString, strt, endPos - strt)
      else
        result[cnt] := '';

      Inc(cnt);
    end;

  begin
    if (aString = '') or (aMax < 0) then
    begin
      SetLength(result, 0);
      EXIT;
    end;

    if (aSeparator = '') then
    begin
      SetLength(result, 1);
      result[0] := aString;
      EXIT;
    end;

    sepLen := Length(aSeparator);
    SetLength(result, (Length(aString) div sepLen) + 1);

    i     := 1;
    strt  := i;
    cnt   := 0;
    while (i <= (Length(aString)- sepLen + 1)) do
    begin
      if (aString[i] = aSeparator[1]) then
        if (Copy(aString, i, sepLen) = aSeparator) then
        begin
          AddString;

          if (cnt = aMax) then
          begin
            SetLength(result, cnt);
            EXIT;
          end;

          Inc(i, sepLen - 1);
          strt := i + 1;
        end;

      Inc(i);
    end;

    AddString(Length(aString));

    SetLength(result, cnt);
  end;

Differences:

  1. aMax parameter limits the number of strings to be returned
  2. If the input string is terminated by a separator then a nominal "empty" final string is deemed to exist

Examples:

SplitString(':', 'abc') returns      :    result[0]  = abc

SplitString(':', 'a:b:c:') returns   :    result[0]  = a
                                          result[1]  = b
                                          result[2]  = c
                                          result[3]  = <empty string>

SplitString(':', 'a:b:c:', 2) returns:    result[0]  = a
                                          result[1]  = b

It is the trailing separator and notional "empty final element" that I consider the bug fix.

I also incorporated the memory allocation change I suggested, with refinement (I mistakenly suggested the input string might at most contain 50% separators, but it could conceivably of course consist of 100% separator strings, yielding an array of empty elements!)

Deltics
+1 for the very thorough and precise work.
Runner
A: 

StrUtils.SplitString in Delphi 2010

alex
Hmmm, not in my version of Delphi 2010 (there is a SplitString routine in XMLDoc and in (Indy unit) IdStrings, but neither of these do what the poster wants and the XMLDoc routine isn't exposed through the unit interface anyway).
Deltics
function SplitString(const S, Delimiters: string): TStringDynArray; defined in StrUtils.pas
alex
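A minimal usage sketch for the StrUtils routine quoted above, assuming a Delphi version that ships it (the answer mentions Delphi 2010). Note that the second parameter is a set of delimiter characters, so every character in it splits the input; the variable name Parts is made up for illustration.

```pascal
program StrUtilsSplitDemo;

{$APPTYPE CONSOLE}

uses
  Types,     // declares TStringDynArray
  StrUtils,  // declares SplitString
  SysUtils;

var
  Parts: TStringDynArray;
  I: Integer;
begin
  // Only ':' is passed as a delimiter, so the commas stay in the second item
  Parts := SplitString('word:doc,txt,docx', ':');
  for I := 0 to High(Parts) do
    Writeln(I, ': ', Parts[I]);
end.
```

For the input from the question this should give the two items 'word' and 'doc,txt,docx'.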
+2  A: 

I always use something similar to this:

Uses
   StrUtils, Classes;

Var
  Str, Delimiter : String;
begin
  // Str is the input string, Delimiter is the delimiter
  With TStringList.Create Do
  try
    // ReplaceText matches case-insensitively; ReplaceStr is the case-sensitive variant
    Text := ReplaceText(Str, Delimiter, #13#10);

    // From here on and until "finally", your desired result strings are
    // in Strings[0] .. Strings[Count-1]

  finally
    Free; //Clean everything up, and liberate your memory ;-)
  end;

end;
Frank