views:

259

answers:

5

I have two string lists that I'm working with. One that has a list of keywords, and then another that has a list of negative keywords. I want to be able to search through the list and pick out the list items that do not contain the negative keyword and output to a third keyword list. I was using the AnsiPos function but that found the negative keywords if they were part of a word, vs full word.

Any suggestions on a relatively easy way to do this? Speed is not that important, but would be nice.

Example of what I'm looking to do:

Keyword List:

Cat 
Catfish
Fish Sticks
Dog Food

Negative Keyword List:

Fish

Returned Values Wanted:

Cat 
Catfish
Dog Food

This is what I've got so far.. which does not work. I used information from: http://stackoverflow.com/questions/1678572/is-there-an-efficient-whole-word-search-function-in-delphi

function ExistWordInString(aString: PAnsichar; aSearchString: string;
  aSearchOptions: TStringSearchOptions): Boolean;
var
  Size : Integer;
begin
  Size := StrLen(aString);
  result := SearchBuf(aString, Size, 0, 0, aSearchString, aSearchOptions) <> nil;
end;

procedure TForm2.Button1Click(Sender: TObject);
var
  i, j, index: integer;
  s: string;
  stl: tstringlist;
begin
  stl := TStringList.Create;
  stl.Text := listbox1.Items.Text;
  for I := 0 to stl.Count - 1 do
  begin
    for j := 0 to listbox2.Count - 1 do
    begin
      if not ExistWordInString(PAnsiChar(listbox2.Items.Strings[j]),
        listbox1.Items.Strings[i], [soWholeWord, soDown])
      then
        listbox3.Items.Append(stl.Strings[i]);
    end;
  end;
end;
+1  A: 

I think I figured it out. Use stringlist.find('fish',index);

I didn't figure it out. .find did not work.

-Brad

Brad
ps. for performance, have a look at .Sorted property which helps .Find do its job much faster.
utku_karatas
This is not going to work. "Find" does not match partial strings. You will need some algorithm that performs a "whole word" matching comparison for each word in the negative list against the candidate list. Your definition of "whole word" may be very specific (i.e. what characters beyond "space" and the "beginning of string/end of string" constitute a word break for the purposes of matching "words".
Deltics
I just figured that out.. :(
Brad
As pointed by Jolyon, this is a WRONG answer. Please don't upvote.
François
Thanks for the help guys I'm going to try something like : http://stackoverflow.com/questions/1678572/is-there-an-efficient-whole-word-search-function-in-delphi unless you can think of a better way.
Brad
+2  A: 

If spaces are the only word delimiter you need to worry about, then you can do a whole word match using AnsiPos by adding a space before and after both the keyword and the negative keyword, ie

AnsiPos(' '+SubStr+' ', ' '+Str+' ')

You'd need a loop to check every entry from the negative keyword list.

Rob McDonell
That might work... seems like a bit of a hack though... but it would be easier than what I'm doing now.. :P
Brad
Didn't work... :( Still trying to figure this out...
Brad
+1  A: 

this sample code works like a charm (using Delphi 7):

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls, StrUtils;

type
  TForm1 = class(TForm)
  Button1: TButton;
  ListBox1: TListBox;
  ListBox2: TListBox;
  ListBox3: TListBox;
  procedure Button1Click(Sender: TObject);

  private
     function ExistWordInString(aString, aSearchString:string;aSearchOptions: TStringSearchOptions): Boolean;

  public
  end;

var
  Form1: TForm1;

implementation

{$R *.DFM}

procedure TForm1.Button1Click(Sender: TObject);
var
    i,k: integer;

begin

    for k:= 0 to ListBox2.Count -1 do
        for i:= 0 to ListBox1.Count - 1 do
        begin
            if not ExistWordInString(ListBox1.Items[i], ListBox2.Items[k],[soWholeWord,soDown]) then
                ListBox3.Items.Append(ListBox1.Items[i]);
        end;

end;

function TForm1.ExistWordInString(aString, aSearchString: string; aSearchOptions: TStringSearchOptions): Boolean;
var
  Size : Integer;

begin
        Size:=Length(aString);
        Result := SearchBuf(PChar(aString), Size, 0, 0, aSearchString, aSearchOptions)<>nil;

end;
end.    

and here's the form:

object Form1: TForm1
  Left = 1008
  Top = 398
  Width = 411
  Height = 294
  Caption = 'Form1'
  Color = clBtnFace
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'Tahoma'
  Font.Style = []
  OldCreateOrder = False
  PixelsPerInch = 96
  TextHeight = 13
  object Button1: TButton
    Left = 320
    Top = 8
    Width = 75
    Height = 25
    Caption = 'Button1'
    TabOrder = 0
    OnClick = Button1Click
  end
  object ListBox1: TListBox
    Left = 8
    Top = 8
    Width = 177
    Height = 97
    ItemHeight = 13
    Items.Strings = (
      'Cat '
      'Catfish'
      'Fish Sticks'
      'Dog Food')
    TabOrder = 1
  end
  object ListBox2: TListBox
    Left = 192
    Top = 8
    Width = 121
    Height = 97
    ItemHeight = 13
    Items.Strings = (
      'Fish')
    TabOrder = 2
  end
  object ListBox3: TListBox
    Left = 8
    Top = 112
    Width = 305
    Height = 137
    ItemHeight = 13
    TabOrder = 3
  end
end

hope this helps.

Reinhard :-)

pastacool
Thanks to everyone who helped with this. This works perfectly for my needs! I juts hope it will work with Delphi 10 when I upgrade.
Brad
+1  A: 

You can use the SearchBuf function (see the pastacool's answer) IF you are not interested in other characters except A..Z / Unicode.

If you have an Unicode Delphi (D2009 or D2010) then you must use TCharacter.IsLetterOrDigit(aString: string; aIndex: integer): boolean; from the Character unit. A simple example for you to get the idea:

procedure TForm7.btn1Click(Sender: TObject);
var
  bMatches: boolean;

begin
  with rgx1 do //custom component - disregard it
  begin
    RegEx:=edtTextToFind.Text; //text to find
    Subject:=mmoResult.Text; //text in which to search
    if Match then //aha! found it!
    begin
      bMatches:=True;
      if chkWholeWord.Checked then //be attentive from here!! - I think that's self explaining...
      begin
        if MatchedExpressionOffset>1 then
          bMatches:=not TCharacter.IsLetterOrDigit(Subject, MatchedExpressionOffset-1);
        if bMatches and (MatchedExpressionOffset+MatchedExpressionLength<=Length(Subject)) then
          bMatches:=not TCharacter.IsLetterOrDigit(Subject, MatchedExpressionOffset+MatchedExpressionLength);
      end;
      if bMatches then //select it in the memo
      begin
        mmoResult.SelStart:=MatchedExpressionOffset-1;
        mmoResult.SelLength:=MatchedExpressionLength;
        mmoResult.SetFocus;
      end
      else
        ShowMessage('Text not found!');
    end
    else
      ShowMessage('Text not found!');
  end;
end;
+1  A: 

Change your function to read:

function ExistWordInString(aString:PAnsichar;
   aSearchString:string;
   aSearchOptions: TStringSearchOptions): Boolean;
var 
  b : boolean;
begin
  if soWholeWord in aSearchOptions then
    b := Pos(' '+Uppercase(aSearchString)+' ',' '+UpperCase(aString)+' ') > 0;
  else
    b := Pos(UpperCase(aSearchString),UpperCase(aString)) > 0;
  Result := b;
end;

If your using Delphi 2009/2010 then change it from Pos to AnsiPos. My assumption here is that soWholeWord means that the match "Fish" would match "Fish Sticks" but not "catfish".

skamradt