views:

297

answers:

5

What would be the best way, in delphi, to create and store data which will often be searched on and modified?

Basically, I would like to write a function that searches an existing database for telephone numbers and keeps track of how many times each telephone number has been used, the first date used, and the latest date used. The database that is being searched is basically a log of orders placed, containing the telephone number that was used to place the order. It's not an SQL database or anything that can easily be queried for such things (it's an old btrieve database), so I need to create a way of gaining this information (to eventually output to a text file).

I am thinking of creating a record containing the phone number, the two dates, and the number of times used, and then adding a record to a dynamic array for each telephone number. I would then search the array, entry by entry, for each record in the database, to see if the phone number for the current record is already in the array. Then updating or creating a record as necessary.

This seems like it would work, but as there are tens of thousands of entries in the database, it may not be the best way, and a rather slow and inefficient way of doing things. Is there a better way, given the limited actions I can perform on the database?

Someone suggested that rather than using an array, use a MySQL table to keep track of the numbers, and then query each number for every database record. This seems like even more overhead though!

Thanks a lot for your time.

A: 

Sorry, couldn't edit my post (wasn't registered at the time). The data will be thrown away once all the records in the database have been iterated through. The function won't be called often. It's basically going to be used as a way of determining how often people have ordered over a period of time from records we already have, so really it's just needed to produce a one off list.

The data will be persistent for the duration of the creation of the list. That is, all telephone numbers will need to be present to be searched on until the very last database record is read.

+1  A: 

I would register the aggregates in a totally disconnected TClientDataset(cds), and updating the values as you get them from the looping. If the Btrieve could be sorted by telephone number, much better. Then use the data on the cds to generate the report.

(If you go this way, I suggest get Midas SpeedFix from the Andreas Hausladen' blog, along with the other finest stuff you can find there).

Fabricio Araujo
A: 

If you were going to keep it in memory and don't want anything fancy, you'd be better off using a TStringList so you can use the Find function. Find uses Hoare's selection or Quick-select, an O(n) locator. For instance, define a type:

type
   TPhoneData = class
      private
         fPhone:string;
         fFirstCalledDate:TDateTime;
         fLastCalledDate:TDateTime;
         fCallCount:integer;
      public
         constructor Create(phone:string; firstDate, lastDate:TDateTime);
         procedure updateCallData(date:TDateTime);
         property phoneNumber:string read fPhone write fPhone;
         property firstCalledDate:TDateTime read fFirstCalledDate write fFirstCalledDate;
         property lastCalledDate:TDateTime read fLastCalledDate write fLastCalledDate;
         property callCount:integer read fCallCount write fCallCount;
      end;

{ TPhoneData }

constructor TPhoneData.Create(phone: string; firstDate, lastDate: TDateTime);
begin
fCallCount:=1;
fFirstCalledDate:=firstDate;
fLastCalledDate:=lastDate;
fPhone:=phone;
end;

procedure TPhoneData.updateCallData(date: TDateTime);
begin
inc(fCallCount);
if fFirstCalledDate<date then fFirstCalledDate:=date;
if date>fLastCalledDate then fLastCalledDate:=date;
end;

and then fill it, report on it:

procedure TForm1.btnSortExampleClick(Sender: TObject);
const phoneSeed:array[0..9] of string = ('111-111-1111','222-222-2222','333-333-3333','444-444-4444','555-555-5555','666-666-6666','777-777-7777','888-888-8888','999-999-9999','000-000-0000');

var TSL:TStringList;
    TPD:TPhoneData;
    i,index:integer;
    phone:string;
begin
randseed;
TSL:=TStringList.Create;
TSL.Sorted:=true;
for i := 0 to 100 do
   begin
   phone:=phoneSeed[random(9)];
   if TSL.Find(phone, index) then
      TPhoneData(TSL.Objects[index]).updateCallData(now-random(100))
   else
      TSL.AddObject(phone,TPhoneData.Create(phone,now,now));
   end;
for i := 0 to 9 do
   begin
   if TSL.Find(phoneSeed[i], index) then
      begin
      TPD:=TPhoneData(TSL.Objects[index]);
      ShowMessage(Format('Phone # %s, first called %s, last called %s, num calls %d', [TPD.PhoneNumber, FormatDateTime('mm-dd-yyyy',TPD.firstCalledDate), FormatDateTime('mm-dd-yyyy',TPD.lastCalledDate), TPD.callCount]));
      end;
   end;
end;
Marshall Fryman
+1  A: 

Ok, here is a double pass old-school method that works well and should scale well (I used this approach against a multi-million record database once, it took time but gave accurate results).

  1. Download and install Turbo Power SysTools -- the sort engine works very well for this process.
  2. create a sort, with a fixed record of phone number, you will be using this to sort.
  3. Loop thru your records, at each order, add the phone number to the sort.
  4. Once the first iteration is done, start popping the phone numbers from the sort, increment a counter if the phone number is the same as the last one read, otherwise report the number and clear your counter.

This process can also be done with any SQL Database, but my experience has been that the sort method is faster than managing a temporary table and generates the same results.

EDIT -- You stated that this is a BTrieve database, why not just create a key on the phone number, sort on that key, then apply step 4 over this table (next instead of pop). Either way you will need to touch every record in your database to get counts, the index/sort just makes your decision process easier.

For example, lets say that you have two tables, one the customer table is where the results will be stored, and the other the orders table. Sort both by the same phone number. Then start a cursor at the top of both lists and then apply the following psuedocode:

Count := 0;
While (CustomerTable <> eof) and (OrderTable <> eof) do
  begin
    comp = comparetext( customer.phone, order.phone );
    while (comp = 0) and (not orderTable eof) do 
      begin
        inc( Count );
        order.next;
        comp = comparetext( customer.phone, order.phone );
      end;
    if comp < 0 then
      begin
        Customer.TotalCount = count;
        save customer;
        count := 0;
        Customer.next;
      end
    else if (Comp > 0) and (not OrderTable EOF) then
      begin
        Order.Next;  // order no customer
      end;  
   end;

// handle case where end of orders reached
if (OrdersTable EOF) and (not CustomersTable EOF) then
  begin
    Customer.TotalCount = count;
    save customer;
  end;

This code has the benefit of walking both lists once. There are no lookups necessary since both lists are sorted the same, they can be walked top to bottom taking action only when necessary. The only requirement is that both lists have something in common (in this example phone number) and both lists can be sorted.

I did not handle the case where there is an order and no customer. My assumption was that orders do not exist without customers and would be skipped for counting.

skamradt
A: 

Instead of a TStringList I would recommend using DeCAL's (on sf.net) DMap to store the items in memory. You could specify the phone is the key and store a Record/Class structure containing the rest of the record.

So your Record class will be:


  TPhoneData = class
    number: string;
    access_count: integer;
    added: TDateTime.
     ...
  end;

Then in code:


  procedure TSomeClass.RegisterPhone(number, phoneData);
  begin
    //FStore created in Constructor as FStore := DMap.Create;
    FStore.putPair([number, phoneData])
  end;
  ...
  procedure TSoemClass.GetPhoneAndIncrement(number);
  var
    Iter: DIterator;
    lPhoneData: TPhoneData;
  begin
    Iter := FStore.locate([number]);
    if atEnd(Iter) then
      raise Exception.CreateFmt('Number %s not found',[number])
    else
    begin
      lPhoneData := GetObject(Iter) as TPhoneData;
      lPhoneData.access_count = lPhoneData.access_count + 1;
      //no need to save back to FStore as it holds a pointer to lPhoneData
    end;
  end;

DMap implements a red/black tree so the data structure sorts the keys for you for free. You can also use a DHashMap for the same affect and (arguably) increased speed.

DeCAL is one of my favourite data structure libraries and would recommend anybody doing in-memory storage operations to have a look.

Hope that helps

Nazar