I'm using Delphi 2007 and have an application that reads log files from several places over an internal network and displays the exceptions found in them. Those directories sometimes contain thousands of log files. The application has an option to read only log files from the n latest days, but it can also use any datetime range.

The problem is that the first time a log directory is read, it can be very slow (several minutes). The second time it is considerably faster.

I wonder how I can optimize my code to read the log files as fast as possible? I'm using a TStringList (vCurrentFile) to hold each file in memory; it is loaded from a TFileStream, as I think this is faster.

Here is some code:

Refresh: The main loop that reads the log files

// In this function the log files are searched for exceptions. If an exception is found, it is stored in an object.
// The exceptions are then shown in the grid
procedure TfrmMain.Refresh;
var
  FileData: TSearchRec;   // Used for the file search; holds data about the found file
  vPos, i, PathIndex : Integer;
  vCurrentFile: TStringList;
  vDate: TDateTime;
  vFileStream: TFileStream;
begin
  tvMain.DataController.RecordCount := 0;
  vCurrentFile := TStringList.Create;
  memCallStack.Clear;

  try
    for PathIndex := 0 to fPathList.Count - 1 do                      // Loop 0. This loops until all directories are searched through
    begin
      if FindFirst(fPathList[PathIndex] + '\*.log', faAnyFile, FileData) = 0 then
      begin
        repeat                                                    // Loop 1. Loops while there are .log files in the folder (CurrentPath)
          vDate := FileDateToDateTime(FileData.Time);

          if chkLogNames.Items[PathIndex].Checked and FileDateInInterval(vDate) then
          begin
            tvMain.BeginUpdate;       // Speed up the grid - delay GUI changes until EndUpdate
            try
              fPathPlusFile := fPathList[PathIndex] + '\' + FileData.Name;
              vFileStream := TFileStream.Create(fPathPlusFile, fmOpenRead or fmShareDenyNone);
              try
                vCurrentFile.LoadFromStream(vFileStream);
              finally
                vFileStream.Free;     // Free (not Destroy), and release the stream even if loading fails
              end;

              if vCurrentFile.Count > 0 then
              begin
                fUser := FindDataInRow(vCurrentFile[0], 'User');          // Returns the string after 'User' up to the next ' '
                fComputer := FindDataInRow(vCurrentFile[0], 'Computer');  // Returns the string after 'Computer' up to the next ' '
              end;

              Application.ProcessMessages;                        // Give some priority to the user interface

              if not CancelForm.IsCanceled then
              begin
                if rdException.Checked then
                  for i := 0 to vCurrentFile.Count - 1 do
                  begin
                    vPos := AnsiPos(MainExceptionToFind, vCurrentFile[i]);
                    if vPos > 0 then
                      UpdateView(vCurrentFile[i], i, MainException);

                    vPos := AnsiPos(ExceptionHintToFind, vCurrentFile[i]);
                    if vPos > 0 then
                      UpdateView(vCurrentFile[i], i, HintException);
                  end
                else if rdOtherText.Checked then
                  for i := 0 to vCurrentFile.Count - 1 do
                  begin
                    vPos := AnsiPos(txtTextToSearch.Text, vCurrentFile[i]);
                    if vPos > 0 then
                      UpdateView(vCurrentFile[i], i, TextSearch);
                  end;
              end;
            finally
              tvMain.EndUpdate;       // Now the GUI can be updated
            end;
          end;
        until (FindNext(FileData) <> 0) or CancelForm.IsCanceled; // End Loop 1
        FindClose(FileData);          // Release the search handle allocated by FindFirst
      end;
    end;                                                          // End Loop 0
  finally
    FreeAndNil(vCurrentFile);
  end;
end;

UpdateView method: Adds one row to the display grid

{: Update the grid with one exception}
procedure TfrmMain.UpdateView(aLine: string; const aIndex, aType: Integer);
var
  vExceptionText: String;
  vDate: TDateTime;
begin
  if ExceptionDateInInterval(aLine, vDate) then     // Parses the exception date from the beginning of the line
  begin
    case aType of
      MainException: vExceptionText := 'Exception';
      HintException: vExceptionText := 'Exception Hint';
      TextSearch:    vExceptionText := 'Text Search';
    end;

    SetRow(aIndex, vDate, ExtractFilePath(fPathPlusFile), ExtractFileName(fPathPlusFile), fUser, fComputer, aLine, vExceptionText);
  end;
end;

Method to decide if a row is in the date range:

{: Compares the exact exception time against the filters
@desc 2 cases:    1. Last n days
                  2. From - to range}
function TfrmMain.ExceptionDateInInterval(var aTestLine: String; out aDateTime: TDateTime): Boolean;
var
  vTmpDate, vTmpTime: String;
  vDate, vTime: TDateTime;
  vIndex: Integer;
begin
  aDateTime := 0;
  vTmpDate := Copy(aTestLine, 1, 8);    // Delphi strings are 1-based, so copy from index 1, not 0
  vTmpTime := Copy(aTestLine, 10, 9);

  Insert(DateSeparator, vTmpDate, 5);   // e.g. '20090915' -> '2009-0915'
  Insert(DateSeparator, vTmpDate, 8);   //      '2009-0915' -> '2009-09-15'

  if TryStrToDate(vTmpDate, vDate, fFormatSetting) and TryStrToTime(vTmpTime, vTime) then
    aDateTime := vDate + vTime;

  Result := (rdLatest.Checked and (aDateTime >= (Now - spnDateLast.Value))) or
            (rdInterval.Checked and (aDateTime >= dtpDateFrom.Date) and (aDateTime < dtpDateTo.Date + 1));  // "+ 1" so the whole To day is included

  if Result then
  begin
    vIndex := AnsiPos(']', aTestLine);

    if vIndex > 0 then
      Delete(aTestLine, 1, vIndex + 1);   // Strip the '[timestamp]' prefix plus the space after it
  end;
end;

Test if the file date is inside the range:

{: Returns True if the log time is within the filter range
@desc The purpose is to skip log files that are not worth parsing (wrong day)}
function TfrmMain.FileDateInInterval(aFileTimeStamp: TDate): Boolean;
begin
  Result := (rdLatest.Checked and (Int(aFileTimeStamp) >= Int(Now - spnDateLast.Value))) or
            (rdInterval.Checked and (Int(aFileTimeStamp) >= Int(dtpDateFrom.Date)) and (Int(aFileTimeStamp) <= Int(dtpDateTo.Date)));
end;
+2  A: 

The problem is not your logic, but the underlying file system.

Most file systems get very slow when you put many files in a single directory. This is very bad with FAT, but NTFS suffers from it too, especially when a directory holds thousands of files.

The best you can do is reorganize those directory structures, for instance by age.

Then have at most a couple of hundred files in each directory.
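
For illustration, a minimal sketch of such an age-based reorganization; the folder layout, the 30-day cutoff, and all names here are assumptions, not part of the original application:

// Rough sketch (assumed names and cutoff): move log files older than
// 30 days into per-month archive folders, e.g. '...\archive\2009-09'.
// uses Windows, SysUtils
procedure ArchiveOldLogs(const ALogDir: string);
var
  vSearch: TSearchRec;
  vFileDate: TDateTime;
  vTargetDir: string;
begin
  if FindFirst(ALogDir + '\*.log', faAnyFile, vSearch) = 0 then
  begin
    repeat
      vFileDate := FileDateToDateTime(vSearch.Time);
      if vFileDate < Now - 30 then                    // Older than 30 days
      begin
        vTargetDir := ALogDir + '\archive\' + FormatDateTime('yyyy-mm', vFileDate);
        ForceDirectories(vTargetDir);                 // Create the month folder if needed
        MoveFile(PChar(ALogDir + '\' + vSearch.Name),
                 PChar(vTargetDir + '\' + vSearch.Name));
      end;
    until FindNext(vSearch) <> 0;
    FindClose(vSearch);
  end;
end;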

--jeroen

Jeroen Pluimers
That's true, but this is the reality. In my case, files older than 30 days are seldom searched, so maybe I should move older files to an archive directory with a scheduled bat file to speed up searching.
Roland Bengtsson
The reason the second read is faster is that the server has by then cached the directory information in memory. So there is nothing you can do from the Delphi side to really speed it up (you might gain a little, but users won't notice the effect). So the best approach is indeed to archive files that are no longer needed.
Jeroen Pluimers
A: 

How fast do you want it to be? If you want it really fast, you need to use something besides Windows networking to read the files. The reason is that even if you only want the last line of a log file (or just the lines added since you last read it), you still have to read the whole file again.

In your question you said the problem is that enumerating the directory listing is slow. That is your first bottleneck. If you want to be really fast, you need to either switch to HTTP or add some sort of log server on the machine where the log files are stored.

The advantage of using HTTP is that you can do a range request and fetch only the lines added to the log file since you last requested it. That really improves performance, since you transfer less data (especially if you enable HTTP compression) and have less data to process on the client side.
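
For example, a minimal sketch of such a range request using Indy's TIdHTTP (bundled with Delphi 2007); the URL and the way you track the last-read offset are assumptions:

// Rough sketch: fetch only the bytes appended since the last read.
// uses IdHTTP, Classes, SysUtils
function FetchNewLogData(const AUrl: string; AFromByte: Int64): string;
var
  vHttp: TIdHTTP;
  vBody: TStringStream;
begin
  vHttp := TIdHTTP.Create(nil);
  vBody := TStringStream.Create('');
  try
    // Ask the server for everything from AFromByte to the end of the file
    vHttp.Request.CustomHeaders.Add(Format('Range: bytes=%d-', [AFromByte]));
    vHttp.Get(AUrl, vBody);
    // A 206 Partial Content response means the server honoured the range;
    // a 200 means it ignored it and sent the whole file
    Result := vBody.DataString;
  finally
    vBody.Free;
    vHttp.Free;
  end;
end;

The caller would remember how many bytes it has already seen and pass that as AFromByte on the next poll.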

If you add a log server of some sort, that server can do the processing on the server side, where it has native access to the data, and return only the rows in the requested date range. A simple way of doing that may be to put your logs into a SQL database and run queries against it.
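
As an illustration of the database approach, here is the kind of date-range query the client could run; the connection setup, table, and column names are all hypothetical:

// Rough sketch (hypothetical LogLines table): let the database return
// only the rows in the requested date range.
// uses ADODB
procedure QueryExceptions(AQuery: TADOQuery; const AFrom, ATo: TDateTime);
begin
  AQuery.SQL.Text :=
    'SELECT LogTime, Computer, UserName, ExceptionText ' +
    'FROM LogLines ' +
    'WHERE LogTime >= :FromDate AND LogTime < :ToDate ' +
    'ORDER BY LogTime';
  AQuery.Parameters.ParamByName('FromDate').Value := AFrom;
  AQuery.Parameters.ParamByName('ToDate').Value := ATo;
  AQuery.Open;   // Only the matching rows cross the network
end;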

So, how fast do you want to go?

Jim McKeeth
Yes, I was afraid there was not much to be done in the program code. We actually had a project some years ago to store the logs in a database and access them via SQL, but it was never finished. As you said, though, if we spend a good deal of time preparing the logs on the servers, we could probably get it quite fast. But this is not a high-priority project :) Thanks anyway for your answer.
Roland Bengtsson
With some more coding effort you can avoid reading the whole file every time. If you are interested only in the last part of the file, you can seek directly to the last few KB and read from there.
PA
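
A minimal sketch of that tail-read idea; the window size and the function name are assumptions for illustration:

// Rough sketch: read only the last ABytes of a file instead of the whole file.
// uses Classes, SysUtils
function ReadTail(const AFileName: string; ABytes: Integer): string;
var
  vStream: TFileStream;
  vStart: Int64;
begin
  Result := '';
  vStream := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyNone);
  try
    vStart := vStream.Size - ABytes;
    if vStart < 0 then
      vStart := 0;                      // File smaller than the window: read it all
    vStream.Position := vStart;
    SetLength(Result, Integer(vStream.Size - vStart));
    if Length(Result) > 0 then
      vStream.ReadBuffer(Result[1], Length(Result));
  finally
    vStream.Free;
  end;
end;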