'find ./ -name *.jpg'

I am trying to optimize the 'find' command for the statement above.

This is the method that handles the '-name' predicate in the find implementation:


static boolean
pred_name_common (const char *pathname, const char *str, int flags)
{
   boolean b;
   char *base = base_name (pathname);

   strip_trailing_slashes (base);
   b = fnmatch (str, base, flags) == 0;
   free (base);
   return b;
}


Since I am looking for file extensions and want to avoid regular-expression-based string matching, I replaced 'b = fnmatch (str, base, flags) == 0;' with the following statements:

int strLen = strlen(base);

b = FNM_NOMATCH;

if (strLen >= 4 && (str[3] == base[strLen]) &&
    (str[2] == base[strLen - 1]) &&
    (str[1] == base[strLen - 2]) &&
    (str[0] == base[strLen - 3]))
{
   b = 0;
}

After this change I expected some performance gain, but I don't see any.

  1. Am I doing something wrong?
  2. Is there a better way to optimize 'find' to search only for file extensions?
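
For comparison, here is a minimal sketch of a suffix check, assuming the goal is simply "does the base name end in `.jpg`". As far as I can tell, the replacement above has three problems: `base[strLen]` is the terminating NUL rather than the last character, `str` is the whole pattern `*.jpg` (so `str[0]` is `*`), and since `b` is a boolean, setting it to `FNM_NOMATCH` (a nonzero value) reports a match for non-matching names. The helper name `has_suffix` is hypothetical and not part of findutils.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not part of findutils): returns true when
   `name` ends with `suffix`, for example the suffix ".jpg". */
bool
has_suffix (const char *name, const char *suffix)
{
  size_t name_len = strlen (name);
  size_t suffix_len = strlen (suffix);

  /* A name shorter than the suffix cannot end with it. */
  if (name_len < suffix_len)
    return false;

  /* The last character is name[name_len - 1]; name[name_len] is the
     terminating NUL, so the comparison starts suffix_len bytes
     before the end. */
  return memcmp (name + name_len - suffix_len, suffix, suffix_len) == 0;
}
```

In `pred_name_common` this would be used as something like `b = has_suffix (base, ".jpg");`, which keeps `b` true on a match, the same convention as the original `fnmatch (...) == 0` line.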
+4  A: 

I doubt that the regex matching is the bottleneck. Since find traverses the filesystem, the overhead is probably in disk seek times and, in the case of an in-memory filesystem, in system calls and the resulting context switches.

Thomas
If the overhead were only in disk seek times, CPU usage should not be 30%. There are only two major parts in find: 1. filesystem iteration (a low-CPU operation) and 2. parsing files (a high-CPU operation).
Girish Kolari
@kolari: you should do some profiling and find out where the time is actually spent. Just optimizing a random part of the program won't help much.
unbeknown
+1 that the bottleneck is not in the pattern matching. Also, the -name switch does not even do full regex, it just does the much simpler typical shell glob expansion.
netjeff
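
One rough way to check the claim in this thread that pattern matching is not the bottleneck is to time the two matchers in isolation, away from any filesystem access. The sketch below is only an illustration under stated assumptions: the struct `match_timing` and the function `time_matchers` are hypothetical names, the pattern is hard-coded to `*.jpg`, and `clock()` measures CPU time coarsely. Profiling the real find binary remains the more trustworthy measurement.

```c
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical result type for this sketch. */
struct match_timing
{
  size_t fnmatch_hits;     /* names matched by fnmatch ("*.jpg", ...) */
  size_t suffix_hits;      /* names matched by the plain suffix check */
  double fnmatch_seconds;  /* CPU time spent in the fnmatch loop */
  double suffix_seconds;   /* CPU time spent in the suffix loop */
};

/* Runs both matchers over the same list of base names `rounds` times
   and reports hit counts and CPU time for each. */
struct match_timing
time_matchers (const char *const *names, size_t count, int rounds)
{
  struct match_timing t = { 0, 0, 0.0, 0.0 };
  const char *suffix = ".jpg";
  size_t suffix_len = strlen (suffix);
  clock_t start;

  /* Glob matching, as the -name predicate does via fnmatch. */
  start = clock ();
  for (int r = 0; r < rounds; r++)
    for (size_t i = 0; i < count; i++)
      if (fnmatch ("*.jpg", names[i], 0) == 0)
        t.fnmatch_hits++;
  t.fnmatch_seconds = (double) (clock () - start) / CLOCKS_PER_SEC;

  /* Plain suffix comparison on the same names. */
  start = clock ();
  for (int r = 0; r < rounds; r++)
    for (size_t i = 0; i < count; i++)
      {
        size_t len = strlen (names[i]);
        if (len >= suffix_len
            && memcmp (names[i] + len - suffix_len, suffix, suffix_len) == 0)
          t.suffix_hits++;
      }
  t.suffix_seconds = (double) (clock () - start) / CLOCKS_PER_SEC;

  return t;
}
```

If both loops take a small fraction of the total run time of find over the same number of files, the matching step is unlikely to be where the time goes, and the hit counts give a quick check that the two matchers agree.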