views:

89

answers:

1

Given two strings with * wildcards, I would like to know if a string could be created that would match both.

For example, these two are a simple case of overlap:

  1. Hello*World
  2. Hel*

But so are all of these:

  1. *.csv
  2. reports*.csv
  3. reportsdump.csv

Is there an algorithm published for doing this? Or perhaps a utility function in Windows or a library I might be able to call or copy?

+2  A: 

Since every glob can be written as a regular expression and the intersection of two regular expressions can be found (unless they aren't really regular, but they would be in this case), you can find the intersection of two globs by transforming them into regular expressions and then finding the intersection of those. So you can find out whether two globs intersect by finding the intersection of the regular expressions and checking whether it's empty.

However since globs are more limited than regular expression, there is a much easier way:

Let's call the two globs g1 and g2. They intersect iff

  1. Both g1 and g2 are empty or only contain wildcards.
  2. Neither g1 nor g2 are empty and one of the following conditions is true (let c1 be the first character of g1 and t1 the string containing the remaining characters - same for g2 with c2 and t2):
    1. c1 and c2 are equal and t1 and t2 intersect
    2. c1 and/or c2 is a wildcard and t1 intersects with g2
    3. c1 and/or c2 is a wildcard and g1 intersects with t2

An example implementation in haskell:

intersect g1          []          = all (== '*') g1
intersect []          g2          = all (== '*') g2
intersect g1@('*':t1) g2@(c2:t2)  = intersect g1 t2 || intersect t1 g2
intersect g1@(c1:t1)  g2@('*':t2) = intersect t1 g2 || intersect g1 t2
intersect    (c1:t1)     (c2:t2)  = c1 == c2        && intersect t1 t2

This algorithm isn't particular efficient if the globs contain a lot of wildcards, but it's very easy to implement and since you're likely planning to use this with filenames, I doubt you'll have globs longer than 1000 chars.

sepp2k