views:

768

answers:

1

Scenario:

I am using a T-SQL stored proc (Sql Server Management Studio) to return search matches for text documents using the MS Indexing Service and this (simplified) query:

SELECT * 
FROM openquery( 
  filesystem2, 
  'SELECT 
     Path, Rank, Filename
   FROM 
     SCOPE('' "e:\test\documents" '')
   WHERE 
     CONTAINS('' FORMSOF (INFLECTIONAL, "test" ) '')
   ') b

This query stopped working properly a few days ago. Though not fully substantiated, it seems that the interaction between the Property Cache and the Master Index is not working properly because I can find the desired documents by either,

1) removing the SCOPE parameter (i.e. just using "FROM SCOPE()" as the FROM clause

2) removing the WHERE clause (and keeping the SCOPE function as is)

So, I can "find" the desired documents by just content or by just locale, but not by using both together.

One option would be to reindex the catalog, but reindexing is, for now, only an option of last resort.

That being said, I rewrote the query to exclude the specified SCOPE and include an additional WHERE clause:

SELECT * 
FROM openquery( 
  filesystem2, 
  'SELECT 
     Path, Rank, Filename
   FROM 
     SCOPE()
   WHERE 
     CONTAINS('' FORMSOF (INFLECTIONAL, "test" ) '') and 
     Path like ''%e:\test\documents%''
   ') b

This query returns the proper documents when searching. However, I was/am concerned about a potential performance hit using the LIKE keyword. So, I investigated each query's execution plan, but they were exactly the same...which tells me 1 of two things:

1) the Querying Component of the Indexing Service optimizes both queries in such a way as to make them equal.

2) The query analyzer does not provide accurate feedback for remote queries when no DB tables are referenced.

Questions (in no particular order). Does anyone have any insight into the following?:

1) What could cause the behavior of the original problem between the Property Cache and the Master Index described in the scenario above?

2) Regarding the Execution plan,

a) Would the Querying Component process/optimize both queries the same?

b) Can Sql Server Management Studio provide execution plan feedback for openquery queries that do not reference any DB tables?

3) Finally, Which query is more efficient/faster, and why?

a) i.e. should I use the second one because it solves my problem?

Thank you!

+1  A: 

null values might be a problem. I'm not sure about this exact case, but sometimes including "where xxx is not null" can make a real difference.

Another option sometimes is to put where conditions on the table after the open query select aaa, bbb from openquery(.....) where aaa = zzz. (For better style, select the columns you need instead of *.

As for which is more efficient or faster, you may have to wrap the query with a simple timing process and judge for yourself if you can't use the metrics provided by the SQL Management default messages.

In the end, as long as your query works and does not break any standards that you have set for your project, yes - use it.

Doug L.
Thanks for the insight. I have fiddled with the queries a bit more using your suggestions...and decided to go ahead and use the solution.
dda