views:

1401

answers:

4

I have an Informix SQL query which returns a set of rows. It was slightly modified for the new version of the site we've been working on and our QA noticed that the new version returns different results. After investigation we've found that the only difference between two queries were in the number of fields returned.

FROM, WHERE and ORDER BY clauses are identical and the column names in the SELECT part did not affect the results. It was only the number of fields which caused the problem.

Any ideas?

+1  A: 

Adding --+ ORDERED join-order directive fixes the problem by allowing you to get your results in predictable order each time.

The links goes to the description of how the directive works http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls1144.htm

Use the ORDERED join-order directive to force the optimizer to join tables or views in the order in which they appear in the FROM clause of the query.

SELECT --+ ORDERED
   name, title, salary, dname
FROM dept, job, emp WHERE title = 'clerk' AND loc = 'Palo Alto' 
   AND emp.dno = dept.dno 
   AND emp.job= job.job;
Ilya Kochetov
+1  A: 

The Informix SQL engine uses the indices on the tables based on the columns we want to retrieve. When retrieving different columns we were using different indices and therefore getting the results in different order.

Ilya Kochetov
A: 

I take it that by 'fields' you mean the number of rows of output data? In my experience, people use 'fields' and 'columns' as synonyms. Given that the names in the select-list did not change, you presumably were only getting differences in the number of rows returned.

Given the same tables, input data and query, the size and content of the result set should be the same, regardless of query plan or server version. The sequencing of the result set can be different unless you impose an order on the results, but that's legitimate in any DBMS.

If you are getting different sizes of result sets, you should probably be contacting IBM Technical Support. At least one of the result sets is wrong, and wrong results are always serious.

Although hints might help the performance, and the standard advice of 'run UPDATE STATISTICS (with the appropriate sets of options)' usually helps, neither the presence nor absence of indexes should alter the result set when the underlying data is stable. (If the data is changing, there are a variety of issues and complications to worry about.)

Jonathan Leffler
A: 

I can think of only two explanations for this:

  1. An aggregate function is being used, like COUNT(DISTINCT column), or
  2. The additional columns being selected are from a table that's OUTER joined

I understand you don't wish to publish the SQL and table definitions, but that does make it hard to diagnose.

RET