views:

150

answers:

6

Maybe someone can explain this to me, but when querying a data table from Oracle, where multiple records exist for a key (say a customer ID), the record that appears first for that customer can vary if there is no implicit "order by" statement enforcing the order by say an alternate field such as a transaction type. So running the same query on the same table could yield a different record ordering than from 10 minutes ago.

E.g., one run could yield:

Cust_ID, Transaction_Type
123 A
123 B

Unless an "order by Transaction_Type" clause is used, Oracle could arbitrarily return the following result the next time the query is run:

Cust_ID, Transaction_Type
123 B
123 A

I guess I was under the impression that there was a database default ordering of rows in Oracle which (perhaps) reflected the physical ordering on the disk medium. In other words, an arbitrary order that is immutable and would guarantee the same result when a query is rerun.

Does this have to do with the optimizer and how it decides where to most efficiently retrieve the data?

Of course the best practice from a programming perspective is to force whatever ordering is required, I was just a little unsettled by this behavior.

+5  A: 

There is no default ordering, ever. If you don't specify ORDER BY, you can get the same result the first 10000 times, then it can change.

Note that this is also true even with ORDER BY for equal values. For example:

Col1 Col2
1    1
2    1
3    2
4    2

If you use ORDER BY Col2, you still don't know if row 1 or 2 will come first.

egrunin
+13  A: 

The order of rows returned to the application from a SELECT statement is COMPLETELY ARBITRARY unless otherwise specified. If you want, need, or expect rows to return in a certain order, it is the user's responsibility to specify such an order.

(Caveat: Some versions of Oracle would implicitly sort data in ascending order if certain operations were used, such as DISTINCT, UNION, MINUS, INTERSECT, or GROUP BY. However, as Oracle has implemented hash sorting, the nature of the sort of the data can vary, and lots of SQL relying on that feature broke.)

Adam Musch
+1 (or more accurately, lots of SQL relying on that feature were *exposed as having always been broken*)
Jeffrey Kemp
I almost wish sometimes the ORGANIZATION clause was required, so people would realize that by default in Oracle, tables are heap-organized.
Adam Musch
@Adam Musch - +1 - but this would require that people who have used Oracle for years (self included) would have to understand what the ORGANIZATION clause of CREATE TABLE means (and thank you, your comment prompted me to go read up on this).
Bob Jarvis
Thanks all for your input. I think the reason this behavior was counterintuitive to me as I am a statistician and typically would pull data into a SAS dataset (which is like the analog of a table in Oracle.) Once such data is pulled into the SAS format, the ordering (however arbitrary initially) is fixed as the data are cached to disk in a specific order. Again, I would never say that I am relying on this default order in my programming, just that a piece of code would return an identical result each time even if no order is specified/forced/requested. Thanks for the clarification.
jgunnink
+4  A: 

The simple answer is that the SQL standard says that there is no default order for queries that do not have an ORDER BY statement, so you should never assume one.

The real reason would probably relate to the hashes assigned to each row as it is pulled into the record set. There is no reason to assume consistent hashing.

Matthew Flynn
+5  A: 

Just image the rows in a table like balls in a basket. Do the balls have an order?

I dont't think there is any DBMS that guarantees an order if ORDER BY is not specified.

Some might always return the rows in the order they were inserted, but that is an implementation side effect.

Some execution plans might cause the result set to be ordered even without an ORDER BY, but again this is an implementation side-effect that you should not rely on.

a_horse_with_no_name
+3  A: 

If an ORDER BY clause is not present the database (not just Oracle - any relational database) is free to return rows in whatever order it happens to find them. This will vary depending on the query plan chosen by the optimizer.

If the order in which the rows are returned matters you must use an ORDER BY clause. You may sometimes get lucky and the rows will come back in the order you want them to be even without an ORDER BY, but there is no guarantee that A) you will get lucky on other queries, and B) the order in which the rows are returned tomorrow will be the same as the order in which they're returned today.

In addition, updates to the database product may change the behavior of queries. We had to scramble a bit when doing a major version upgrade last year when we found that Oracle 10 returned GROUP BY results in a different order than did Oracle 9. Reason - no ORDER BY clause.

ORDER BY - when the order of the returned data really matters.

Bob Jarvis
Wow, group by doesn't automatically imply order by. Thanks, I'll have to remember that. This is another instance where SAS SQL (called proc sql) differs from Oracle SQL. Proc SQL would actually ignore an order by as superfluous when a group by clause is present.
jgunnink
A: 

if you don't use ORDER BY, the order is arbitrary; however, dependent on phisical storage and memory aspects. so, if you repeat the same query hundreds of times in 10 minutes, you will get almost the same order everytime, because probably nothing changes.

Things that could change the "noorder order" are:

  • the executing plan - if is changed(you have pointed that)
  • inserts and deletes on the tables involved in the query.
  • other things like presence in memory of the rows.(other querys on other tables could influence that)
Florin Ghita
You're right, for the most part nothing will change. I just noticed that in a query returning a million rows, a couple thousand might be different between successive runs. My co-worker who is more expert on Oracle has explained to me that even if the code run and tables referenced are static, the results could based on a number of run-time factors, some of which you have mentioned. It actually gives me a new appreciation for how much Oracle is doing behind the scenes to maximize the speed of retrieval.
jgunnink