views:

124

answers:

6

I have an app using an Oracle 11g database. I have a fairly large table (~50k rows) which I query thus:

SELECT omg, ponies FROM table WHERE x = 4

Field x was not indexed, I discovered. This query happens a lot, but the thing is that the performance wasn't too bad. Adding an index on x did make the queries approximately twice as fast, which is far less than I expected. On, say, MySQL, it would've made the query ten times faster, at the very least. (Edit: I did test this on MySQL, and there saw a huge difference.)

I'm suspecting Oracle adds some kind of automatic index when it detects that I query a non-indexed field often. Am I correct? I can find nothing even implying this in the docs.

+1  A: 

Does Oracle 11g automatically index fields frequently used for full table scans?

No.

cagcowboy
A: 

You should have a look at the estimated execution plan for your query, before and after the index has been created. (Also, make sure that the statistics are up-to-date on your table.) That will tell you what exactly is happening and why performance is what it is.

50k rows is not that big of a table, so I wouldn't be surprised if the performance was decent even without the index. Thus adding the index to equation can't really bring much improvement to query execution speed.

Tommi
As I said in my comment to APC http://stackoverflow.com/questions/2935782/does-oracle-11g-automatically-index-fields-frequently-used-for-full-table-scans/2935973#2935973 , I tested this on another DBMS and there the difference was an order of magnitude. Maybe that says more about the other DBMS's table scan performance, though.
gustafc
+1  A: 

"Adding an index on x did make the queries approximately twice as fast, which is far less than I expected. On, say, MySQL, it would've made the query ten times faster, at the very least."

How many distinct values of X are there? Are they clustered in one part of the table or spread evenly throughout it?

Indexes are not some voodoo device: they must obey the laws of physics.

edit

"Duplicates could appear, but as it is, there are none."

If that column has neither a unique constraint nor a unique index the optimizer will choose an execution path on the basis that there could be duplicate values in that column. This is the value of declaring the data model as accuratley as possible: the provision of metadata to the optimizer. Keeping the statistics up to date is also very useful in this regard.

APC
The field is effectively unique. All rows are selected equally often, so it wasn't blind luck that happened to improve the performance. I tested a similar scenario on MySQL and there I saw the ten-fold difference in performance I expected (I know, apples and oranges).
gustafc
@gustafc - "The field is effectively unique" - is that the same as unique?
APC
Yes, as it happens. Duplicates *could* appear, but as it is, there are none.
gustafc
+2  A: 

As has already been indicated, Oracle11g does NOT dynamically build indexes based on prior experience. It is certainly possible and indeed happens often that adding an index under the right conditions will produce the order of magnitude improvement you note.

But as has also already been noted, 50K (seemingly short?) rows is nothing to Oracle. The Oracle database in fact has a great deal of intelligence that allows it to scan data without indexes most efficiently. Every new release of the Oracle RDBMS gets better at moving large amounts of data. I would suggest to you that the reason Oracle was so close to its "best" timing even without the index as compared to MySQL is that Oracle is just a more intelligent database under the covers.

However, the Oracle RDBMS does have many features that touch upon the subject area you have opened. For example:

10g introduced a feature called AUTOMATIC SQL TUNING which is exposed via a gui known as the SQL TUNING ADVISOR. This feature is intended to analyze queries on its own, in depth and includes the ability to do WHAT-IF analysis of alternative query plans. This includes simulation of indexes which do not actually exist. However, this would not explain any performance differences you have seen because the feature needs to be turned on and it does not actually build any indexes, it only makes recommendations for the DBA to make indexes, among other things.

11g includes AUTOMATIC STATISTICS GATHERING which when enabled will automatically collect statistics on database objects as it deems necessary based on activity on those objects.

Thus the Oracle RDBMS is doing what you have suggested, dynamically altering its environment on its own based on its experience with your workload over time in order to improve performance. Creating indexes on the fly is just not one of the things is does yet. As an aside, this has been hinted to by Oracle in private sevearl times so I figure it is in the works for some future release.

Kevin Meade
Very informative. Thank you!
gustafc
A: 

In regards the MySQL issue, what storage engine you use can make a difference.

"MyISAM relies on the operating system for caching reads and writes to the data rows while InnoDB does this within the engine itself"

Oracle will cache the table/data rows, so it won't need to hit the disk. depending on the OS and hardware, there's a chance that MySQL MyISAM had to physically read the data off the disk each time.

Gary
A: 

~50K rows, depending greatly on how big each row is, could conceivably be stored in under 1000 blocks, which could be quickly read into the buffer cache by a full table scan (FTS) in under 50 multi-block reads.

Adding appropriate index(es) will allow queries on the table to scale smoothly as the data volume and/or access frequency goes up.

Jeffrey Kemp