views:

1627

answers:

3

I used to think that Oracle does not index a row when one of the column values is null.

Some simple experimentation shows this to be not the case. I was able to run some queries unexpectedly accessing only indexes even though some columns were nullable (which of course was a pleasant surprise).

A Google search led to some blogs with conflicting answers: I have read that a row gets indexed unless all indexed columns are null, and also that a row gets indexed unless the leading column value for the index is null.

So, in what cases does a row not enter an index? Is this Oracle version specific?

+7  A: 

If any indexed column contains a non-null value that row will be indexed. As you can see in the following example only one row doesn't get indexed and that's the row which has NULL in both indexed columns. You can also see that Oracle definitely does index a row when the leading index column has a NULL value.

SQL> create table big_table as
  2       select object_id as pk_col
  3               , object_name as col_1
  4               , object_name as col_2
  5  from all_objects
  6  /

Table created.

SQL> select count(*) from big_table
  2  /

  COUNT(*)
----------
     69238

SQL> insert into big_table values (9999990, null, null)
  2  /

1 row created.

SQL> insert into big_table values (9999991, 'NEW COL 1', null)
  2  /

1 row created.

SQL> insert into big_table values (9999992, null, 'NEW COL 2')
  2  /

1 row created.

SQL> select count(*) from big_table
  2  /

  COUNT(*)
----------
     69241

SQL> create index big_i on big_table(col_1, col_2)
  2  /

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'BIG_TABLE', cascade=>TRUE)

PL/SQL procedure successfully completed.


SQL> select num_rows from user_indexes where index_name = 'BIG_I'
  2  /

  NUM_ROWS
----------
     69240

SQL> set autotrace traceonly exp
SQL>
SQL> select pk_col from big_table
  2  where col_1 = 'NEW COL 1'
  3  /

Execution Plan
----------------------------------------------------------
Plan hash value: 1387873879

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |     2 |    60 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BIG_TABLE |     2 |    60 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BIG_I     |     2 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("COL_1"='NEW COL 1')

SQL> select pk_col from big_table
  2  where col_2 = 'NEW COL 2'
  3  /

Execution Plan
----------------------------------------------------------
Plan hash value: 3993303771

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     2 |    60 |   176   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| BIG_TABLE |     2 |    60 |   176   (1)| 00:00:03 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COL_2"='NEW COL 2')

SQL> select pk_col from big_table
  2  where col_1 is null
  3  and col_2 = 'NEW COL 2'
  4  /

Execution Plan
----------------------------------------------------------
Plan hash value: 1387873879

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |     1 |    53 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BIG_TABLE |     1 |    53 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BIG_I     |     2 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("COL_1" IS NULL AND "COL_2"='NEW COL 2')
       filter("COL_2"='NEW COL 2')

SQL> select pk_col from big_table
  2  where col_1 is null
  3  and col_2 is null
  4  /

Execution Plan
----------------------------------------------------------
Plan hash value: 3993303771

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     1 |    53 |   176   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| BIG_TABLE |     1 |    53 |   176   (1)| 00:00:03 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COL_1" IS NULL AND "COL_2" IS NULL)

SQL>

This example run on Oracle 11.1.0.6. But I'm pretty confident it holds true for all versions.

APC
+1 for the nice example
Rob van Wijk
+1  A: 

In addition to APC's answer, NULLS are indexed in bitmap indexes.

WW
+3  A: 

And in addition to APC's answer: when you want to index a NULL value, you can add a constant expression to the index.

Example:

SQL> select * from v$version where rownum = 1
  2  /

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi

1 rij is geselecteerd.

SQL> create table t (id,status,fill)
  2  as
  3   select level
  4        , nullif(ceil((level-1)/1000),0)
  5        , lpad('*',1000,'*')
  6     from dual
  7  connect by level <= 10000
  8  /

Tabel is aangemaakt.

SQL> select status
  2       , count(*)
  3    from t
  4   group by status
  5  /

    STATUS   COUNT(*)
---------- ----------
         1       1000
         2       1000
         3       1000
         4       1000
         5       1000
         6       1000
         7       1000
         8       1000
         9       1000
        10        999
                    1

11 rijen zijn geselecteerd.

SQL> create index i_status on t(status)
  2  /

Index is aangemaakt.

SQL> exec dbms_stats.gather_table_stats(user,'t',cascade=>true)

PL/SQL-procedure is geslaagd.

SQL> set autotrace traceonly
SQL> select *
  2    from t
  3   where status is null
  4  /

1 rij is geselecteerd.


Uitvoeringspan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=201 Card=1 Bytes=1007)
   1    0   TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=201 Card=1 Bytes=1007)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        364  consistent gets
          0  physical reads
          0  redo size
       1265  bytes sent via SQL*Net to client
        242  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Please note the full table scan and the 364 consistent gets.

SQL> set autotrace off
SQL> create index i_status2 on t(status,1)
  2  /

Index is aangemaakt.

SQL> set autotrace traceonly
SQL> select *
  2    from t
  3   where status is null
  4  /

1 rij is geselecteerd.


Uitvoeringspan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=1007)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=1 Card=1 Bytes=1007)
   2    1     INDEX (RANGE SCAN) OF 'I_STATUS2' (INDEX) (Cost=1 Card=1)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          1  physical reads
          0  redo size
       1265  bytes sent via SQL*Net to client
        242  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

And now it uses the index and has only 3 consistent gets.

Regards, Rob.

Rob van Wijk