tags:

views:

1451

answers:

8

I've notice that in Oracle, the query

SELECT COUNT(*) FROM sometable;

is very slow for large tables. It seems like the database it actually going through every row and incrementing a counter one at a time. I would think that there would be a counter somewhere in the table how many rows that table has.

So if I want to check the number of rows in a table in Oracle, what is the fastest way to do that?

+1  A: 

try SELECT COUNT(NAME_OF_PRIMARY_KEY) FROM sometable instead

klausbyskov
-1: It will be optimised to a scan of the smallest segment anyway, whether that's the table, the PK index, or another non-null or bitmap index.
David Aldridge
+6  A: 

Hi Eli,

Think about it: the database really has to go to every row to do that. In a multi-user environment my COUNT(*) could be different from your COUNT(*). This would be impractical to have a different counter for each and every session so you have to litteraly count the rows. Most of the time anyway you would have a WHERE clause or a JOIN in your query so your hypothetical counter would be of litte practical value.

There are ways to speed up things however: if you have an INDEX on a NOT NULL column Oracle will count the rows of the index instead of the table. In a proper relational model all tables have a primary key so the COUNT(*) will use the index of the primary key.

Bitmap index have entries for NULL rows so a COUNT(*) will use a bitmap index if there is one available.

Vincent Malgrat
I have a primary key on this table on 4 columns, each of which is NOT NULL, and it still took 41 seconds to do a COUNT(*). Does the index have to be on a single column to make a COUNT(*) query fast?
Eli Courtwright
@Eli: The smaller the index key, the faster the query. If your table is big however, the `COUNT(*)` WILL take time no matter what. See some suggestions of alternatives to `COUNT(*)` in other answers (record the result in a materialized view, make a COUNT on a sample, ...)
Vincent Malgrat
A: 

If you have a primary key constraint on the table oracle will do a fast index scan of that to count the rows, otherwise it must do a table scan to count the rows.

Note that select count(<anything other than *>) does the exact same thing. The row is never actually fetched with count(*), just the existence of the row is noted.

The only way to speed it up from a software standpoint is to have a good PK on the table. Beyond that, you're down to optimizing disk access time on the hardware.

Donnie
It can also choose to scan other indexes other than the PK though.
David Aldridge
+4  A: 

If the table has an index on a NOT NULL column the COUNT(*) will use that. Otherwise it is executes a full table scan. Note that the index doesn't have to be UNIQUE it just has to be NOT NULL.

Here is a table...

SQL> desc big23
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------------------
 PK_COL                                    NOT NULL NUMBER
 COL_1                                              VARCHAR2(30)
 COL_2                                              VARCHAR2(30)
 COL_3                                              NUMBER
 COL_4                                              DATE
 COL_5                                              NUMBER
 NAME                                               VARCHAR2(10)

SQL>

First we'll do a count with no indexes ....

SQL> explain plan for
  2      select count(*) from big23
  3  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /
select * from table)dbms_xplan.display)

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
Plan hash value: 983596667

--------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |  1618   (1)| 00:00:20 |
|   1 |  SORT AGGREGATE    |       |     1 |            |          |
|   2 |   TABLE ACCESS FULL| BIG23 |   472K|  1618   (1)| 00:00:20 |
--------------------------------------------------------------------

Note

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
   - dynamic sampling used for this statement

13 rows selected.

SQL>

No we create an index on a column which can contain NULL entries ...

SQL> create index i23 on big23(col_5)
  2  /

Index created.

SQL> delete from plan_table
  2  /

3 rows deleted.

SQL> explain plan for
  2      select count(*) from big23
  3  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
Plan hash value: 983596667

--------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |  1618   (1)| 00:00:20 |
|   1 |  SORT AGGREGATE    |       |     1 |            |          |
|   2 |   TABLE ACCESS FULL| BIG23 |   472K|  1618   (1)| 00:00:20 |
--------------------------------------------------------------------

Note

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
   - dynamic sampling used for this statement

13 rows selected.

SQL>

Finally let's build the index on the NOT NULL column ....

SQL> drop index i23
  2  /

Index dropped.

SQL> create index i23 on big23(pk_col)
  2  /

Index created.

SQL> delete from plan_table
  2  /

3 rows deleted.

SQL> explain plan for
  2      select count(*) from big23
  3  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------
Plan hash value: 1352920814

----------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Cost (%CPU)| Time     |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |   326   (1)| 00:00:04 |
|   1 |  SORT AGGREGATE       |      |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| I23  |   472K|   326   (1)| 00:00:04 |
----------------------------------------------------------------------

Note

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------
   - dynamic sampling used for this statement

13 rows selected.

SQL>
APC
+4  A: 

Option 1: Have an index on a non-null column present that can be used for the scan. Or create a function-based index as:

create index idx on t(0);

this can then be scanned to give the count.

Option 2: If you have monitoring turned on then check the monitoring view USER_TAB_MODIFICATIONS and add/subtract the relevant values to the table statistics.

Option 3: For a quick estimate on large tables invoke the SAMPLE clause ... for example ...

SELECT 1000*COUNT(*) FROM sometable SAMPLE(0.1);

Option 4: Use a materialized view to maintain the count(*). Powerful medicine though.

um ...

David Aldridge
+2  A: 

You can create a fast refresh materialized view to store the count.

Example:

create table sometable (
id number(10) not null primary key
, name varchar2(100) not null);

create materialized view log on sometable with rowid including new values;

create materialized view sometable_count
refresh on commit
as
select count(*) count
from   sometable;

insert into sometable values (1,'Raymond');
insert into sometable values (2,'Hans');

commit;

select count from sometable_count;

It will slow mutations on table sometable a bit but the counting will become a lot faster.

tuinstoel
+2  A: 

If you want just a rough estimate, you can extrapolate from a sample:

SELECT COUNT(*) * 100 FROM sometable SAMPLE (1);

For greater speed (but lower accuracy) you can reduce the sample size:

SELECT COUNT(*) * 1000 FROM sometable SAMPLE (0.1);

For even greater speed (but even worse accuracy) you can use block-wise sampling:

SELECT COUNT(*) * 100 FROM sometable SAMPLE BLOCK (1);

Jeffrey Kemp
+1  A: 

The fastest way to get a count of a table is exactly what you did. There are no tricks you can do that Oracle doesn't already know about.

There are somethings you have not told us. Namely why do you think think this should be faster?

For example:

  1. Have you at least done an explain plan to see what Oracle is doing?
  2. How many rows are there in this table?
  3. What version of Oracle are you using? 8,9,10,11 ... 7?
  4. Have you ever run database statistics on this table?
  5. Is this a frequently updated table or batch loaded or just static data?
  6. Is this the only slow COUNT(*) you have?
  7. How long does SELECT COUNT(*) FROM Dual take?

I'll admit I wouldn't be happy with 41 seconds but really WHY do you think it should be faster? If you tell us the table has 18 billion rows and is running on the laptop you bought from a garage sale in 2001, 41 seconds is probably not that far outside "good as it will get" unless you get better hardware. However if you say you are on Oracle 9 and you ran statistics last summer well you'll probably get a different suggestions.

David