tags:

views:

878

answers:

6

I did this tests and the results seems the count function scale linearly. I have another function relying strongly in the efficiency to know if there are any data, so I would like to know how to replace this select count(*) with another more efficient (maybe constant?) query or data structure.

psql -d testdb -U postgres -f truncate_and_insert_1000_rows.sql > NUL

psql -d testdb -U postgres -f count_data.sql

--------------------------------------------------------------------------------

Aggregate (cost=36.75..36.76 rows=1 width=0) (actual time=0.762..0.763 rows=1 loops=1) -> Seq Scan on datos (cost=0.00..31.40 rows=2140 width=0) (actual time=0.02 8..0.468 rows=1000 loops=1) Total runtime: 0.846 ms (3 filas)

psql -d testdb -U postgres -f truncate_and_insert_10000_rows.sql > NUL

psql -d testdb -U postgres -f count_data.sql

--------------------------------------------------------------------------------

Aggregate (cost=197.84..197.85 rows=1 width=0) (actual time=6.191..6.191 rows= 1 loops=1) -> Seq Scan on datos (cost=0.00..173.07 rows=9907 width=0) (actual time=0.0 09..3.407 rows=10000 loops=1) Total runtime: 6.271 ms (3 filas)

psql -d testdb -U postgres -f truncate_and_insert_100000_rows.sql > NUL

psql -d testdb -U postgres -f count_data.sql

--------------------------------------------------------------------------------

Aggregate (cost=2051.60..2051.61 rows=1 width=0) (actual time=74.075..74.076 r ows=1 loops=1) -> Seq Scan on datos (cost=0.00..1788.48 rows=105248 width=0) (actual time= 0.032..46.024 rows=100000 loops=1) Total runtime: 74.164 ms (3 filas)

psql -d prueba -U postgres -f truncate_and_insert_1000000_rows.sql > NUL

psql -d testdb -U postgres -f count_data.sql

--------------------------------------------------------------------------------

Aggregate (cost=19720.00..19720.01 rows=1 width=0) (actual time=637.486..637.4 87 rows=1 loops=1) -> Seq Scan on datos (cost=0.00..17246.60 rows=989360 width=0) (actual time =0.028..358.831 rows=1000000 loops=1) Total runtime: 637.582 ms (3 filas)

the definition of data is

CREATE TABLE data
(
  id INTEGER NOT NULL,
  text VARCHAR(100),
  CONSTRAINT pk3 PRIMARY KEY (id)
);
A: 

How a count on the primary key field where it is NOT NULL, limiting the query at 1 response?

Since a primary key must exist, if there is one, you have data, yes?

warren
+1  A: 

If all you care about is 1 row or no rows. Limit your query to the first row - why count all of the rows just to find out if there's 1 or more, or zero...

use the equivalent of ROWNUM = 1 or TOP 1 or whatever postgres gives you.

PostgreSQL uses LIMIT 1
CesarB
Thanks, I hate it when people offer solutions for other platforms but I knew the concept surely had to be pretty similar.
+1  A: 

You may find this useful.

Milen A. Radev
That link is broken. Now the content is located at http://wiki.postgresql.org/wiki/Slow_Counting
Alex. S.
Thank you, fixed.
Milen A. Radev
+1  A: 

Try this:

SELECT t.primary_key IS NOT NULL FROM table t LIMIT 1;

You will get TRUE if there are records and NULL if there are none.

Endlessdeath
-1 Not the most optimal solution because you're examining a column from the table, which in the event of your examined column not having a NOT NULL constraint, may be inaccurate. It would be much better to `SELECT 1 FROM table LIMIT 1` instead.
Kenaniah
+4  A: 

select true from table limit 1;

Tometzky
+1  A: 
select exists(select * from your_table_here) as has_row
Michael Buen
Exists returns true if the inner query returns any rows. It doesn't actually examine the content of the rows, so the `*` doesn't imply field retrieval.
Donnie
Donnie: indeed. in fact, even if you put 1/0 inside SELECT FROM your_table_here, it will not divide-by-zero error
Michael Buen