tags:

views:

88

answers:

5

Hi,

I have a very large table with over 1000 records and 200 columns. When I try to retreive records matching some criteria in the WHERE clause using SELECT statement it takes a lot of time. But most of the time I just want to select a single record that matches the criteria in the WHERE clause rather than all the records.

I guess there should be a way to select just a single record and exit which would minimize the retrieval time. I tried ROWNUM=1 in the WHERE clause but it didn't really work because I guess the engine still checks all the records even after finding the first record matching the WHERE criteria. Is there a way to optimize in case if I want to select just a few records?
Thanks in advance.

Edit:

I am using oracle 10g. The Query looks like,

Select * 
from Really_Big_table 
where column1 is NOT NULL 
and column2 is NOT NULL 
and rownum=1;

This seems to work slower than the version without rownum=1;

A: 

In SQL most of the optimization would come in the form on index on the table (where you would index the columns that appear in the WHERE and ORDER BY columns as a rough guide).

You did not specify what SQL database you are using, so I can't point to a good resource.

Here is an introduction to indexes on Oracle.

Here another tutorial.

As for queries - you should always specify the columns you are returning and not use a blanket *.

Oded
Thanks. I am using oracle 10g.
learner135
But It seems both the columns in that table were already indexed!
learner135
+1  A: 

rownum is what you want, but you need to perform your main query as a subquery.

For example, if your original query is:

  SELECT co1, col2
    FROM table
    WHERE condition

then you should try

  SELECT *
  FROM (
    SELECT col1, col2
      FROM table
      WHERE condition
  ) WHERE rownum <= 1

See http://www.oracle.com/technology/oramag/oracle/06-sep/o56asktom.html for details on how rownum works in Oracle.

Simon Nickerson
That query still takes a lot of time. Thanks for the link. I'll check that out.
learner135
there's no need to put the rownum predicate in an outer query here.
Jeffrey Kemp
A: 

An example is here: You can view more here

SELECT ename, sal 
FROM ( SELECT ename, sal, RANK() OVER (ORDER BY sal DESC) sal_rank
              FROM emp ) 
WHERE sal_rank <= 1;

You also have to do some column indexing for column in the WHERE clause

vodkhang
+1  A: 

1,000 records isn't a lot of data in a table. 200 columns is a reasonably wide table. For this reason, I'd suggest you aren't dealing with a really big table - I've performed queries against millions of rows with no problems.

Here is a little experiment... how long does it take to run this compared to the "SELECT *" query?

SELECT
    Really_Big_table.Id
FROM
    Really_Big_table
WHERE 
    column1 IS NOT NULL
AND
    column2 IS NOT NULL
AND
    rownum=1;
Sohnee
It still takes a lot of time :-(. Retreival was a bit faster without rownum=1. I can't fathom why!.
learner135
A: 

Hi learner135,

it shouldn't take a lot of time to query a 1000 rows table. There are exceptions however, check if you are in one of the following cases:

1. Lots of rows were deleted

The table had a massive amount of rows in the past. Since the High Water Mark (HWM) is still high (delete won't lower it) and FULL TABLE SCAN read all the data up to the high water mark, it may take a lot of time to return results even if the table is now nearly empty.

Analyse your table (dbms_stats.gather_table_stats('<owner>','<table>')) and compare the space actually used by the rows (space on disk) with the effective space (data), for example:

SELECT t.avg_row_len * t.num_rows data_bytes, 
       (t.blocks - t.empty_blocks) * ts.block_size bytes_used
  FROM user_tables t
  JOIN user_tablespaces ts ON t.tablespace_name = ts.tablespace_name
 WHERE t.table_name = '<your_table>';

You will have to take into account the overhead of the rows and blocks as well as the space reserved for update (PCT_FREE). If you see that you use a lot more space than required (typical overhead is below 30%, YMMV) you may want to reset the HWM, either:

  • ALTER TABLE <your_table> MOVE; and then rebuild INDEX (ALTER INDEX <index> REBUILD), don't forget to collect stats afterwards.
  • use DBMS_REDEFINITION

2. The table has very large columns

Check if you have columns of datatype LOB, CLOB, LONG (irk), etc. Data over 4000 bytes in any of these columns is stored out of line (in a separate segment), which means that if you don't select these columns you will only query the other smaller columns.

If you are in this case, don't use SELECT *. Either you don't need the data in the large columns or use SELECT rowid and then do a second query : SELECT * WHERE rowid = <rowid>.

Vincent Malgrat
I guess this could be a reason.Analyzing the table takes forever and haven't finished yet(It has been running for over an hour). The database is on a remote server and I am connecting to it via sql developer.
learner135
@learner135: if analysing the table takes too long, use the `ESTIMATE_PERCENT` parameter: `dbms_stats.gather_table_stats('<owner>','<table>', estimate_percent=>1)`
Vincent Malgrat
@Vincent: Nope still no joy. Still running for over an hour. :(
learner135