views: 75
answers: 2
We have a table in which we store millions of small BLOBs (2-10 KB each). We need to access a portion of this table at a time, iterating through it and retrieving all of its BLOBs (such a "portion" is usually on the order of 10^5 rows).

We use Oracle 11g R1.

Right now I am thinking about tuning OracleConnection.setDefaultRowPrefetch() and giving the connection a hint that it is read-only.
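
A rough sketch of what I mean is below. The connection string, credentials, table/column names and the WHERE predicate are placeholders for our real schema, and the prefetch values are guesses that we would benchmark:

import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BlobBatchReader {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "pass")) {
            conn.setReadOnly(true); // standard JDBC read-only hint

            oracle.jdbc.OracleConnection oconn =
                    conn.unwrap(oracle.jdbc.OracleConnection.class);
            // Fetch many rows per round trip instead of the default 10;
            // the right value needs benchmarking.
            oconn.setDefaultRowPrefetch(500);
            // 11g drivers can also send small LOB contents along with the
            // locator, saving a round trip per BLOB (verify driver support).
            oconn.setDefaultLobPrefetchSize(10240);

            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT stuff FROM my_table WHERE batch_id = ?")) {
                ps.setLong(1, 42L); // placeholder "portion" identifier
                byte[] buf = new byte[8192];
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // Stream each BLOB instead of materializing it.
                        try (InputStream in = rs.getBinaryStream(1)) {
                            int n;
                            while ((n = in.read(buf)) != -1) {
                                // ...append buf[0..n) to the output file...
                            }
                        }
                    }
                }
            }
        }
    }
}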

What other tuning is possible?

Also, does anybody have experience handling small BLOBs with Oracle 11g R2? I have been told that it is better optimized than R1 for handling small BLOBs, but I was wondering whether the upgrade is worth trying.

Any advice is appreciated!

EDIT: All rows in the "portion" will be used -- they will be processed into a special big binary file that will be consumed by another entity. So the iteration itself cannot be avoided.

EDIT: Our current DDL (partial, modified):

TABLESPACE "BLOB_STUFF"
LOB ("STUFF") STORE AS BASICFILE "BLOBS_FILE" (
    TABLESPACE "BLOBS"
    ENABLE STORAGE IN ROW
    CHUNK 8192
    NOCACHE LOGGING
    STORAGE (INITIAL 65536 NEXT 1048576 MINEXTENTS 1
             MAXEXTENTS 2147483645 PCTINCREASE 0
             FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
);
+1  A: 

we need to access a portion of this table at a time, iterating through it while retrieving all these BLOBs

There is a common anti-pattern among OO programmers: writing the application so that it grabs a large swathe of rows from the database and then iterates through them in the middle tier to winnow the required rows from the chaff. A more efficient approach is to write a focused query that retrieves just the precise rows needed, as sketched below.
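
For example, instead of pulling everything back and filtering in the application, push the filter into the query. A sketch -- "my_table", "stuff" and "batch_id" are hypothetical names standing in for however the "portion" is identified:

SELECT stuff
  FROM my_table
 WHERE batch_id = :current_batch;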

If this is not what you're doing, then you should edit your question to clarify your processing. If it is what you're doing, you should explain why you need to iterate through one hundred thousand records at a time.

APC
Thanks for the comment! I edited the question per your advice.
Enno Shioji
+1  A: 

Something to consider at the Oracle level:

Ensure the LOB column is created (or altered) with the CACHE and ENABLE STORAGE IN ROW clauses -- otherwise every read of every row will require two direct path reads, which will be slooooow. With in-row storage enabled, smaller LOBs (up to roughly 4 KB) are stored inline with the row data, and only larger LOBs are stored out-of-line.
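
A sketch of the corresponding ALTERs -- "my_table" and "stuff" are placeholder names, and any such change should be tested first, since cached LOBs compete for buffer cache space:

-- CACHE can be switched on in place:
ALTER TABLE my_table MODIFY LOB (stuff) (CACHE);

-- ENABLE STORAGE IN ROW cannot be modified in place; changing it means
-- rebuilding the LOB segment:
ALTER TABLE my_table MOVE LOB (stuff) STORE AS (ENABLE STORAGE IN ROW CACHE);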

Adam Musch
Thanks, Adam. I posted our current DDL. I guess we are already storing in-row...
Enno Shioji
Both are important: with NOCACHE, direct path reads require a synchronous I/O operation for every row in the data set.
Adam Musch
Thanks for your follow up. Unfortunately something else came up and I can't test your suggestion right now, but I'll try as soon as I have my hands free!
Enno Shioji