views:

42

answers:

1

In Oracle, the full text search syntax of Contains operator is:

 CONTAINS(
          [schema.]column,
          text_query    VARCHAR2
          [,label       NUMBER]) RETURN NUMBER;

which means the text_query can not be more than 4000 characters long or an error will occur. I repeatedly have text_query longer than 4000 characters long in many cases. How would you, as an Oracle expert, suggest to get around such limitation if possible?

To further clarify the situation in which 4000 is easily reached is that if you combine many Contains Query Operators to construct your text_query, it is quite possible to exceed such 4000 characters limitation.

+3  A: 

The 4000 character limit is not some arbitrary boundary: it is the maximum amount of VARCHAR2 characters that Oracle SQL can handle.

4000 characters is a lot of text. In English it's around 600 words, or an A4 page and a bit in a reasonable point font. There are not many applications I can think of which require searching for such large chunks of verbiage. Even colleges checking students' essays for plagiarism would operate at no more than the paragraph level.

However, if you really have a situation in which matching on a scant 4000 characters generates false positives all you can do is split the query string into chunks and search on them. This means you have to use PL/SQL:

create or replace function big_search (p_search_text in varchar2) 
    return sys_refcursor
is
    return_value sys_refcursor;
    p_srch1 varchar2(4000);
    p_srch2 varchar2(4000);
begin

    dbms_output.put_line('search_length='||to_char(length(p_search_text)));

    p_srch1 := substr(p_search_text, 1, 4000);
    p_srch2 := substr(p_search_text, 4001, 4000);


    open return_value for 
        select docname
                , (score(1) + score(2))/2 as score
        from t23
        where contains ( text_column, p_srch1 , 1) != 0
        and  contains ( text_column, p_srch2 , 2) != 0;

    return return_value;
end;
/

If you don't know the size of the search text beforehand, then you'll need to use dynamic SQL to assemble this. Note that passing null search terms to CONTAINS() will hurl DRG-50901: text query parser syntax error.

APC
+1 I was thinking the same, but don't have the knowledge of text search requird to answer.
Tony Andrews
The situation may require dynamic SQL, could anyone give more pointers?
hko19
When the search text above is dynamically constructed from various query operator and exceed 4000 in total length, the presented logic of forcefully breaking up the query text into 4000 a piece and 'and' them together lead to wrong result or syntax error.
hko19