views:

683

answers:

2

this query gets the dominating sets in a network. so for example given a network

A<----->B
B<----->C
B<----->D
C<----->E
D<----->C
D<----->E
F<----->E

it returns
B,E
B,F
A,E
but it doesn't work for large data because i'm using string methods in my result. i have been trying to remove the string methods and return a view or something but to no avail

With t as (select 'A' as per1, 'B' as per2 from dual union all
         select 'B','C' from dual union all
         select 'B','D' from dual union all
         select 'C','B' from dual union all
         select 'C','E' from dual union all
         select 'D','C' from dual union all
         select 'D','E' from dual union all
         select 'E','C' from dual union all
         select 'E','D' from dual union all
         select 'F','E' from dual)
 ,t2 as (select distinct least(per1, per2) as per1, greatest(per1, per2) as per2 from t union
       select distinct greatest(per1, per2) as per1, least(per1, per2) as per1 from t)
 ,t3 as (select per1, per2, row_number() over (partition by per1 order by per2) as rn from t2)
 ,people as (select per, row_number() over (order by per) rn
             from (select distinct per1 as per from t union
                   select distinct per2 from t)
            )
  ,comb   as (select sys_connect_by_path(per,',')||',' as p
              from   people
              connect by rn > prior rn
             )
  ,find   as (select p, per2, count(*) over (partition by p) as cnt
             from (
                   select distinct comb.p, t3.per2
                   from   comb, t3
                   where  instr(comb.p, ','||t3.per1||',') > 0 or instr(comb.p, ','||t3.per2||',') > 0
                  )
            )
 ,rnk as (select p, rank() over (order by length(p)) as rnk
          from find
          where cnt = (select count(*) from people)
          order by rnk
         )  select distinct trim(',' from p) as p from rnk  where rnk.rnk = 1`
A: 

In my experience you do not want to do complex string handling in large, complicated queries, and this query is quite complicated. I would guess that this problem very well could benefit from a rethink and a different approach rather than an optimization of the existing query.

What does the underlying tables look like and what exactly are you trying to achieve? Is it possible to alter the data model?

Chris
The data A<--->B, B<--->C ....,shown above are the underlying data. they represent a form of user and friend relation.I'm trying to find the minimal dominating set in this given networkThe minimal dominating set, in a social network, is a set of people who collectively are friends with every person in the networkMore info on dominating set here : http://en.wikipedia.org/wiki/Dominating_set
core_pro
+1  A: 

One of Oracle's limits is that SQL cannot handle VARCHAR2 bigger than 4000 characters. If you attempt to return a string exceeding this size it hurls ORA-01489. Ideally you should try to break down the resultset into multiple small rows. Alternatively you could return it as a CLOB.

edit

how i can return the above as a CLOB

Hmm...

Having looked closely at your code I thibk the only place which is going to hurl ORA-1489 is this line:

select sys_connect_by_path(per,',')||',' as p
from   people

It would be easy to wrap that call in TO_CLOB(). Unfortunately turning P into a CLOB breaks some of the subsequent processing ('distinct p,partition by p`) so it probably isn't an option. Sorry.

As for other workarounds....

Does your site have a license for Oracle Spatial? I know not many sites do, but if yours is one of the lucky one (and you're using 10gR2 or higher) then you should check out Oracle Spatial Network Data Model (PDF).

Otherwise, if there is no way to restrict the output of the sys_connect_by_path() call, you might just have to implement this in PL/SQL. You can use a PIPELINED FUNCTION to return the final output so you can still call it from a SELECT statement.

APC
CLOB?, how i can return the above as a CLOB. please explain more
core_pro