I am generating some test data using dbms_random. I encountered some strange behavior when using dbms_random in the condition of a JOIN that I cannot explain:

------------------------# test-data (ids 1 .. 3)
With x As (
  Select Rownum id From dual
  Connect By Rownum <= 3
)
------------------------# end of test-data
Select x.id,
       x2.id id2
  From x
  Join x x2 On ( x2.id = Floor(dbms_random.value(1, 4)) )


Floor(dbms_random.value(1, 4)) returns a random number out of (1,2,3), so I would have expected each row of x to be joined with a random row of x2, or perhaps always with the same random row of x2 if the random number is evaluated only once.

When I run the query several times, though, I get results like these:

(1)   ID  ID2        (2)   ID  ID2        (3)
    ---- ----            ---- ----            no rows selected.
       1    2               1    3
       1    3               2    3
       2    2               3    3
       2    3
       3    2
       3    3

What am I missing?

EDIT:

SELECT ROWNUM, FLOOR(dbms_random.VALUE (1, 4))
FROM dual CONNECT BY ROWNUM <= 3

would produce the result I want in this case, but why does the original query behave like that?

+1  A: 

To generate three rows with one predictable value and one random value, try this:

with x as (
  select rownum id from dual
  connect by rownum <= 3
)
, y as (
  select floor(dbms_random.value(1, 4)) floor_val
  from dual
)
select x.id,
       y.floor_val
from x
cross join y
/

        ID  FLOOR_VAL
---------- ----------
         1          2
         2          3
         3          2


edit

Why did your original query return an inconsistent set of rows?

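Well, without the random expression in the ON clause your query would basically be a CROSS JOIN of X against X2: it would return nine rows (at least it would if the syntax allowed a JOIN without an ON clause). The join condition is not evaluated once for the whole query. Each candidate row makes its own call to DBMS_RANDOM.VALUE(), and the row is included in the result set only when that random value happens to match the current value of X2.ID. Consequently the query can return anywhere from 0 to 9 rows, at random. Exactly how often the function is called depends on the execution plan: in your sample results every ID is paired with the same set of ID2 values, which suggests the condition was evaluated once per X2 row before the join rather than once per row pair.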
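The effect can be imitated outside the database. The following is only a rough Python simulation, not Oracle code: `random_id`, `join_per_pair` and `join_filter_first` are made-up names, and `rng.randint(1, 3)` stands in for Floor(dbms_random.value(1, 4)). The first function draws a fresh random number for every X/X2 row pair, as described above. The second assumes the condition is applied as a filter on X2 before the join, which would match the pattern in the question's sample results (every ID paired with the same ID2 values).

```python
import random

IDS = (1, 2, 3)  # the three test rows of x (and of x2)


def random_id(rng):
    # Stand-in for Floor(dbms_random.value(1, 4)): an integer from 1 to 3
    return rng.randint(1, 3)


def join_per_pair(rng):
    """Condition re-evaluated for each of the 3 x 3 candidate row pairs."""
    return [(a, b) for a in IDS for b in IDS if b == random_id(rng)]


def join_filter_first(rng):
    """Assumed alternative: condition evaluated once per x2 row,
    then the surviving x2 rows are joined to every row of x."""
    kept = [b for b in IDS if b == random_id(rng)]
    return [(a, b) for a in IDS for b in kept]


if __name__ == "__main__":
    rng = random.Random()
    for attempt in range(1, 4):
        pairs = join_per_pair(rng)
        filtered = join_filter_first(rng)
        print(f"run {attempt}: per pair -> {len(pairs)} rows, "
              f"filter first -> {len(filtered)} rows")
```

In the per-pair variant any row count from 0 to 9 can occur. In the filter-first variant the count is always 0, 3, 6 or 9, because each surviving X2 row is joined to all three rows of X.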

Your solution is obviously simpler - I didn't refactor enough :)

APC
Thanks, this could be done even more simply, see my updated question. I was wondering why the results of my query are so strange, though... Any idea?
Peter Lang