ansaurus

Question

What's wrong with this FIRST_VALUE query?

Answer 1

+4 A:

Default RANGE / ROWS for FIRST_VALUE (as for any other analytical function) is BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

If you add IGNORE NULLS, then NULL values are not taken into account when building the range.

The RANGE becomes BETWEEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCEPT FOR THE NULL ROWS (it's not a valid OVER clause).

Since your txt's that are NULL have high id's, they are selected first, and their ranges are empty, since there are no non-NULL rows between them and UNBOUNDED PRECEDING

You should change either ORDER BY or RANGE clause of your query.

Changing ORDER BY puts the rows with NULL id's to the end of the window so that a non-NULL value (if any) will be always selected first, and the RANGE will guaranteedly start from that value:

with t
as (
  select 450 id, null txt , 3488 id_usr from dual union all
  select 449   , null     , 3488        from dual union all
  select  79   , 'A'      , 3488        from dual union all
  select  78   , 'X'      , 3488        from dual 
)
select id
     , txt
     , id_usr
     , first_value(txt) over (partition by id_usr order by NVL2(TXT, NULL, id) DESC) first_one
  from t

Changing RANGE redefines range to include all non-NULL rows in the partition:

with t
as (
  select 450 id, null txt , 3488 id_usr from dual union all
  select 449   , null     , 3488        from dual union all
  select  79   , 'A'      , 3488        from dual union all
  select  78   , 'X'      , 3488        from dual 
)
select id
     , txt
     , id_usr
     , first_value(txt IGNORE NULLS) over (partition by id_usr order by id DESC RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) first_one
  from t

Quassnoi 2009-08-04 17:48:39

Do you mean that the "ignore nulls" clause is not for determinig the "first_value" but it's for the windowing?

FerranB 2009-08-04 17:58:19

@FerranB: not exactly. It selects the first non-`NULL` value from the range sorted by `id`. If you replace `450` with `1` in your example, it will select 'A' for it.

Quassnoi 2009-08-04 18:11:38

I'll better reword it, it's confusing now.

Quassnoi 2009-08-04 18:12:25

I think the point about RANGE is the more fundamental issue.

Dave Costa 2009-08-05 12:45:20

@Dave: they both matter. It would select `NULL` as the `FIRST_VALUE` if not for the `IGNORE NULLS`.

Quassnoi 2009-08-05 12:52:16

"When you specify IGNORE NULLS, the function, well, ignores NULLs." Yes, but only to determine the FIRST_VALUE not the windowing. This have no sense for this question. The problem is that, as you say, the windowing started on current row not on all the partition rows.

FerranB 2009-08-06 15:52:56

`@FerranB`: with `IGNORE NULLS` set and `RANGE / ROWS` omitted, the window starts on the first non-`NULL` row and ends on the current row.

Quassnoi 2009-08-06 16:03:45

ansaurus

tags:

views:

answers:

What's wrong with this FIRST_VALUE query?

related questions