tags:

views:

481

answers:

5

How could this SQL...

CREATE TABLE NewTable AS
    SELECT A,B,C FROM Table1
    minus
    SELECT A, B, C From Table2

...create a new table with NULL values in column A when neither Table1 or Table2 had NULL values for in column A?

But on the other hand, this SQL...

SELECT * FROM
(
   SELECT A,B,C FROM Table1
    minus
    SELECT A, B, C From Table2
) 
WHERE A IS NULL 

return no rows!

It seems inconsistent!

I think it is a bug in Oracle.

Of course the real SQL is much more complex but I believe this accurately illustrates the nature of the problem.

UPDATE

Here's the ACTUAL SQL:

I executed this statement:

CREATE TABLE MyMinus
AS
select 
*
FROM
---begin main query 
(
SELECT expenditure_item_date, expenditure_org, expenditure_type,
       f_amount_billed, f_amount_billed_fc, f_amount_billed_us,
       f_bl_creation_date, f_catalog_source, f_catalog_type, f_company,
       f_company_code, f_cost_center_num, f_cuic, f_currency_code,
       f_destination_type_code, f_distribution_id, f_distribution_num,
       f_exchange_rate, f_extract_date, f_gl_account,
       f_isms_jamis_project_num, f_line_id, f_local_use, f_location_num,
       f_need_by_date, f_org_id, f_po_line_num, f_po_num, f_po_release_num,
       f_project, f_project_num, f_promised_date, f_quantity_billed,
       f_quantity_cancelled, f_quantity_delivered, f_quantity_ordered,
       f_rel_approved_flag, f_rel_cancelled_flag, f_rel_cancel_date,
       f_rel_closed_code, f_rel_hold_flag, f_rel_revision_num, f_task_num
  FROM dw_mgr.po_distributions_curr_fct a
 WHERE EXISTS (
          SELECT 1
            FROM dw_mgr.po_distributions_curr_fct b,
                 dw_mgr.po_lines_curr_fct,
                 dw_mgr.po_header_curr_fct
           WHERE a.ROWID = b.ROWID
             AND b.f_cuic = dw_mgr.po_lines_curr_fct.f_cuic
             AND b.f_line_id = dw_mgr.po_lines_curr_fct.f_line_id
             AND dw_mgr.po_lines_curr_fct.f_cuic =
                                              dw_mgr.po_header_curr_fct.f_cuic
             AND dw_mgr.po_lines_curr_fct.f_header_id =
                                         dw_mgr.po_header_curr_fct.f_header_id
             AND dw_mgr.po_header_curr_fct.f_header_creation_date <
                                      ADD_MONTHS (TRUNC (SYSDATE, 'YEAR'),
                                                  -48)
             AND dw_mgr.po_header_curr_fct.f_po_status IN
                                                 ('CLOSED', 'FINALLY CLOSED'))

MINUS 

SELECT expenditure_item_date, expenditure_org, expenditure_type,
       f_amount_billed, f_amount_billed_fc, f_amount_billed_us,
       f_bl_creation_date, f_catalog_source, f_catalog_type, f_company,
       f_company_code, f_cost_center_num, f_cuic, f_currency_code,
       f_destination_type_code, f_distribution_id, f_distribution_num,
       f_exchange_rate, f_extract_date, f_gl_account,
       f_isms_jamis_project_num, f_line_id, f_local_use, f_location_num,
       f_need_by_date, f_org_id, f_po_line_num, f_po_num, f_po_release_num,
       f_project, f_project_num, f_promised_date, f_quantity_billed,
       f_quantity_cancelled, f_quantity_delivered, f_quantity_ordered,
       f_rel_approved_flag, f_rel_cancelled_flag, f_rel_cancel_date,
       f_rel_closed_code, f_rel_hold_flag, f_rel_revision_num, f_task_num
  FROM arch_fct.po_distributions_curr_fct a
 WHERE EXISTS (
          SELECT 1
            FROM arch_fct.po_distributions_curr_fct b,
                 arch_fct.po_lines_curr_fct,
                 arch_fct.po_header_curr_fct
           WHERE a.ROWID = b.ROWID
             AND b.f_cuic = arch_fct.po_lines_curr_fct.f_cuic
             AND b.f_line_id = arch_fct.po_lines_curr_fct.f_line_id
             AND arch_fct.po_lines_curr_fct.f_cuic =
                                            arch_fct.po_header_curr_fct.f_cuic
             AND arch_fct.po_lines_curr_fct.f_header_id =
                                       arch_fct.po_header_curr_fct.f_header_id
             AND arch_fct.po_header_curr_fct.f_header_creation_date <
                                      ADD_MONTHS (TRUNC (SYSDATE, 'YEAR'),
                                                  -48)
             AND arch_fct.po_header_curr_fct.f_po_status IN
                                                 ('CLOSED', 'FINALLY CLOSED'))

) 

And then this. Note that rows with NULL values of F_DISTRIBUTION_ID were inserted into the created table.

SELECT COUNT(*) from MyMinus WHERE F_DISTRIBUTION_ID IS NULL

--17 rows

Yet when I execute this:

select 
*
FROM
---begin main query 
(
SELECT expenditure_item_date, expenditure_org, expenditure_type,
       f_amount_billed, f_amount_billed_fc, f_amount_billed_us,
       f_bl_creation_date, f_catalog_source, f_catalog_type, f_company,
       f_company_code, f_cost_center_num, f_cuic, f_currency_code,
       f_destination_type_code, f_distribution_id, f_distribution_num,
       f_exchange_rate, f_extract_date, f_gl_account,
       f_isms_jamis_project_num, f_line_id, f_local_use, f_location_num,
       f_need_by_date, f_org_id, f_po_line_num, f_po_num, f_po_release_num,
       f_project, f_project_num, f_promised_date, f_quantity_billed,
       f_quantity_cancelled, f_quantity_delivered, f_quantity_ordered,
       f_rel_approved_flag, f_rel_cancelled_flag, f_rel_cancel_date,
       f_rel_closed_code, f_rel_hold_flag, f_rel_revision_num, f_task_num
  FROM dw_mgr.po_distributions_curr_fct a
 WHERE EXISTS (
          SELECT 1
            FROM dw_mgr.po_distributions_curr_fct b,
                 dw_mgr.po_lines_curr_fct,
                 dw_mgr.po_header_curr_fct
           WHERE a.ROWID = b.ROWID
             AND b.f_cuic = dw_mgr.po_lines_curr_fct.f_cuic
             AND b.f_line_id = dw_mgr.po_lines_curr_fct.f_line_id
             AND dw_mgr.po_lines_curr_fct.f_cuic =
                                              dw_mgr.po_header_curr_fct.f_cuic
             AND dw_mgr.po_lines_curr_fct.f_header_id =
                                         dw_mgr.po_header_curr_fct.f_header_id
             AND dw_mgr.po_header_curr_fct.f_header_creation_date <
                                      ADD_MONTHS (TRUNC (SYSDATE, 'YEAR'),
                                                  -48)
             AND dw_mgr.po_header_curr_fct.f_po_status IN
                                                 ('CLOSED', 'FINALLY CLOSED'))

MINUS 

SELECT expenditure_item_date, expenditure_org, expenditure_type,
       f_amount_billed, f_amount_billed_fc, f_amount_billed_us,
       f_bl_creation_date, f_catalog_source, f_catalog_type, f_company,
       f_company_code, f_cost_center_num, f_cuic, f_currency_code,
       f_destination_type_code, f_distribution_id, f_distribution_num,
       f_exchange_rate, f_extract_date, f_gl_account,
       f_isms_jamis_project_num, f_line_id, f_local_use, f_location_num,
       f_need_by_date, f_org_id, f_po_line_num, f_po_num, f_po_release_num,
       f_project, f_project_num, f_promised_date, f_quantity_billed,
       f_quantity_cancelled, f_quantity_delivered, f_quantity_ordered,
       f_rel_approved_flag, f_rel_cancelled_flag, f_rel_cancel_date,
       f_rel_closed_code, f_rel_hold_flag, f_rel_revision_num, f_task_num
  FROM arch_fct.po_distributions_curr_fct a
 WHERE EXISTS (
          SELECT 1
            FROM arch_fct.po_distributions_curr_fct b,
                 arch_fct.po_lines_curr_fct,
                 arch_fct.po_header_curr_fct
           WHERE a.ROWID = b.ROWID
             AND b.f_cuic = arch_fct.po_lines_curr_fct.f_cuic
             AND b.f_line_id = arch_fct.po_lines_curr_fct.f_line_id
             AND arch_fct.po_lines_curr_fct.f_cuic =
                                            arch_fct.po_header_curr_fct.f_cuic
             AND arch_fct.po_lines_curr_fct.f_header_id =
                                       arch_fct.po_header_curr_fct.f_header_id
             AND arch_fct.po_header_curr_fct.f_header_creation_date <
                                      ADD_MONTHS (TRUNC (SYSDATE, 'YEAR'),
                                                  -48)
             AND arch_fct.po_header_curr_fct.f_po_status IN
                                                 ('CLOSED', 'FINALLY CLOSED'))

) 
WHERE

f_distribution_id is null

I get 0 rows.

Why does insert the records into a temp table appear to introduce rows with NULL DIST IDs?

This minus query SQL, which was generated dynamically by a custom data archival program, attempts to verify that the data which SHOULD be archived in the DW_MGR schema was in fact copied to the ARCH_FCT (archive) schema. It is returning differences which included 17 records where the F_DISTRIBUTION_ID in the MyMinus temp table do not match those in the source DW_MG.PO_DISTRIBUTIONS_CURR_FCT table because they are are NULL. Hence, the archive process is design when differences are found. The question is why are there differences, i.e., how did NULL values get into the MyMinus table when they are not in the SOURCE PO_DISTRIBUTIONS_CURR_FCT table?

EDIT:

Can someone with Oracle META access please post info on thd following Oracle bugs. I was referred to them but I contract located someone in my co who can tell me what our support ID # is. I will find out eventually, but it would be nice to know sooner. If you would rather not post it, consider the following bug references as potentially related info on my question:

Bug 8209309: MINUS IS SHOWING DIFFERENCES WITH CTAS + INSERT 
Bug 7834950: WRONG RESULTS WITH MINUS OPERATOR
+1  A: 

It generally shouldn't.

The only time it might is if you've some advanced security features (fine grained access control) whereby the optimizer can see that A cannot be null in table1/table2 so returns zero rows, but the FGAC kicks in to stop you seeing the actual values in the column by returning null.


EDIT. "With [Virtual Private Database] column-masking behavior, all rows display, even those that reference sensitive columns. However, the sensitive columns display as NULL values. "

http://download.oracle.com/docs/cd/E11882_01/network.112/e10574/vpd.htm#i1014682

Gary
I don't follow you. Please see my update.
Velika
Ah, I see. No, that's not the case here. We do not have such security set up on this table.
Velika
+1  A: 

Firstly, I'd get rid of the ROWID to ROWID join. Then I'd get make the table aiases unique (not reusing 'a' and 'b' in the query above the MINUS and the query below the MINUS).

Finally, I'd look at those 17 rows and try to find the matching records in "dw_mgr.po_distributions_curr_fct" and see, using DUMP(F_DISTRIBUTION_ID) where there is anything odd about the column values.

Gary
Using the DUMP function on the f_distribution_id column displayed a value of "NULL."Using unique aliases did not change the outcome
Velika
If DUMP(f_distribution_id) for those rows in dw_mgr.po_distributions_curr_fct is reporting a NULL then it is because those columns contain a NULL. If there is a constraint, it is probably either NOT ENABLED or NOT VALIDATED.
Gary
+1  A: 

The only way I can think that F_DISTRIBUTION_ID could be NULL when inserted into MyMinus would be if it's returning NULL somehow, someway in the first query.

To reproduce this (on both 9i and 10g):

SQL> INSERT INTO table1 VALUES (NULL, 2, 3);

1 row created.

SQL> INSERT INTO table2 VALUES (1, 2, 3);

1 row created.

SQL> SELECT * 
  2  FROM (
  3    SELECT a, b, c FROM table1
  4    MINUS
  5    SELECT a, b, c FROM table2);

         A          B          C
---------- ---------- ----------
                    2          3

However, with regards to the query returning no rows when run by itself...that's something else. A bug wouldn't surprise me...but have you tried taking out those EXISTS? Of course, there's many different approaches, but perhaps all those sub-queries are causing something funny to happen in memory.

For example:

SELECT expenditure_item_date, expenditure_org, expenditure_type,
       f_amount_billed, f_amount_billed_fc, f_amount_billed_us,
       f_bl_creation_date, f_catalog_source, f_catalog_type, f_company,
       f_company_code, f_cost_center_num, f_cuic, f_currency_code,
       f_destination_type_code, f_distribution_id, f_distribution_num,
       f_exchange_rate, f_extract_date, f_gl_account,
       f_isms_jamis_project_num, f_line_id, f_local_use, f_location_num,
       f_need_by_date, f_org_id, f_po_line_num, f_po_num, f_po_release_num,
       f_project, f_project_num, f_promised_date, f_quantity_billed,
       f_quantity_cancelled, f_quantity_delivered, f_quantity_ordered,
       f_rel_approved_flag, f_rel_cancelled_flag, f_rel_cancel_date,
       f_rel_closed_code, f_rel_hold_flag, f_rel_revision_num, f_task_num
  FROM dw_mgr.po_distributions_curr_fct a
       dw_mgr.po_lines_curr_fct,
       dw_mgr.po_header_curr_fct
  WHERE a.f_cuic = dw_mgr.po_lines_curr_fct.f_cuic
    AND a.f_line_id = dw_mgr.po_lines_curr_fct.f_line_id
    AND dw_mgr.po_lines_curr_fct.f_cuic = dw_mgr.po_header_curr_fct.f_cuic
    AND dw_mgr.po_lines_curr_fct.f_header_id = dw_mgr.po_header_curr_fct.f_header_id
    AND dw_mgr.po_header_curr_fct.f_header_creation_date < ADD_MONTHS (TRUNC (SYSDATE, 'YEAR'), -48)
    AND dw_mgr.po_header_curr_fct.f_po_status IN ('CLOSED', 'FINALLY CLOSED')
MINUS 
SELECT expenditure_item_date, expenditure_org, expenditure_type,
       f_amount_billed, f_amount_billed_fc, f_amount_billed_us,
       f_bl_creation_date, f_catalog_source, f_catalog_type, f_company,
       f_company_code, f_cost_center_num, f_cuic, f_currency_code,
       f_destination_type_code, f_distribution_id, f_distribution_num,
       f_exchange_rate, f_extract_date, f_gl_account,
       f_isms_jamis_project_num, f_line_id, f_local_use, f_location_num,
       f_need_by_date, f_org_id, f_po_line_num, f_po_num, f_po_release_num,
       f_project, f_project_num, f_promised_date, f_quantity_billed,
       f_quantity_cancelled, f_quantity_delivered, f_quantity_ordered,
       f_rel_approved_flag, f_rel_cancelled_flag, f_rel_cancel_date,
       f_rel_closed_code, f_rel_hold_flag, f_rel_revision_num, f_task_num
  FROM arch_fct.po_distributions_curr_fct a,
       arch_fct.po_lines_curr_fct,
       arch_fct.po_header_curr_fct
 WHERE a.f_cuic = arch_fct.po_lines_curr_fct.f_cuic
   AND a.f_line_id = arch_fct.po_lines_curr_fct.f_line_id
   AND arch_fct.po_lines_curr_fct.f_cuic = arch_fct.po_header_curr_fct.f_cuic
   AND arch_fct.po_lines_curr_fct.f_header_id = arch_fct.po_header_curr_fct.f_header_id
   AND arch_fct.po_header_curr_fct.f_header_creation_date < ADD_MONTHS (TRUNC (SYSDATE, 'YEAR'), -48)
   AND arch_fct.po_header_curr_fct.f_po_status IN ('CLOSED', 'FINALLY CLOSED')   
Mark B.
You are correct. The first query is returning NULLs. where I execute CREATE MyMinus1 AS {First Query}, I get nulls in the F_DIST_ID of the resulting table. Yet the source table has no NULLs and if I were to execute: SELECT * FROM ({First Query}) WHERE F_DISTRIBUTION_ID IS NULL, no rows are returned. It is almost as though the CREATE TABLE clause is responsible for introducing the NULLS instead of simply putting the results into a table. How can the first query return NULLS if we are only doing INNER JOINS and the source po_distributions_curr_fct table has no null values for F_DISTRIBUTION_ID?
Velika
BTW, I also agree that taking out those "EXISTS" clauses and rewriting it as you indicated seemed wise. I tried it but same result.
Velika
+1  A: 

Quit breaking your chops. It's an Oracle bug. I'll prove it to ya:

First of all, it has to be the first SQL that is returning NULLS for DISTRIBUTION ID, so isolate that SQL and let's call it "SQL1."

OK, Let's simplify SQL1 for discussion sake and say that it is of this format:

CREATE TABLE TempTable AS 
SELECT
   F_DISTRIBUTION_ID,
   FIELD2,
   FIELD3,...FIELD99

FROM WHATEVER 
WHERE WHATEVER

Then, you are finding that when you execute this, you are finding rows that have a NULL DIST ID:

SELECT COUNT(*) FROM TempTable WHERE F_DISTRIBUTION_ID IS NULL
--Some positive number of rows returned.

If Oracle wasn't a piece of crap, you could change the number of selected fields so that only F_DISTRIBUTION_ID was selected and you would get the same result when you counted the number of rows with a NULL value of F_DISTRIBUTION_ID, right? Right! But that ain't the case, 'cause Oracle is an unreliable dinosaur.

Try this:

CREATE TABLE TempTable AS 
SELECT
   F_DISTRIBUTION_ID
FROM WHATEVER 
WHERE WHATEVER

SELECT COUNT(*) FROM TempTable WHERE F_DISTRIBUTION_ID IS NULL

I betcha dollars to donuts that you get 0 rows returned.

Now, go call up Microsoft and tell them you want to upgrade to SQL Server 2008 R2.

Velika
Everything that you said checked out. You confirmed my original impression .
Velika
This is really strange - I can't reproduce this across any of the Oracle versions I have access to ...man, whatta bug you've got on your hands!
Mark B.
A: 

I'm having a possibly related issue which started out as an issue with MINUS when using a bunch of complex views and exists. I narrowed it down to a probable optimiser issue - you can work around it by stopping the optimiser from messing around with things based on "F_DISTRIBUTION_ID IS NULL" by using something like "upper(F_DISTRIBUTION_ID) IS NULL".

Its next to impossible to create a simplified test case for a bug report in these cases - it likely only occurs in very specific scenarios (its not like MINUS is going to be completely broken after all). With my issue I couldnt reduce the query at all and have it still occur.

FYI my issue was a query which basically joined a bunch of stuff onto a source table, call it Employee. I had a where clause on the primary key of Employee - if I did a where EmployeeId = foo, it would return an extra row with nulls where they shouldn't be (columns from a table which was inner joined on) - if I did a where upper(EmployeeId) = foo then I would get the correct result. Obviously the EmployeeId value was sourced from the same cell in all rows matching the predicate - so it was clearly a bug.

Mark