views:

977

answers:

9

I have a SQL-Server 2008 database and a schema which uses foreign key constraints to enforce referential integrity. Works as intended. Now the user creates views on the original tables to work on subsets of the data only. My problem is that filtering certain datasets in some tables but not in others will violate the foreign key constraints.
Imagine two tables "one" and "two". "one" contains just an id column with values 1,2,3. "Two" references "one". Now you create views on both tables. The view for table "two" doesn't filter anything while the view for table "one" removes all rows but the first. You'll end up with entries in the second view that point nowhere.

Is there any way to avoid this? Can you have foreign key constraints between views?

Some Clarification in response to some of the comments:
I'm aware that the underlying constraints will ensure integrity of the data even when inserting through the views. My problem lies with the statements consuming the views. Those statements have been written with the original tables in mind and assume certain joins cannot fail. This assumption is always valid when working with the tables - but views potentially break it.
Joining/checking all constraints when creating the views in the first place is annyoing because of the large number of referencing tables. Thus I was hoping to avoid that.

+1  A: 

Something like this in View2 is probably your best bet:

CREATE VIEW View2
AS
     SELECT
          T2.col1,
          T2.col2,
          ...
     FROM
          Table2 T2
     INNER JOIN Table1 T1 ON
          T1.pk = T2.t1_fk
Tom H.
yeah, I'm aware of this solution but was hoping it could be avoided. Thanks.
BuschnicK
If T-SQL hypothetically had such a feature for filtering one view based on another, I think Tom's answer would be the closest to what SQL server would be doing to produce the results.
AaronLS
+2  A: 

If you try to insert, update or delete data through a view, the underlying table constraints still apply.

Damir Sudarevic
+1 for good answer. You cannot alter or modify all columns visible in view. Only those columns which do not affect any constraint, whether it is check constraint or any other u can modify that column
Shantanu Gupta
A: 

You could stage the filtered table 1 data to another table. The contents of this staging table are your view 1, and then you build view 2 via a join of the staging table and table 2. This way the proccessing for filtering table 1 is done once and reused for both views.

Really what it boils down to is that view 2 has no idea what kind of filtering you performed in view 1, unless you tell view 2 the filtering criteria, or make it somehow dependent on the results of view 1, which means emulating the same filtering that occurs on view1.

Constraints don't perform any kind of filtering, they only prevent invalid data, or cascade key changes and deletes.

AaronLS
A: 

Another approach, depending on your requirements, would be to use a stored procedure to return two recordsets. You pass it filtering criteria and it uses the filtering criteria to query table 1, and then those results can be used to filter the query to table 2 so that it's results are also consistent. Then you return both results.

AaronLS
A: 

No, you can't create foreign keys on views.

Even if you could, where would that leave you? You would still have to declare the FK after creating the view. Who would declare the FK, you or the user? If the user is sophisticated enough to declare a FK, why couldn't he add an inner join to the referenced view? eg:

create view1 as select a, b, c, d from table1 where a in (1, 2, 3)
go 
create view2 as select a, m, n, o from table2 where a in (select a from view1)
go 

vs:

create view1 as select a, b, c, d from table1 where a in (1, 2, 3)
go 
create view2 as select a, m, n, o from table2
--# pseudo-syntax for fk:
alter view2 add foreign key (a) references view1 (a)
go 

I don't see how the foreign key would simplify your job.

Alternatively:

Copy the subset of data into another schema or database. Same tables, same keys, less data, faster analysis, less contention.

If you need a subset of all the tables, use another database. If you only need a subset of some tables, use a schema in the same database. That way your new tables can still reference the non-copied tables.

Then use the existing views to copy the data over. Any FK violations will raise an error and identify which views require editing. Create a job and schedule it daily, if necessary.

Peter
+3  A: 

Peter already hit on this, but the best solution is to:

  1. Create the "main" logic (that filtering the referenced table) once.
  2. Have all views on related tables join to the view created for (1), not the original table.

I.e.,

CREATE VIEW v1 AS SELECT * FROM table1 WHERE blah

CREATE VIEW v2 AS SELECT * FROM table2 WHERE EXISTS
  (SELECT NULL FROM v1 WHERE v1.id = table2.FKtoTable1)

Sure, syntactic sugar for propagating filters for views on one table to views on subordinate tables would be handy, but alas, it's not part of the SQL standard. That said, this solution is still good enough -- efficient, straightforward, maintainable, and guarantees the desired state for the consuming code.

richardtallent
Yes, I'd do it like this if I needed a feature like this.
Roland Bouman
+3  A: 

It took me some time to figure out the misunderstaning here -- not sure if I still understand completely, but here it is. I will use an example, close to yours, but with some data -- easier for me to think in these terms.

So first two tables; A = Department B = Employee

CREATE TABLE Department
  ( 
   DepartmentID int PRIMARY KEY
  ,DepartmentName varchar(20)
  ,DepartmentColor varchar(10)
  )
GO 
CREATE TABLE Employee
  ( 
   EmployeeID int PRIMARY KEY
  ,EmployeeName varchar(20)
  ,DepartmentID int FOREIGN KEY REFERENCES Department ( DepartmentID )
  )
GO 

Now I'll toss some data in

INSERT  INTO Department
  ( DepartmentID, DepartmentName, DepartmentColor )
 SELECT 1, 'Accounting', 'RED' UNION
 SELECT 2, 'Engineering', 'BLUE' UNION
 SELECT 3, 'Sales', 'YELLOW'  UNION
 SELECT 4, 'Marketing', 'GREEN' ;

INSERT  INTO Employee
  ( EmployeeID, EmployeeName, DepartmentID )
 SELECT 1, 'Lyne', 1 UNION
 SELECT 2, 'Damir', 2 UNION
 SELECT 3, 'Sandy', 2 UNION
 SELECT 4, 'Steve', 3 UNION
 SELECT 5, 'Brian', 3 UNION
 SELECT 6, 'Susan', 3 UNION
    SELECT 7, 'Joe', 4 ;

So, now I'll create a view on the first table to filter some departments out.

CREATE VIEW dbo.BlueDepartments
AS
SELECT * FROM dbo.Department
WHERE DepartmentColor = 'BLUE'
GO

This returns

DepartmentID DepartmentName       DepartmentColor
------------ -------------------- ---------------
2            Engineering          BLUE

And per your example, I'll add a view for the second table which does not filter anything.

CREATE VIEW dbo.AllEmployees
AS
SELECT * FROM dbo.Employee
GO

This returns

EmployeeID  EmployeeName         DepartmentID
----------- -------------------- ------------
1           Lyne                 1
2           Damir                2
3           Sandy                2
4           Steve                3
5           Brian                3
6           Susan                3
7           Joe                  4

It seems to me that you think that Employee No 5, DepartmentID = 3 points to nowhere?

"You'll end up with entries in the second view that point nowhere."

Well, it points to the Department table DepartmentID = 3, as specified with the foreign key. Even if you try to join view on view nothing is broken:

SELECT  e.EmployeeID
       ,e.EmployeeName
       ,d.DepartmentID
       ,d.DepartmentName
       ,d.DepartmentColor
FROM    dbo.AllEmployees AS e
        JOIN dbo.BlueDepartments AS d ON d.DepartmentID = e.DepartmentID
        ORDER BY e.EmployeeID

Returns

EmployeeID  EmployeeName         DepartmentID DepartmentName       DepartmentColor
----------- -------------------- ------------ -------------------- ---------------
2           Damir                2            Engineering          BLUE
3           Sandy                2            Engineering          BLUE   

So nothing is broken here, the join simply did not find matching records for DepartmentID <> 2 This is actually the same as if I join tables and then include filter as in the first view:

SELECT  e.EmployeeID
       ,e.EmployeeName
       ,d.DepartmentID
       ,d.DepartmentName
       ,d.DepartmentColor
FROM    dbo.Employee AS e
        JOIN dbo.Department AS d ON d.DepartmentID = e.DepartmentID
        WHERE d.DepartmentColor = 'BLUE'
     ORDER BY e.EmployeeID

Returns again:

EmployeeID  EmployeeName         DepartmentID DepartmentName       DepartmentColor
----------- -------------------- ------------ -------------------- ---------------
2           Damir                2            Engineering          BLUE
3           Sandy                2            Engineering          BLUE

In both cases joins do not fail, they simply do as expected.

Now I will try to break the referential integrity through a view (there is no DepartmentID= 127)

INSERT  INTO dbo.AllEmployees
      ( EmployeeID, EmployeeName, DepartmentID )
VALUES( 10, 'Bob', 127 )

And this results in:

Msg 547, Level 16, State 0, Line 1
The INSERT statement conflicted with the FOREIGN KEY constraint "FK__Employee__Depart__0519C6AF". The conflict occurred in database "Tinker_2", table "dbo.Department", column 'DepartmentID'.

If I try to delete a department through the view

DELETE FROM dbo.BlueDepartments
WHERE DepartmentID = 2

Which results in:

Msg 547, Level 16, State 0, Line 1
The DELETE statement conflicted with the REFERENCE constraint "FK__Employee__Depart__0519C6AF". The conflict occurred in database "Tinker_2", table "dbo.Employee", column 'DepartmentID'.

So constraints on underlying tables still apply.

Hope this helps, but then maybe I misunderstood your problem.

Damir Sudarevic
Problem stated perfectly. And yes, if the consuming queries used inner joins there would be no problem. They don't however. Since the foreign key may be null left joins are used. Which leads to the situation that you may get null columns due to the foreign key being null (which is intended) or null columns due to the referenced view not containing the requested rows (not intended and what I meant by "join failing").
BuschnicK
+2  A: 

I love your question. It screams of familiarity with the Query Optimizer, and how it can see that some joins are redundant if they serve no purpose, or if it can simplify something knowing that there is at most one hit on the other side of a join.

So, the big question is around whether you can make a FK against the CIX of an Indexed View. And the answer is no.

create table dbo.testtable (id int identity(1,1) primary key, val int not null);
go
create view dbo.testview with schemabinding as
select id, val
from dbo.testtable
where val >= 50
;
go
insert dbo.testtable
select 20 union all
select 30 union all
select 40 union all
select 50 union all
select 60 union all
select 70 
go
create unique clustered index ixV on dbo.testview(id);
go
create table dbo.secondtable (id int references dbo.testview(id));
go

All this works except for the last statement, which errors with:

Msg 1768, Level 16, State 0, Line 1
Foreign key 'FK__secondtable__id__6A325CF7' references object 'dbo.testview' which is not a user table.

So the Foreign key must reference a user table.

But... the next question is about whether you could reference a unique index that is filtered in SQL 2008, to achieve a view-like FK.

And still the answer is no.

create unique index ixUV on dbo.testtable(val) where val >= 50;
go

This succeeded.

But now if I try to create a table that references the val column

create table dbo.thirdtable (id int identity(1,1) primary key, val int not null check (val >= 50) references dbo.testtable(val));

(I was hoping that the check constraint that matched the filter in the filtered index might help the system understand that the FK should hold)

But I get an error saying:

There are no primary or candidate keys in the referenced table 'dbo.testtable' that matching the referencing column list in the foreign key 'FK__thirdtable__val__0EA330E9'.

If I drop the filtered index and create a non-filtered unique non-clustered index, then I can create dbo.thirdtable without any problems.

So I'm afraid the answer still seems to be No.

Rob Farley
From a data integrity perspective, you are going to end up being stuck with triggers and constraints. From an optimisation perspective, I'm not sure you can do much. I'm writing a blog post for you now at msmvps.com/blogs/robfarley. It'll be published tomorrow.
Rob Farley
A: 

From a purely data integrity perspective (and nothing to do with the Query Optimizer), I had considered an Indexed View. I figured you could make a unique index on it, which could be broken when you try to have broken integrity in your underlying tables.

But... I don't think you can get around the restrictions of indexed views well enough.

For example:

You can't use outer joins, or sub-queries. That makes it very hard to find the rows that don't exist in the view. If you use aggregates, you can't use HAVING, so that cuts out some options you could use there too. You can't even have constants in an indexed view if you have grouping (whether or not you use a GROUP BY clause), so you can't even try putting an index on a constant field so that a second row will fall over. You can't use UNION ALL, so the idea of having a count which will break a unique index when it hits a second zero won't work.

I feel like there should be an answer, but I'm afraid you're going to have to take a good look at your actual design and work out what you really need. Perhaps triggers (and good indexes) on the tables involved, so that any changes that might break something can roll it all that.

But I was really hoping to be able to suggest something that the Query Optimizer might be able to leverage to help the performance of your system, but I don't think I can.

Rob Farley