I'm trying to optimise a query for a SQLServerCE database running on a Windows Mobile device. The query is used to identify rows to delete in one table based on no longer being referenced from a different table. The original query used a DISTINCT to identify matching rows:
SELECT TestGroupId, TestNameId, ServiceTypeId
FROM ServiceTypeInspection
WHERE ServiceTypeId NOT IN
(SELECT DISTINCT ServiceTypeId
FROM PurchaseOrder)
A quick google suggested using EXISTS instead via:
SELECT TestGroupId, TestNameId, ServiceTypeId
FROM ServiceTypeInspection AS STI
WHERE NOT EXISTS
(SELECT PurchaseOrderId
FROM PurchaseOrder AS PO
WHERE (ServiceTypeId = STI.ServiceTypeId))
But then I also found suggestions on SO and MSDN to replace both options with a LEFT JOIN that sounded promising:
SELECT TestGroupId, TestNameId, STI.ServiceTypeId ServiceTypeId
FROM ServiceTypeInspection STI
LEFT JOIN PurchaseOrder PO ON STI.ServiceTypeId = PO.ServiceTypeId
WHERE PO.ServiceTypeId IS NULL
When I run these queries as a batch showing the execution plan, the cost relative to batch of the original query is 21%, the NOT EXISTS version is 11%, with the LEFT JOIN responsible for the balance of 68%. SSMS doesn't indicate there are any missing indexes but the fastest, NOT EXISTS query has 25% cost in a table scan. I have non-unique indexes on ServiceTypeId in both tables. Am I reading the output of the execution plan correctly that in this case the NOT EXISTS version is the best approach? Is the table scan a potential bottleneck or is this normal for a non-unique index?
Table definition is:
CREATE TABLE [PurchaseOrder](
[PurchaseOrderId] [uniqueidentifier] NOT NULL,
[ServiceTypeId] [nvarchar](8) NOT NULL,
CONSTRAINT [PK_PurchaseOrder] PRIMARY KEY
(
[PurchaseOrderId]
)
);
CREATE TABLE [ServiceTypeInspection](
[TestGroupId] [int] NOT NULL,
[TestNameId] [nvarchar](10) NOT NULL,
[ServiceTypeId] [nvarchar](8) NOT NULL,
CONSTRAINT [PK_ServiceTypeInspection] PRIMARY KEY
(
[TestGroupId],[TestNameId]
)
);
CREATE INDEX IX_PurchaseOrder_ServiceTypeId ON [PurchaseOrder] (ServiceTypeId);
CREATE INDEX IX_ServiceTypeInspection_ServiceTypeId ON [ServiceTypeInspection] (ServiceTypeId);