We have two tables, ActivityForm
and Field
which are given a many-to-many relationship via the ActivityFormField
table. The ActivityFormFieldValidator
table will have a many-to-one relationship with the ActivityFormField
table, so we are giving the ActivityFormField
table an identity column ("ActivityFormFieldId"). The ActivityFormField
table also contains information like the order in which the given field should be displayed on the given activity form.
Expected usage is as follows:
- There will eventually be a lot of activity forms and fields, but each activity form will only have a handful of fields attached to it (maybe a dozen, on average).
- The most common query on this table will be to retrieve all of the entries in
ActivityFormField
table with a givenActivityFormId
, and join that data with theField
andActivityFormFieldValidator
tables. - New
ActivityFormField
entries will be created every so often, but not nearly as often as the query mentioned above.
My initial reaction is to configure the ActivityFormField table with the following keys and indexes:
- Primary Key on the identity column, with a Clustered Index on it.
- A Unique, nonclustered index that includes the
FieldId
andActivityFormId
columns.
In addition to feeling the most "right" or "normal" to me, this satisfies the basic requirements of:
- making sure there is only one
ActivityFormField
entry for any combination ofActivityForm
andField
, - providing an indexed key for
ActivityFormFieldValidator
entries to attach to, and - providing an indexed key on
ActivityFormId
(for the query mentioned earlier)
It also "does no harm" with its clustered index, in the sense that insertion of new ActivityFormField
entries shouldn't cause any reordering of the actual data.
However, given the way that we are planning to use the table, I wonder if there's another combination of keys and indexes and clustering that is likely to be much more efficient?
Update
Having done some additional research and thinking, I'm leaning toward making my table declaration look something like this:
CREATE TABLE [ActivityFormField](
[ActivityFormFieldId] [int] IDENTITY(1,1) NOT NULL,
[ActivityFormId] [int] NOT NULL,
[FieldId] [int] NOT NULL,
[SortOrder] [tinyint] NOT NULL,
CONSTRAINT [PK_ActivityFormField] PRIMARY KEY NONCLUSTERED
(
[ActivityFormFieldId] ASC
),
CONSTRAINT [UK_ActivityFormField_ActivityForm_Field] UNIQUE NONCLUSTERED
(
[ActivityFormId] ASC,
[FieldId] ASC
)
)
ALTER TABLE [ActivityFormField] WITH CHECK ADD CONSTRAINT [FK_ActivityFormField_ActivityForm] FOREIGN KEY([ActivityFormId])
REFERENCES [ActivityForm] ([ActivityFormId])
ALTER TABLE [ActivityFormField] WITH CHECK ADD CONSTRAINT [FK_ActivityFormField_Field] FOREIGN KEY([FieldId])
REFERENCES [Field] ([FieldId])
CREATE UNIQUE CLUSTERED INDEX IX_ActivityFormField_ActivityForm_SortOrder ON [ActivityFormField]
(
[ActivityFormId] ASC,
[SortOrder] ASC
)
CREATE NONCLUSTERED INDEX IX_ActivityFormField_ActivityForm ON [ActivityFormField]
(
[ActivityFormId]
)
CREATE NONCLUSTERED INDEX IX_ActivityFormField_Field ON [ActivityFormField]
(
[FieldId]
)
This approach uses a non-clustered index on the identity field, a clustered index for the criteria and order in which the most commonly-used query will retrieve the data, and a constraint to ensure that only one ActivityFormField
entry exists for any combination of ActivityForm
and Field
. Unique Constraints and Indexes are apparently handled the same way internally, so I chose a constraint to show that its primary purpose is to keep things unique. The other indexes are there "just in case," following the logic HLGEM proposed.