views:

359

answers:

3

I have a text column that should only have 1 of 3 possible strings. To put a constraint on it, I would have to reference another table. Can I instead put the values of the constraint directly on the column without referring to another table?

+5  A: 

If this is SQL Server, Oracle, or PostgreSQL, yes, you can use a check constraint.

If it's MySQL, check constraints are recognized but not enforced. You can use an enum, though. If you need a comma-separated list, you can use a set.

However, this is generally frowned upon, since it's definitely not easy to maintain. Just best to create a lookup table and ensure referential integrity through that.

Eric
"However, this is generally frowned upon, since it's definitely not optimized." - any chance you can provide a link to an explanation? I'm assuming you mean constraints impact the performance of updates/inserts - and a lookup table is certainly a better solution for the original question - but I'd like to see more on the performance impact.
Mayo
`@mmayo`: a `CHECK` constraint is harder to maintain than a `FOIREGN KEY` lookup table, but `CHECK` is more efficient than a lookup table in all systems but `MySQL` (which does not support `CHECK` constraints). In `MySQL`, a lookup table can be more efficient than an `IN` condition with a range scan on a long list of predicates. See this article in my blog for performance detail: http://explainextended.com/2009/08/18/passing-parameters-in-mysql-in-list-vs-temporary-table/
Quassnoi
`@Eric`: for your example (with but three possible values) a lookup table will be less efficient even in `MySQL`. It pays for itself only for very long lists which should be kept in a lookup table for reasons of maintainability, not performance.
Quassnoi
@Quassnoi: I tested this out and saw 1 or 2 ms difference on SQL Server 2008 for such a small set. At that level, any optimization is a micro one, and maintainability should win.
Eric
`@Eric`: exactly.
Quassnoi
`@Eric`: however, even on small lookup tables of but `2` records performance can matter if they are looked up by a huge referencing table. For instance, defining an index on such a small table which can hardly be ever noticed with a plain query can boost performance when it's used in a `JOIN` with a large table. See here for details: http://explainextended.com/2009/06/22/indexing-tiny-tables/
Quassnoi
+2  A: 

In Oracle, SQL Server and PostgreSQL, use CHECK constraint.

CREATE TABLE mytable (myfield INT VARCHAR(50) CHECK (myfield IN ('first', 'second', 'third'))

In MySQL, use ENUM datatype:

CREATE TABLE mytable (myfield ENUM ('first', 'second', 'third'))
Quassnoi
+2  A: 

In addition to the CHECK constraint and ENUM data type that other mention, you could also write a trigger to enforce your desired restriction.

I don't necessarily recommend a trigger as a good solution, I'm just pointing out another option that meets your criteria of not referencing a lookup table.

My habit is to define lookup tables instead of using constraints or triggers, when the rule is simply to restrict a column to a finite set of values. The performance impact of checking against a lookup table is no worse than using CHECK constraints or triggers, and it's a lot easier to manage when the set of values might change from time to time.

Also a common task is to query the set of permitted value, for instance to populate a form field in the user interface. When the permitted values are in a lookup table, this is a lot easier than when they're defined in a list of literal values in a CHECK constraint or ENUM definition.


Re comment "how exactly to do lookup without id"

CREATE TABLE LookupStrings (
  string VARCHAR(20) PRIMARY KEY
);

CREATE TABLE MainTable (
  main_id INT PRIMARY KEY,
  string VARCHAR(20) NOT NULL,
  FOREIGN KEY (string) REFERENCES LookupStrings (string)
);

Now you can be assured that no value in MainTable.string is invalid, since the referential integrity prevents that. But you don't have to join to the LookupStrings table to get the string, when you query MainTable:

SELECT main_id, string FROM MainTable;

See? No join! But you get the string value.


Re comment about multiple foreign key columns:

You can have two individual foreign keys, each potentially pointing to different rows in the lookup table. The foreign key column doesn't have to be named the same as the column in the referenced table.

My common example is a bug-tracking database, where a bug was reported by one user, but assigned to be fixed by a different user. Both reported_by and assigned_to are foreign keys referencing the Accounts table.

CREATE TABLE Bugs (
  bug_id INT PRIMARY KEY,
  reported_by INT NOT NULL,
  assigned_to INT,
  FOREIGN KEY (reported_by) REFERENCES Accounts (account_id),
  FOREIGN KEY (assigned_to) REFERENCES Accounts (account_id)
);
Bill Karwin
so i would keep the id (ints) in my column and refer to that id in another table?
zsharp
No, I don't do the id thing when I use lookup tables. If I store the actual value in my main table, there's no need to join the lookup table to get the value.
Bill Karwin
how exactly to do lookup without id
zsharp
See edit above.
Bill Karwin
i just saw your older post on this and did it. But What if I have 2 columns in main table that need to be constrained by same lookup strings? Do I have to have separate lookup tables?
zsharp
See additional edit above.
Bill Karwin
one note, with 2 foreign keys sql server will not permit cascading changes to both columns. correct?
zsharp
No, I don't believe such a limitation exists.
Bill Karwin