views:

110

answers:

5

Using sqlite i need a table to hold a blob to store a md5 hash and a 4byte int. I plan to index the int but this value will not be unique.

Do i need a primary key for this table? and is there an issue with indexing a non unique value? (I assume there is not issue or reason for any).

+11  A: 

Personally, I like to have a unique primary id on all tables. It makes finding unique records for updating/deleting easier.

David Stratton
+1, but expect of join tables.
BalusC
+1: Matches the ORM expectation of an immutable unique id for each row.
S.Lott
Except Many to Many link tables
Alex Bagnolini
@Alex: Actually, I like to have an extra id field on many-to-many link tables too. Sometimes it turns out to be useful to be able to print out the links in order of having been added, even if you originally though that it wouldn't matter, so an automatically incremented primary key is very useful. Also, when implementing delete functionality in an application (to remove the link), it is sometimes easier if you can refer to the row with a simple id.
Thomas Padron-McCarthy
+3  A: 

How are you going to reference on a SELECT * FROM Table WHERE or an UPDATE ... WHERE? Are you sure you want each one?

Daniel A. White
+2  A: 

My advice is to go with primary key if you want to have referential integrity. However there is no issue with indexing a non unique value. The only thing is that your performance will downgrade a little.

Siblja
+3  A: 

You already have one.

SQLite automatically creates an integer ROWID column for every row of every table. This can function as a primary key if you don't declare your own.

In general it's a good idea to declare your own primary key column. In the particular instance you mentioned, ROWID will probably be fine for you.

Larry Lustig
+1. Good answer...
David Stratton
Thanks David. I've only done one project using SQLite, but I really like it as a lightweight but still very, very capable DBMS.
Larry Lustig
+1  A: 

What are the consequences of letting two identical rows somehow get into this table?

One consequence is, of course, wasted space. But I'm talking about something more fundamental, here. There are times when duplicate rows in data give you wrong results. For example, if you grouped by the int column (field), and listed the count of rows in each group, a duplicate row (record) might throw you off, depending on what you are really looking for.

Relational databases work better if they are based on relations. Relations are always in first normal form. The primary reason for declaring a primary key is to prevent the table from getting out of first normal form, and thus not representing a relation.

Walter Mitty
+1. This answer is worth an upvote.
David Stratton