Short story
I have a technical problem with a third-party library on my hands that I can't seem to solve easily other than by creating a surrogate key (even though I'll never need it). I've read a number of articles on the Net discouraging the use of surrogate keys, and I'm at a bit of a loss as to whether it's okay to do what I intend to do.

Long story
I need to specify a primary key because I use the SQLAlchemy ORM, which requires one. I cannot just set it in __mapper_args__, since the class is being built with classobj and I have yet to find a way to reference a field of a not-yet-existing class in the appropriate PK definition argument. Another problem is that the natural equivalent of the PK is a composite key that is too long for the version of MySQL I use (and such long primary keys are generally a bad idea anyway).

+2  A: 

I always make surrogate keys when using ORMs (or rather, I let the ORMs make them for me). They solve a number of problems, and don't introduce any (major) problems.

So, you've done your job by acknowledging that there are "papers on the net" with valid reasons to avoid surrogate keys, and that there's probably a better way to do it.

Now, write "# TODO: find a way to avoid surrogate keys" somewhere in your source code and go get some work done.

Seth
Don't add TODOs if you have no intention of ever doing them. If you think it's a waste of effort now, it'll be a waste of effort in the future, and all you're doing is adding to the list of garbage you'll have to wade through every time you egrep 'TODO|XXX|FIXME'.
Glenn Maynard
Add TODOs to absolve your conscience because you have some guidance, based on "a number of articles on the Net", that you think is important but is clearly inappropriate in this case.
S.Lott
When I get stuck finding the "right" way to do something, I will often [DoTheSimplestThingThatCouldPossiblyWork](http://c2.com/xp/DoTheSimplestThingThatCouldPossiblyWork.html), and come back later during a refactoring cycle to determine whether the "right" way is actually necessary. I use TODO: merely because "TODO:" is a convenient way in virtually all IDEs to add visibility to a line of source code. If REVIEW_FOR_ACADEMIC_RIGOR: suits you better, go for it! :)
Seth
A: 

"Using a surrogate key allows duplicates to be created when using a natural key would have prevented such problems." Exactly, so you should have both keys, not just a surrogate. The error you seem to be making is not that you are using a surrogate; it's that you are assuming the table only needs one key. Make sure you create all the keys you need to ensure the integrity of your data.
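To make the "both keys" point concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The `shipment` table, its columns, and the `add_shipment` helper are hypothetical names chosen for illustration; they are not from the question. The surrogate `id` is the primary key, while a `UNIQUE` constraint on the natural composite key still rejects duplicates.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")

# Hypothetical table: "id" is the surrogate primary key,
# (carrier, tracking_no) is the natural composite key, declared UNIQUE.
conn.execute(
    """
    CREATE TABLE shipment (
        id          INTEGER PRIMARY KEY,
        carrier     TEXT NOT NULL,
        tracking_no TEXT NOT NULL,
        UNIQUE (carrier, tracking_no)
    )
    """
)


def add_shipment(carrier, tracking_no):
    """Insert a row; return its surrogate id, or None if the natural key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO shipment (carrier, tracking_no) VALUES (?, ?)",
                (carrier, tracking_no),
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # The UNIQUE constraint on the natural key rejected a duplicate.
        return None


first = add_shipment("acme", "X1")      # accepted, gets a surrogate id
duplicate = add_shipment("acme", "X1")  # rejected: same natural key
other = add_shipment("globex", "X1")    # accepted: different carrier
```

The surrogate key never has to appear in application logic; it only satisfies the tooling, while the unique constraint carries the integrity rule.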

Having said that, in this case it seems like a deficiency of the ORM software (apparently not being able to use a composite key) is the real cause of your problems. It's unfortunate that a software limitation like that should force you to create keys you don't otherwise need. Maybe you could consider using different software.

dportas
That's not a very practical principle, as so few ORMs support composite primary keys you'd be limiting yourself to a tiny set of software. The next time you hit a software limitation and apply this principle, the intersection of the sets leaves you with no selections at all.
Glenn Maynard
"deficiency of the ORM software (apparently not being able to use a composite key)". Hardly. It's a deficiency in the use of the ORM: failing to declare a unique index on the composite natural key. Many ORMs support "unique" declarations outside the surrogate key.
S.Lott
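As a rough illustration of the "unique" declaration S.Lott mentions, the sketch below builds a SQLAlchemy declarative class dynamically with `type()`, gives it a surrogate `id`, and declares a `UniqueConstraint` on the natural columns. It assumes SQLAlchemy 1.4 or later and uses SQLite instead of MySQL; `make_model`, `Item`, `region`, `code`, and `try_insert` are hypothetical names, and this is not necessarily how the asker's classobj-based code is structured.

```python
from sqlalchemy import Column, Integer, String, UniqueConstraint, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


def make_model(class_name, table_name):
    """Build a mapped class at runtime (hypothetical helper)."""
    attrs = {
        "__tablename__": table_name,
        # Surrogate key: exists only because the ORM needs a primary key.
        "id": Column(Integer, primary_key=True),
        # Natural key columns (illustrative names).
        "region": Column(String(64), nullable=False),
        "code": Column(String(64), nullable=False),
        # Uniqueness of the natural key is declared by column name,
        # so no reference to the not-yet-existing class is needed.
        "__table_args__": (
            UniqueConstraint("region", "code", name="uq_%s_natural" % table_name),
        ),
    }
    return type(class_name, (Base,), attrs)


Item = make_model("Item", "item")

engine = create_engine("sqlite://")  # in-memory SQLite, for illustration only
Base.metadata.create_all(engine)


def try_insert(region, code):
    """Return True if the row was stored, False if the natural key was a duplicate."""
    with Session(engine) as session:
        session.add(Item(region=region, code=code))
        try:
            session.commit()
            return True
        except IntegrityError:
            session.rollback()
            return False
```

Because the constraint refers to columns by name, it can be declared in the attribute dictionary before the class exists, which sidesteps the forward-reference problem described in the question.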