views:

107

answers:

4

I built an inventory database where ISBN numbers are the primary keys for the items. This worked great for a while as the items were books. Now I want to add non-books. some of the non-books have EANs or ISSNs, some do not.

It's in PostgreSQL with django apps for the frontend and JSON api, plus a few supporting python command-line tools for management. the items in question are mostly books and artist prints, some of which are self-published.

What is nice about using ISBNs as primary keys is that in on top of relational integrity, you get lots of handy utilities for validating ISBNs, automatically looking up missing or additional information on the book items, etcetera, many of which I've taken advantage. some such tools are off-the-shelf (PyISBN, PyAWS etc) and some are hand-rolled -- I tried to keep all of these parts nice and decoupled, but you know how things can get.

I couldn't find anything online about 'private ISBNs' or 'self-assigned ISBNs' but that's the sort of thing I was interested in doing. I doubt that's what I'll settle on, since there is already an apparent run on ISBN numbers.

should I retool everything for EAN numbers, or migrate off ISBNs as primary keys in general? if anyone has any experience with working with these systems, I'd love to hear about it, your advice is most welcome.

+3  A: 

I don't know postgres but normally ISBM would be a unique index key but not the primary. It's better to have an integer as primary/foreign key. That way you only need to add a new field EAN/ISSN as nullable.

the_lotus
+1  A: 

If you're using ISBN-10s, then you definitely should migrate to something else, as those are already deprecated. You can easily take ISBN-10s and turn them into ISBN-13s (see wikipedia), which I think are EAN-compatible (again, see wikipedia), but as the_lotus suggests, it's probably better to have some sort of auto-incrementing integer with no external meaning as the primary key and then index on the EAN/ISBN/etc.

Isaac
right now I support either ISBN-10 or ISBN-13 -- it's whatever the books' barcode scan shows, or what's printed on the cover or flap for older editions. it converts the number to the ISBN-13 variant for some 3rd-party API lookups as necessary. you are correct w/r/t EAN compatibility tho, which is why I am interested in potentially migrating to EAN overall.
fish2000
Part of my reluctance to recommend EAN as primary key is that I'm not sure if it's guaranteed to be unique per product--UPC codes, which are supposed to be a subset of EAN, are not always unique per (food) product.
Isaac
indeed that is good to know. I'm going with integer keys for now, thanks.
fish2000
+2  A: 

I agree with the_lotus, not least because ISBN is a poor choice for primary key

Data wise, it may not be unique enough. If clustered, it's quite wide and non-numeric

Example

gbn
that's a very useful link, thanks.
fish2000
+1  A: 

A simple solution (although arguably whether good) would be to use (isbn,title) or (isbn,author) which should pretty much guarantee uniqueness. Ideology is great but practicality also serves a purpose.

Joshua D. Drake
I'm a big fan of practicality, myself -- but unfortunately you'd be surprised at the variety of approaches the various book-related datasources (amazon, openlibrary, isbndb et al) take w/r/t how fields like 'author' or 'title' are formatted and normalized -- dealing with it can be quite dicey, and it'd be even dicier to predicate the databases' integrity on top of it.
fish2000