I'm building the world's simplest library application. All I want to be able to do is scan in a book's UPC (barcode) using a typical scanner (which just types the numbers of the barcode into a field) and then use it to look up data about the book... at a minimum, title, author, year published, and either the Dewey Decimal or Library of Congress catalog number.

The goal is to print out a tiny sticker ("spine label") with the card catalog number that I can stick on the spine of the book, and then I can sort the books by card catalog number on the shelves in our company library. That way books on similar subjects will tend to be near each other. For example, if you know you're looking for a book about accounting, all you have to do is find SOME book about accounting and you'll see the other half dozen that we have right next to it, which makes it convenient to browse the library.

There seem to be lots of web APIs to do this, including Amazon and the Library of Congress. But those are all extremely confusing to me. What I really just want is a single higher level function that takes a UPC barcode number and returns some basic data about the book.

+5  A: 

Edit: It would be pretty easy if you had the ISBN, but converting from UPC to ISBN is not as easy as you'd like.

Here's some JavaScript code for it from http://isbn.nu, where the conversion is done in script:

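// isbn holds the scanned 13-digit code as a string
if (isbn.indexOf("978") == 0) {
   // drop the 978 prefix and the trailing EAN check digit
   isbn = isbn.substr(3,9);
   var xsum = 0;
   var add = 0;
   var i = 0;
   // ISBN-10 check digit: weights 10 down to 2 over the nine digits
   for (i = 0; i < 9; i++) {
        add = isbn.substr(i,1);
        xsum += (10 - i) * add;
   }
   xsum %= 11;
   xsum = 11 - xsum;
   if (xsum == 10) { xsum = "X"; }
   if (xsum == 11) { xsum = "0"; }
   isbn += xsum;
}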

However, that only converts from UPC to ISBN some of the time: it handles only codes that begin with 978.
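The snippet above can be wrapped into a single function. This is only a minimal sketch: the name ean13ToIsbn10 is made up for illustration, and returning null for anything that is not a 13-digit code starting with 978 is an assumption about how you would want to handle the cases the conversion does not cover.

```javascript
// Hypothetical helper: convert a 978-prefixed 13-digit barcode to an ISBN-10.
// Returns null for any input this conversion does not cover.
function ean13ToIsbn10(ean) {
  const code = String(ean).trim();
  if (!/^978\d{10}$/.test(code)) return null;

  // Drop the 978 prefix and the trailing EAN check digit.
  const body = code.slice(3, 12);

  // ISBN-10 check digit: weights 10 down to 2, then modulo 11.
  let sum = 0;
  for (let i = 0; i < 9; i++) {
    sum += (10 - i) * Number(body.charAt(i));
  }
  const check = (11 - (sum % 11)) % 11;

  return body + (check === 10 ? "X" : String(check));
}

console.log(ean13ToIsbn10("9780143038092"));
```

The input used here is the barcode value that appears in the ISBNdb example elsewhere in this thread.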

You may want to look at the barcode scanning project page, too - one person's journey to scan books.

So you know about Amazon Web Services. But that assumes Amazon has the book and has scanned in the UPC.

You can also try the UPCdatabase at http://www.upcdatabase.com/item/{UPC}, but this is also incomplete. At least it's growing.

The Library of Congress database is also incomplete with UPCs so far (although it's pretty comprehensive), and is harder to automate.

Currently, it seems like you'd have to write this yourself in order to have a high-level lookup that returns simple information (and tries each service).
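A "try each service" lookup could be structured roughly like this. It is a sketch under assumptions, not working integration code: lookupBook and the provider functions are hypothetical names, and each provider is assumed to take the scanned code and resolve to a book object or null. The stubs below stand in for real calls to Amazon, UPCdatabase and so on.

```javascript
// Hypothetical fallback chain: ask each provider in turn and return the
// first non-empty result, or null if nobody knows the code.
async function lookupBook(code, providers) {
  for (const provider of providers) {
    try {
      const book = await provider(code);
      if (book) return book;
    } catch (err) {
      // A failing service should not stop the chain; try the next one.
    }
  }
  return null;
}

// Stub providers for illustration only; real ones would call a web service.
const alwaysMisses = async () => null;
const alwaysFails = async () => { throw new Error("service down"); };
const alwaysHits = async (code) => ({ code: code, title: "Example Title" });

lookupBook("9780143038092", [alwaysMisses, alwaysFails, alwaysHits])
  .then((book) => console.log(book));
```

Ordering the providers from most to least trusted keeps the first hit the best one available.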

Philip Rieck
What I was hoping for was if someone already had code that did this, so I didn't have to read 8000 pages of AWS documentation. All the little library applications already do it. There's some way to convert UPCs to ISBNs which I don't understand, either.
Joel Spolsky
+1  A: 

My librarian wife uses http://www.worldcat.org/, but they key off ISBN. If you can scan that, you're golden. Looking at a few books, it looks like the UPC is the same or related to the ISBN.

Oh, these guys have a function for doing the conversion from UPC to ISBN.

sblundy
+31  A: 

There's a very straightforward web-based solution over at ISBNDB.com that you may want to look at.

http://isbndb.com/docs/api/

You can be up and running in just a few minutes:

  • register on the site and get a key to use the API
  • try a URL like:

    http://isbndb.com/api/books.xml?access_key={yourkey}&index1=isbn&results=details&value1=9780143038092

The results=details parameter gets additional details, including the card catalog number.

As an aside, the barcode is generally the ISBN, in either ISBN-10 or ISBN-13 form. If your scanner picks up 18 digits, you just have to delete the last 5; the first 13 are the ISBN-13.
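Trimming the scan could look something like the following. This is a sketch only: normalizeScan is a made-up name, and the 18-digit input in the usage line is invented by appending five arbitrary digits to the ISBN-13 from the URL above.

```javascript
// Hypothetical helper: clean up what the scanner typed into the field.
// If exactly 18 digits came through, keep only the first 13.
function normalizeScan(raw) {
  const code = String(raw).trim();
  return /^\d{18}$/.test(code) ? code.slice(0, 13) : code;
}

console.log(normalizeScan("978014303809251600"));
```

Anything that is not exactly 18 digits is passed through untouched, so a plain 13-digit scan still works.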

Here's a sample response:

<ISBNdb server_time="2008-09-21T00:08:57Z">
  <BookList total_results="1" page_size="10" page_number="1" shown_results="1">
    <BookData book_id="the_joy_luck_club_a12" isbn="0143038095">
      <Title>The Joy Luck Club</Title>
      <TitleLong/>
      <AuthorsText>Amy Tan, </AuthorsText>
      <PublisherText publisher_id="penguin_non_classics">Penguin (Non-Classics)</PublisherText>
      <Details dewey_decimal="813.54" physical_description_text="288 pages" language="" edition_info="Paperback; 2006-09-21" dewey_decimal_normalized="813.54" lcc_number="" change_time="2006-12-11T06:26:55Z" price_time="2008-09-20T23:51:33Z"/>
    </BookData>
  </BookList>
</ISBNdb>
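To go from that to the "single function" the question asks for, something along these lines might do. Treat it as a rough sketch: the function names are invented, the URL format is taken from this answer and is assumed to still be valid, and the regular expressions are only good enough for responses shaped like the sample above (a real XML parser would be more robust).

```javascript
// Build the lookup URL in the format shown in this answer.
function buildIsbndbUrl(accessKey, isbn) {
  const params = new URLSearchParams({
    access_key: accessKey,
    index1: "isbn",
    results: "details",
    value1: isbn,
  });
  return "http://isbndb.com/api/books.xml?" + params.toString();
}

// Return one attribute of the first <tag ...> element, or null.
function getAttribute(xml, tag, attr) {
  const tagMatch = xml.match(new RegExp("<" + tag + "\\b[^>]*>"));
  if (!tagMatch) return null;
  const attrMatch = tagMatch[0].match(new RegExp("\\b" + attr + '="([^"]*)"'));
  return attrMatch ? attrMatch[1] : null;
}

// Return the text inside the first <tag>...</tag> element, or null.
function getText(xml, tag) {
  const m = xml.match(new RegExp("<" + tag + "\\b[^>]*>([^<]*)</" + tag + ">"));
  return m ? m[1].trim() : null;
}

// Reduce a response to the fields needed for a spine label.
function parseBook(xml) {
  return {
    title: getText(xml, "Title"),
    isbn: getAttribute(xml, "BookData", "isbn"),
    dewey: getAttribute(xml, "Details", "dewey_decimal"),
    lcc: getAttribute(xml, "Details", "lcc_number"),
  };
}

// Abbreviated copy of the sample response, for demonstration.
const sample =
  '<BookData book_id="the_joy_luck_club_a12" isbn="0143038095">' +
  "<Title>The Joy Luck Club</Title><TitleLong/>" +
  '<Details dewey_decimal="813.54" dewey_decimal_normalized="813.54" lcc_number=""/>' +
  "</BookData>";

console.log(buildIsbndbUrl("{yourkey}", "9780143038092"));
console.log(parseBook(sample));
```

Fetching the URL is left out on purpose, since that part depends on your environment and on having a valid key.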
curtisk
looks extremely promising!
Joel Spolsky
+3  A: 

Sounds like the sort of job one might get a small software company to do for you...

More seriously, there are services that provide an interface to the ISBN catalog, www.literarymarketplace.com.

On worldcat.org, you can create a URL using the ISBN that will take you straight to a book detail page. That page isn't very useful because you'd still have to scrape the HTML to get the data, but they have a link to download the book data in a couple of "standard" formats.

For example, their demo book, http://www.worldcat.org/isbn/9780060817084, has an "EndNote" format download link, http://www.worldcat.org/oclc/123348009?page=endnote&client=worldcat.org-detailed_record, and you can harvest the data from that file very easily. That link is keyed on their own OCLC number, not the ISBN, but the scrape to convert one to the other isn't hard, and they may yet have a good interface to do it.

davenpcj
+1  A: 

Using the web site Library Thing, you can scan in your barcodes (the entire barcode, not just the ISBN - if you have a scanning "wedge" you're in luck) and build your library. (It is an excellent social network - think StackOverflow for book enthusiasts.)

Then, using the TOOLS section, you can export your library. Now you have a text file to import/parse and can create your labels, a card catalog, etc.

Doug L.
I'm scanning books as I put them on the shelves, so I want to print a label instantly after scanning the book. Printing labels later would be a pain as I'd have to figure out what label goes where after the fact. That's why almost none of the library applications I found work for me.
Joel Spolsky
+1  A: 

I'm afraid the problem is database access. Companies pay to have a UPC assigned and so the database isn't freely accessible. The UPCdatabase site mentioned by Philip is a start, as is UPCData.info, but they're user-entered, which means incomplete and possibly inaccurate.

You can always enter the UPC into Google and get a hit. That's not very automated, but it does get it right most of the time.

I thought I remembered Jon Udell doing something like this (e.g., see this), but it was purely ISBN based.

Looks like you've found a new project for someone to work on!

JohnMeyers
+8  A: 

Note: I'm the LibraryThing guy, so this is partial self-promotion.

Take a look at this StackOverflow answer, which covers some good ways to get data for a given ISBN.

http://stackoverflow.com/questions/41469/how-to-fetch-a-book-title-from-an-isbn-number

To your issues, Amazon includes a simple DDC (Dewey); Google does not. The WorldCat API does, but you need to be an OCLC library to use it.

The ISBN/UPC issue is complex. Prefer the ISBN, if you can find it. Mass market paperbacks sometimes sport a UPC on the outside and an ISBN on the inside.

LibraryThing members have developed a few pages on the issue and on efforts to map the two.

http://www.librarything.com/wiki/index.php/UPC
http://www.librarything.com/wiki/index.php/CueCat:_ISBNs_and_Barcodes

If you buy from Borders, your books' barcodes will all be stickered over with their own internal barcodes (called a "BINC"). Most annoyingly, whatever glue they use gets harder and harder to remove cleanly over time. I know of no API that converts them. LibraryThing does it by screenscraping.

For an API, I'd go with Amazon. LibraryThing is a good non-API option, resolving BINCs and adding DDC and LCC for books that don't have them by looking at other editions of the "work."

What's missing is the label part. Someone needs to create a good PDF template for that.

+1  A: 

If you're wanting to use Amazon you can implement it easily with LINQ to Amazon.

Simon_Weaver
A: 

Working in the library world, we simply connect to the LMS, pass in the barcode, and hey presto, back comes the data. I believe there are a number of free LMS providers; Google for "open source lms".

Note: This probably works off ISBN...

Matt B
A: 

Here you go: God Bless the Internets

cheekygeek