I have approximately 250,000 JSON-formatted files, each with one object in it (formatted just how CouchDB likes it with _id). What's the best way to import these into my remote CouchDB server as records?

- I am on a Windows XP machine.

- I have internet access, but firewall constraints prevent me from running a local CouchDB server that is reachable from the web, so replication is not an easy option.

+4  A: 

I would strongly suggest looking into the bulk document API, described in the CouchDB wiki: http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API

Basically, you make a POST request to /someDatabase/_bulk_docs that looks like this:

{
  "docs": [
    { "_id": "awsdflasdfsadf", "foo": "bar" },
    { "_id": "cczsasdfwuhfas", "bwah": "there" },
    ...
  ]
}

As with any other POST request, CouchDB will generate an _id for any document that does not include one.
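For the 250,000-file case in the question, the request body above can be assembled in batches rather than as one huge POST. The following is a minimal Python sketch, assuming the files sit in a single directory with a .json extension and each holds exactly one object. The helper names (`chunked`, `load_docs`, `bulk_payload`) and any batch size you pick are illustrative assumptions, not part of CouchDB.

```python
import json
from itertools import islice
from pathlib import Path


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load_docs(directory):
    """Yield one parsed document per *.json file in `directory` (assumed layout)."""
    for path in sorted(Path(directory).glob("*.json")):
        with open(path, encoding="utf-8") as f:
            yield json.load(f)


def bulk_payload(docs):
    """Serialize a batch of documents into a _bulk_docs request body (bytes)."""
    return json.dumps({"docs": list(docs)}).encode("utf-8")
```

Each batch from `chunked(load_docs(some_dir), 1000)` could then be passed to `bulk_payload` and POSTed to /someDatabase/_bulk_docs. The figure of 1000 is only a starting guess to tune against the server.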

You can use this same operation to update a bunch of docs: just include their _rev property. And if you want to delete any of the docs that you are updating, then add a "_deleted": true property to the document.
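As a rough illustration of mixing updates and deletions in one request, the sketch below builds such a body in Python. The function names and the `_rev` strings in the usage lines are hypothetical placeholders; real revisions have to come from the documents currently stored in the database.

```python
import json


def deletion_stub(doc_id, rev):
    """Return a document that marks the given revision as deleted."""
    return {"_id": doc_id, "_rev": rev, "_deleted": True}


def update_payload(updates, deletions):
    """Build a _bulk_docs body from updated docs and (id, rev) pairs to delete.

    `updates` are full documents that already carry their current _rev.
    """
    docs = list(updates) + [deletion_stub(i, r) for i, r in deletions]
    return json.dumps({"docs": docs})


# Example with placeholder revisions (assumed values, not real ones).
body = update_payload(
    updates=[{"_id": "awsdflasdfsadf", "_rev": "1-abc", "foo": "baz"}],
    deletions=[("cczsasdfwuhfas", "1-def")],
)
```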

Cheers.

Sam Bisbee
Slight correction to Sam's great answer: for deletions, use "_deleted": true, not "_delete".
J Chris A
Thanks Chris - I always typo that one. Fixed in the answer.
Sam Bisbee
A: 

Following on from this question: I have JSON in the format given above, validated with http://www.jsonlint.com/, but I get

{"error":"bad_content_type","reason":"Content-Type must be application/json"}

when I POST to _bulk_docs

Anyone got any ideas?

Found it. With -d, curl sends Content-Type: application/x-www-form-urlencoded by default, so the header has to be set explicitly:

curl -X POST http://127.0.0.1:5984/database/_bulk_docs -H "Content-Type: application/json" -d @docs.json
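The same request could also be sent from a script, which may be handier than curl on a Windows machine. This is a minimal sketch using only Python's standard library; the function names are hypothetical, and the URL and database name simply mirror the curl example.

```python
import urllib.request


def bulk_docs_request(base_url, database, body):
    """Build a POST to <base_url>/<database>/_bulk_docs with a JSON content type.

    `body` must be bytes holding the {"docs": [...]} JSON.
    """
    url = f"{base_url.rstrip('/')}/{database}/_bulk_docs"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; send(req) does.
req = bulk_docs_request("http://127.0.0.1:5984", "database", b'{"docs": []}')
```

Calling `send(req)` performs the actual POST. Without the explicit header, urllib would also default to a form-encoded content type, which CouchDB rejects with the error shown above.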
mr calendar