views:

405

answers:

3

I really appreciate CouchDB's attempt to use universal web formats in everything it does: RESTful HTTP methods in every interaction, JSON objects, and JavaScript code to customize the database and documents.

CouchDB seems to scale pretty well, but the cost of each individual request usually scares off 'relational' people.

Many small business applications only ever run on a single machine. In that case the scalability argument doesn't mean much: we need more performance per request, or people will not use it.

BERT (Binary ERlang Term, http://bert-rpc.org/) has proven to be a faster and lighter format than JSON, and it is native to Erlang, the language in which CouchDB is written. Could we benefit from that by using BERT documents instead of JSON ones?

I don't mean just for retrieving views, but for everything CouchDB does, including syncing. As a consequence, Erlang functions would be used instead of JavaScript ones.

This would change some of CouchDB's original principles, because today it is very web-oriented. But I imagine few people make their database API public; usually the data is reached by users through an application, so it would be a good deal to be able to configure CouchDB to work faster. HTTP+JSON calls could still be handled by CouchDB, at an extra cost in those cases because of parsing.

+4  A: 

You can have a look at hovercraft. It provides a native Erlang interface to CouchDB. Combining this with Erlang views, which CouchDB already supports, you can have a sort-of all-Erlang CouchDB (some external libraries, such as ICU, will still need to be installed).

Zed
+1  A: 

CouchDB wants maximum data portability. You can't parse BERT in the browser, which means it's not portable to the largest platform in the world, so that's kind of a non-starter for base CouchDB. There might be a place for BERT in hovercraft, though, as mentioned above.

mikeal
But in a "conventional set-up" the browser and the database do not communicate, so why would a browser put requirements on a DB backend? It is only with CouchApps 'n' stuff that the browser needs direct communication with the DB, but that's a different interface... App-to-DB and DB-to-DB (replication) communication should not be affected by that, I believe.
Zed
It's incredibly liberating to be able to make an XHR call directly to the database and get my data out without writing more backend code. There are other benefits too, like free caching, since ETags are updated when documents and queries change.
mikeal
Maybe it's cultural, but I don't like the idea of giving users public access to my database. It is very likely to give them information they should not be able to view or, what's worse, let them change data they were not supposed to modify. If this access control is very clear and easy to maintain in CouchDB, perhaps it's fine.
Victor Rodrigues
But about communicating with the browser: if it is something people want, I see no problem in spending a little more time per request converting from BERT to JSON and vice versa when talking directly to a client web app. Since webapps are already used to losing some time doing ORM stuff on the server side, it would not be a bad tradeoff to give server apps more power by skipping the HTTP and JSON stuff (but keeping the RESTful thinking), while client apps pay a few nanoseconds more because they're dealing with the JSON format.
Victor Rodrigues
Normally I would agree with your concerns about user access to the DB, but CouchDB trunk includes some amazing new user/authentication stuff that handles the ACL needs I tend to have when writing webapps.
mikeal
A: 

I think it would be good to first measure how much of the overhead is due to JSON processing: JSON handling can be very efficient. For example, these results suggest that JSON is the fastest schema-less data format on the Java platform (Protocol Buffers requires a strict schema, ditto for Avro; Kryo is a Java serializer), and I would assume the same could be done on other platforms too (with Erlang; for browsers via native support).
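As a rough illustration of what "measure first" could look like, here is a minimal Python sketch that times JSON encoding and decoding of one document. The sample document and the function name `measure_json_overhead` are made up for the example, and it exercises Python's standard `json` module, not CouchDB's Erlang JSON code, so the numbers only show the method, not CouchDB's actual cost.

```python
import json
import timeit

# Hypothetical sample document, loosely shaped like a CouchDB document.
SAMPLE_DOC = {
    "_id": "order-0001",
    "_rev": "1-abc",
    "type": "order",
    "customer": {"name": "Example Customer", "email": "customer@example.com"},
    "items": [{"sku": f"SKU-{i}", "qty": i % 5 + 1, "price": 9.99} for i in range(50)],
    "tags": ["sample", "benchmark"],
}


def measure_json_overhead(doc, repeats=1000):
    """Return the encoded size and average seconds per encode/decode of doc."""
    encoded = json.dumps(doc)
    # timeit returns the total time for `number` calls, so divide to get an average.
    encode_total = timeit.timeit(lambda: json.dumps(doc), number=repeats)
    decode_total = timeit.timeit(lambda: json.loads(encoded), number=repeats)
    return {
        "bytes": len(encoded.encode("utf-8")),
        "encode_s": encode_total / repeats,
        "decode_s": decode_total / repeats,
    }


if __name__ == "__main__":
    result = measure_json_overhead(SAMPLE_DOC)
    print(f"size:   {result['bytes']} bytes")
    print(f"encode: {result['encode_s'] * 1e6:.1f} microseconds per document")
    print(f"decode: {result['decode_s'] * 1e6:.1f} microseconds per document")
```

Comparing these per-document figures with the total time of a request would show whether serialization is a meaningful share of the cost, or whether most of the time goes elsewhere (HTTP handling, disk access).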

So, "if it ain't broke, don't fix it". JSON is very fast when properly implemented, and if space usage is a concern, it compresses well, just like any textual format.

StaxMan