JSON is becoming more and more important for data exchange, but the JSON specification is rather lax in some respects:

The names within an object SHOULD be unique.

An implementation may set limits on the size of texts that it accepts. An implementation may set limits on the maximum depth of nesting. An implementation may set limits on the range of numbers. An implementation may set limits on the length and character contents of strings.
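The limits on number ranges are easy to observe in practice. As a sketch (in Python, which happens to keep arbitrary-precision integers): any implementation that stores every JSON number as an IEEE 754 double, as JavaScript's `JSON.parse` does, cannot represent integers above 2^53 exactly, so the same document parses to different values in different systems.

```python
import json

# Python's json module preserves large integers exactly.
n = json.loads("9007199254740993")   # 2**53 + 1
print(n)                             # 9007199254740993

# Simulate an implementation that stores all numbers as IEEE 754
# doubles: the value silently rounds to the nearest representable one.
print(int(float(n)))                 # 9007199254740992
```

So a document that round-trips cleanly through one parser can come back subtly changed from another, with no error raised anywhere.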

I think that most JSON parsers silently discard duplicated object keys (typically the last value wins) and do not distinguish between minus zero (-0) and zero. Many may also restrict numbers to 32-bit floats or signed integers. Moreover, JSON is allowed to contain characters that are not valid Unicode code points (see this question), and I bet that many implementations have problems with Unicode characters above the Basic Multilingual Plane (U+0000 through U+FFFF). But it is not only the JSON specification: JSON databases like CouchDB, MongoDB, Persevere/Dojo etc. add restrictions of their own. I doubt you can use object keys like id, _id, and $ref across all JSON stores, because they may have special meaning in each system.

This is somewhat frustrating: JSON should be easy, but the closer you look, the more obstacles you find. Is there a common (not too restrictive) subset of JSON that can safely be used across all parsers and databases, or will the NoSQL movement add more and more extensions and special constructs that you should not use in your JSON documents?