I am looking for a regex that allows me to validate JSON.

I am very new to regexes, and I know that parsing with regex is bad, but can it be used to validate?

+5  A: 

Because of the recursive nature of JSON (nested {...}-s), regex is not suited to validate it.
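
In practice, the usual alternative is to hand the text to a real JSON parser and see whether it succeeds. A minimal sketch, assuming an environment that provides a native JSON.parse; the helper name isValidJson is made up for illustration:

```javascript
// Hypothetical helper: returns true if `text` parses as JSON, false otherwise.
function isValidJson(text) {
    try {
        JSON.parse(text);   // throws a SyntaxError on malformed input
        return true;
    } catch (e) {
        return false;
    }
}

console.log(isValidJson('{"a": [1, 2, {"b": null}]}')); // true
console.log(isValidJson('{"a": [1, 2}'));               // false
```

Because the parser tracks nesting itself, arbitrarily deep {...} and [...] are handled without any regex.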

Bart Kiers
+3  A: 

Forget it. Really. JSON lends itself to being parsed with regex even less than conformant XML does.

David Dorward
+2  A: 

It's not a complete validation, but RFC 4627, which specifies the syntax, also lists a regular expression for a minimal JSON consistency check in section 6: http://tools.ietf.org/html/rfc4627

  var my_JSON_object = !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
         text.replace(/"(\\.|[^"\\])*"/g, ''))) &&
     eval('(' + text + ')');
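
The same check can be separated from the eval step. The sketch below reuses the two regular expressions from the RFC snippet unchanged and, as an assumption that is not part of the RFC, hands the text to a native JSON.parse instead of eval. The function names are made up for illustration.

```javascript
// Hypothetical helper: the RFC 4627 section 6 character check on its own.
function passesRfc4627Check(text) {
    // Drop all string literals, then look for any character that
    // cannot occur in JSON outside of a string.
    var withoutStrings = text.replace(/"(\\.|[^"\\])*"/g, '');
    return !/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(withoutStrings);
}

// Hypothetical helper: run the cheap check first, then a real parser.
function parseIfPlausible(text) {
    if (!passesRfc4627Check(text)) {
        return undefined;
    }
    try {
        return JSON.parse(text);
    } catch (e) {
        return undefined; // passed the character check but is not valid JSON
    }
}

console.log(passesRfc4627Check('{"a": 1}'));  // true
console.log(passesRfc4627Check('alert(1)'));  // false, "(" is not allowed
console.log(passesRfc4627Check('{,,}'));      // true, although it is not JSON
```

The last line shows why this is only a consistency check: it rejects text containing characters that cannot appear in JSON, but it says nothing about structure.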
mario
+1, not a complete check for sure, but a great basic test.
jvenema
+2  A: 

You cannot use a single regular expression to describe every valid JSON string. JSON is not a regular language: objects {} and arrays [] can be nested arbitrarily, and classical regular expressions can’t describe recursive patterns (although some modern regular expression implementations can). So there is no (classical) regular expression that matches exactly the valid JSON strings.
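
To illustrate what is missing: matching nested brackets needs some form of memory, such as a stack, which a classical regular expression does not have. A minimal sketch that checks only bracket nesting (not full JSON validity); the function name bracketsBalanced is made up for illustration:

```javascript
// Hypothetical helper: checks that {} and [] are properly nested.
// It deliberately ignores everything else about JSON syntax.
function bracketsBalanced(text) {
    // Remove string literals so brackets inside strings are not counted.
    var stripped = text.replace(/"(?:[^"\\]|\\.)*"/g, '');
    var stack = [];
    var openerFor = { '}': '{', ']': '[' };
    for (var i = 0; i < stripped.length; i++) {
        var ch = stripped.charAt(i);
        if (ch === '{' || ch === '[') {
            stack.push(ch);
        } else if (ch === '}' || ch === ']') {
            // The most recently opened bracket must match this closer.
            if (stack.pop() !== openerFor[ch]) {
                return false;
            }
        }
    }
    return stack.length === 0; // nothing may be left open
}

console.log(bracketsBalanced('[[[{"a":[]}]]]')); // true
console.log(bracketsBalanced('[{]}'));           // false
console.log(bracketsBalanced('{"a":"]"}'));      // true
```

The stack can grow without bound as nesting gets deeper, which is exactly the part a finite automaton cannot track.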

But you can use a series of regular expressions to check if a string is JSON. Here’s an example using JavaScript:

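// sample input; replace with the string to check
var str = '{"foo":"bar", "baz":[0,2]}';
var re = [
    // is array or object ([\s\S] so that line breaks inside the JSON are allowed)
    /^\s*(?:\[[\s\S]*\]|\{[\s\S]*\})\s*$/,
    // strings and numbers (the fraction, the exponent and the exponent sign are optional)
    /"(?:[^"\\]|\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4}))*"|-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?/g,
    // "empty" arrays and objects without values like "[,,,]" and "{:,:,:,:}"
    /\s*(?:\[\s*(?:,\s*)*\]|\{(?:\s*:\s*(?:,\s*:\s*)*)?\})/g
];
var isArrayOrObject = re[0].test(str), tmp;
// remove strings and numbers
str = str.replace(re[1], "");
// remove "empty" arrays and objects until nothing changes any more
while ((tmp = str.replace(re[2], "")) !== str) {
    str = tmp;
}
// valid if it was an array or object and only whitespace is left
alert(isArrayOrObject && /^\s*$/.test(str));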

This checks that the whole JSON string is either an array or an object. Once strings and numbers have been removed, only “empty” arrays and objects without values remain, and these are removed repeatedly until only whitespace is left:

1. {"foo":"bar", "baz":[0,2]}
2. {:, :[,]}
3. {:, :}
4. ε
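
The reduction steps listed above can be reproduced with a small tracing function. This is only a sketch: the name reductionSteps is made up for illustration, and the number pattern here treats the fraction and exponent as optional so that plain integers such as 0 and 2 are removed.

```javascript
// Hypothetical helper: returns every intermediate string of the reduction.
function reductionSteps(str) {
    // strings and numbers
    var valueRe = /"(?:[^"\\]|\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4}))*"|-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?/g;
    // "empty" arrays and objects such as "[,,,]" and "{:,:}"
    var emptyRe = /\s*(?:\[\s*(?:,\s*)*\]|\{(?:\s*:\s*(?:,\s*:\s*)*)?\})/g;

    var steps = [str];
    var current = str.replace(valueRe, "");
    if (current !== str) {
        steps.push(current);
    }
    var next;
    // Peel off the innermost empty containers until nothing changes.
    while ((next = current.replace(emptyRe, "")) !== current) {
        current = next;
        steps.push(current);
    }
    return steps;
}

console.log(reductionSteps('{"foo":"bar", "baz":[0,2]}'));
// [ '{"foo":"bar", "baz":[0,2]}', '{:, :[,]}', '{:, :}', '' ]
```

Note that the same reduction also ends in an empty string for input such as [1,], which a strict JSON parser rejects, so this remains an approximate check.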
Gumbo