I noticed that jQuery's parseJSON basically does a simple regex "check" before evaluating the string:
parseJSON: function( data ) {
    if ( typeof data !== "string" || !data ) {
        return null;
    }

    // Make sure leading/trailing whitespace is removed (IE can't handle it)
    data = jQuery.trim( data );

    // Make sure the incoming data is actual JSON
    // Logic borrowed from http://json.org/json2.js
    if ( /^[\],:{}\s]*$/.test(data.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, "@")
        .replace(/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g, "]")
        .replace(/(?:^|:|,)(?:\s*\[)+/g, "")) ) {

        // Try to use the native JSON parser first
        return window.JSON && window.JSON.parse ?
            window.JSON.parse( data ) :
            (new Function("return " + data))();

    } else {
        jQuery.error( "Invalid JSON: " + data );
    }
},
If the data passes that "check" and the browser has native JSON support, the native parser is used. Otherwise, in an older browser like IE6, the string is compiled into a new Function that is immediately invoked and returns the object.
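For concreteness, here is a minimal sketch of the two paths on the same input (my own illustration, not from the jQuery source):

    var data = '{"name": "value", "list": [1, 2, 3]}';

    // Modern browsers: the native parser, which only accepts valid JSON
    var viaNative = window.JSON.parse(data);

    // Legacy fallback: the string is compiled into a function body and
    // executed, so whatever passed the regex check runs as real JavaScript
    var viaFunction = (new Function("return " + data))();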
Question #1: Since this is just a simple regex test, isn't it prone to some obscure edge-case exploit? Shouldn't we really be using a full-blown parser, at least for the browsers that don't support native JSON parsing?
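To make the "check" concrete, here is a hand-traced run of the three replace passes on a sample string (my own trace, not part of the jQuery source):

    var data = '{"a": "b\\n", "c": [1, -2.5e3, true]}';

    // Pass 1: collapse escape sequences so they can't hide a closing quote
    var s1 = data.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, "@");
    // -> '{"a": "b@", "c": [1, -2.5e3, true]}'

    // Pass 2: collapse strings, numbers, and literals into "]"
    var s2 = s1.replace(/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g, "]");
    // -> '{]: ], ]: [], ], ]]}'

    // Pass 3: drop "[" runs that follow ":", ",", or the start of input
    var s3 = s2.replace(/(?:^|:|,)(?:\s*\[)+/g, "");
    // -> '{]: ], ]], ], ]]}'

    // Final test: only ] , : { } and whitespace may remain
    /^[\],:{}\s]*$/.test(s3); // true -> the string is handed to the evaluator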
Question #2: How much "safer" is (new Function("return " + data))() as opposed to eval("(" + text + ")")?
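For reference, here is an illustrative side-by-side of the two forms (my own sketch, not from the jQuery source):

    var text = '{"key": "value"}';

    // eval needs the wrapping parentheses so the leading "{" is parsed
    // as an object literal instead of a block statement
    var viaEval = eval("(" + text + ")");

    // new Function compiles the string as a function body; "return" puts
    // the object literal in expression position, so no parentheses needed
    var viaFunction = (new Function("return " + text))();

One difference I'm aware of: a function created with the Function constructor only sees the global scope, whereas eval runs in the calling scope and can read or clobber local variables. Either way, both execute arbitrary JavaScript rather than merely parsing JSON.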