After adding some other weird error cases to your input,
{ "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
"type": {"key": "/type/author"},
"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
"key": "/authors/OL2108538A",
"revision": 1,
"has \" escaped quote": 1,
"has \" escaped quotes \"": 1,
"has multiple " internal " quotes": 1,
}
this Perl program, which corrects unescaped internal double quotes using the heuristic that a string's actual closing quote is followed by optional whitespace and then a colon, comma, semicolon, or closing curly brace,
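#! /usr/bin/perl -p
# Lazily match from an opening quote to the first quote followed by a terminator.
s<"(.+?)"(\s*[:,;}])> {
  my($text,$terminator) = ($1,$2);
  # Turn unescaped inner double quotes into single quotes.
  $text =~ s/(?<!\\)"/'/g; # " oh, the irony!
  qq["$text"] . $terminator;
}eg;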
produces the following output:
$ ./fixdqs input.json
{ "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
"type": {"key": "/type/author"},
"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico's Economy.",
"key": "/authors/OL2108538A",
"revision": 1,
"has \" escaped quote": 1,
"has \" escaped quotes \"": 1,
"has multiple ' internal ' quotes": 1,
}
Delta from input to output:
$ diff -ub input.json <(./fixdqs input.json)
--- input.json
+++ /dev/fd/63
@@ -1,9 +1,9 @@
{ "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
"type": {"key": "/type/author"},
- "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
+ "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico's Economy.",
"key": "/authors/OL2108538A",
"revision": 1,
"has \" escaped quote": 1,
"has \" escaped quotes \"": 1,
- "has multiple " internal " quotes": 1,
+ "has multiple ' internal ' quotes": 1,
}
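If Perl is not an option, the same heuristic can probably be ported to other regex engines. Here is a rough Python sketch of it; the names `fix_line`, `QUOTED`, and `UNESCAPED_QUOTE` are my own and purely illustrative, and like the Perl version it is meant to be applied one line at a time.

```python
import re

# Heuristic: a string's real closing quote is followed by optional
# whitespace and one of  :  ,  ;  }
QUOTED = re.compile(r'"(.+?)"(\s*[:,;}])')

# A double quote that is not preceded by a backslash.
UNESCAPED_QUOTE = re.compile(r'(?<!\\)"')


def fix_line(line: str) -> str:
    """Replace unescaped inner double quotes with single quotes."""
    def repair(match: re.Match) -> str:
        text, terminator = match.group(1), match.group(2)
        # Only the inside of the string is rewritten; the outer quotes
        # and the terminator are put back unchanged.
        return '"' + UNESCAPED_QUOTE.sub("'", text) + '"' + terminator

    return QUOTED.sub(repair, line)


# Example with one of the broken lines from the input above.
print(fix_line('"has multiple " internal " quotes": 1,'))
```

As with the Perl program, this only repairs the quoting. It is still a guess about where each string ends, so a value that legitimately contains a quote followed by a comma or colon would be split in the wrong place.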