I wrote a small Perl script to extract all the values for a given key name from a JSON-formatted string (shown below). For example, if I set the script's command-line switch to id, it returns 1, 2, and stringVal for the JSON example below. The script does the job, but I would like to see how others would solve the same problem with other Unix-style tools such as awk, sed, or Perl itself. Thanks.

{
   "id":"1",
   "key2":"blah"
},
{
   "id":"2",
   "key9":"more blah"
},
{
   "id":"stringVal",
   "anotherKey":"even more blah"
}

Excerpt of the Perl script that extracts the JSON values:

my @values;
while(<STDIN>) {
    chomp;
    s/\s+//g; # Remove all whitespace
    s/"//g; # Remove quotes
    push @values, /$opt_s:(\w+),?/g; # $opt_s holds the key name given on the command line
}

print join("\n",@values);
A: 

If you don't mind seeing the quote and colon characters, I would simply use grep:

grep id file.json

Tim Henigan
+2  A: 

gawk

gawk 'BEGIN{
 FS=":"
 printf "Enter key name: "
 getline key < "-"
}
$0~key{
  k=$2; getline ; v = $2
  gsub("\"","",k)
  gsub("\"","",v)
  print k,v
}' file

output

$ ./shell.sh
Enter key name: id
1, blah
2, more blah
stringVal, even more blah

If you just want the id value,

$ key="id"
$ awk -v key="$key" -F":" '$0~key{gsub("\042|,","",$2);print $2}' file
1
2
stringVal
ghostdog74
Nice 2nd example. What does the gsub("\042|,","",$2) do exactly?
Steve
It does a global substitution that removes every double quote and comma from $2. \042 is the octal escape for the double-quote character (see an ASCII table).
ghostdog74
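To make the gsub explanation above concrete, here is a minimal sketch that applies the same substitution to a single line. The function name strip_value and the sample line are made up for illustration; only the gsub call comes from the answer.

```shell
# strip_value (hypothetical name): split stdin on ':' and print field 2
# with every double quote (octal \042) and comma removed.
strip_value() {
  awk -F: '{ gsub("\042|,", "", $2); print $2 }'
}

# A sample line shaped like the JSON in the question.
printf '%s\n' '   "id":"stringVal",' | strip_value
```

Because the pattern is given as a string, awk turns \042 into a literal double quote before using it as a regular expression, which avoids having to escape a quote inside the shell's single-quoted script.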
+8  A: 

use JSON;

jheddings
+2  A: 

Here is a very rough Awk script to accomplish the task:

awk -v k=id -F: '/{|}/{next}{gsub(/^ +|,$/,"");gsub(/"/,"");if($1==k)print $2}' data
  • -F: specifies ':' as the field separator.
  • -v k=id sets the key you're searching for.
  • Lines containing '{' or '}' are skipped.
  • The first gsub gets rid of leading spaces and a trailing comma.
  • The second gsub gets rid of double quotes.
  • Finally, if $1 equals k, $2 is printed.

data is the file containing your JSON.

iWerner
Great solution. I was hoping for a pure awk answer, as I'm trying to get better at using awk. Thanks.
Steve
@steve, note that this result and your Perl result are different.
ghostdog74
Just be careful: this will fail as soon as there are empty lines in the file or the format changes in the slightest
iWerner
Good point. It doesn't return stringVal.
Steve
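The fragility mentioned in the comments above can be seen with a small experiment. This is only a sketch: extract_by_line is a hypothetical wrapper around the one-liner from this answer, and it uses the bracket expression [{}] in place of {|} on the assumption that this is more portable across awk implementations.

```shell
# extract_by_line KEY (hypothetical name): the line-oriented approach,
# reading JSON text from stdin and printing the values found for KEY.
extract_by_line() {
  awk -v k="$1" -F: '/[{}]/{next}{gsub(/^ +|,$/,"");gsub(/"/,"");if($1==k)print $2}'
}

# One key per line, as in the question: the value is found.
printf '{\n   "id":"1",\n   "key2":"blah"\n}\n' | extract_by_line id

# The same object on a single line: the whole line contains a brace,
# so it is skipped and nothing is printed.
printf '%s\n' '{"id":"1","key2":"blah"}' | extract_by_line id
```

Both inputs are the same JSON value, which is why a real parser is the safer choice when the layout of the input is not under your control.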
+1  A: 

sed (provided that the file is formatted as above, with no more than one entry per line):

KEY=id;cat file|sed -n "s/^[[:space:]]*\"$KEY\":\"//p"|sed 's/".*$//'
catwalk
No need for cat. Pass the file directly to the first sed.
ghostdog74
Sure, it isn't needed. It is just more convenient to have the source file name near the beginning of the script rather than buried in the middle of a one-liner.
catwalk
+5  A: 

I would strongly suggest using the JSON module. It parses your JSON input with a single function call (and can encode it back again). It also offers an OOP interface.

Robert Mah
+1  A: 

Why are you parsing the string yourself when there are libraries to do this for you? json.org has JSON parsing and encoding libraries for practically every language you can think of (and probably a few that you haven't). In Perl:

use strict;
use warnings;
use JSON qw(from_json to_json);

# enable slurp mode
local $/;

my $string = <DATA>;
my $data = from_json($string);

use Data::Dumper;
print "the data was parsed as: " . Dumper($data);

__DATA__
[
    {
       "id":"1",
       "key2":"blah"
    },
    {
       "id":"2",
       "key9":"more blah"
    },
    {
       "id":"stringVal",
       "anotherKey":"even more blah"
    }
]

This produces the following output (I added a top-level array around the data so that it would parse as a single JSON value):

the data was parsed as: $VAR1 = [
          {
            'key2' => 'blah',
            'id' => '1'
          },
          {
            'key9' => 'more blah',
            'id' => '2'
          },
          {
            'anotherKey' => 'even more blah',
            'id' => 'stringVal'
          }
        ];
Ether