What is the correct way to parse a string using regular expressions in a linux shell script? I wrote the following script to print my SO rep on the console using curl
and sed
(not solely because I'm rep-crazy - I'm trying to learn some shell scripting and regex before switching to linux).
json=$(curl -s http://stackoverflow.com/users/flair/165297.json)
echo $json | sed 's/.*"reputation":"\([0-9,]\{1,\}\)".*/\1/' | sed s/,//
But somehow I feel that sed
is not the proper tool to use here. I heard that grep
is all about regex and explored it a bit. But apparently it prints the whole line whenever a match is found - I am trying to extract a number from a single line of text. Here is a downsized version of the string that I'm working on (returned by curl
).
{"displayName":"Amarghosh","reputation":"2,737","badgeHtml":"\u003cspan title=\"1 silver badge\"\u003e\u003cspan class=\"badge2\"\u003e●\u003c/span\u003e\u003cspan class=\"badgecount\"\u003e1\u003c/span\u003e\u003c/span\u003e"}
I guess my questions are:
- What is the correct way to parse a string using regular expressions in a linux shell script?
- Is
sed
the right thing to use here? - Could this be done using
grep
? - Is there any other command that's more easier/appropriate?