I am pulling recent commits from GitHub and trying to parse them with Ruby. I know I can parse the response manually, but I wanted to see whether there is a package that could turn it into a hash or another data structure.

commits: 
- parents: 
  - id: 202fb79e8686ee127fe49497c979cfc9c9d985d5
  author:
    name: This guy
    login: tguy
    email: [email protected]
  url: a url
  id: e466354edb31f243899051e2119f4ce72bafd5f3
  committed_date: "2010-07-19T13:44:43-07:00"
  authored_date: "2010-07-19T13:33:26-07:00"
  message: |-
    message
- parents: 
  - id: c3c349ec3e9a3990cac4d256c308b18fd35d9606
  author: 
    name: Other Guy
    login: oguy
    email: [email protected]
  url: another url
  id: 202fb79e8686ee127fe49497c979cfc9c9d985d5
  committed_date: "2010-07-19T13:44:11-07:00"
  authored_date: "2010-07-19T13:44:11-07:00"
  message: this is another message
+5  A: 

This is YAML: http://ruby-doc.org/core/classes/YAML.html. Put a require 'yaml' at the top of your file (it's in the standard library), then do something like obj = YAML::load yaml_string and access the result like a nested hash.
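As a minimal sketch of that idea: the document below is a made-up, trimmed-down sample in the same shape as the question's data, and the names yaml_string, obj, and first_commit are only illustrative.

```ruby
require 'yaml'

# A made-up sample shaped like the question's data (not real API output).
yaml_string = <<~YAML
  commits:
  - id: abc123
    author:
      name: Example Author
      login: example
    message: first commit
YAML

obj = YAML.load(yaml_string)

# YAML mappings become Hashes with String keys; sequences become Arrays.
first_commit = obj['commits'][0]
puts first_commit['id']
puts first_commit['author']['name']
```

Note that the keys are strings, not symbols, so obj[:commits] would return nil here.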

YAML is basically used in the Ruby world the way people use XML in the Java/C# worlds.

Matt Briggs
A: 

Although this isn't exactly what you're looking for, here's some more info on pulling commits: http://develop.github.com/p/commits.html. Otherwise, I think you may just need to parse it manually.

sudo work
+2  A: 

This format is YAML, but you can get the same information in XML or JSON; see General API Information. I'm sure there are libraries to parse those formats in Ruby.
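For the JSON route, a rough sketch using the json library that ships with Ruby might look like this. The JSON text is a hypothetical sample with the same fields as the YAML in the question; the real API response may be shaped differently.

```ruby
require 'json'

# Hypothetical JSON mirroring the question's YAML fields (an assumption,
# not a captured API response).
json_string = <<~JSON
  {"commits": [
    {"id": "abc123",
     "author": {"name": "Example Author", "login": "example"},
     "message": "first commit"}
  ]}
JSON

data = JSON.parse(json_string)

# JSON objects become Hashes with String keys; arrays become Arrays.
puts data['commits'][0]['author']['name']
```

The resulting structure is the same kind of nested hash and array that the YAML parser produces, so the choice of format mostly comes down to preference.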

svick
Nokogiri parses XML: http://nokogiri.org/
Jesse J
+4  A: 

Looks like YAML to me. There are parsers for a lot of languages. For example, with the YAML library included with Ruby:

require 'yaml'
require 'pp'

data = <<HERE
commits: 
- parents: 
  - id: 202fb79e8686ee127fe49497c979cfc9c9d985d5
  author:
    name: This guy
    login: tguy
    email: [email protected]
  url: a url
  id: e466354edb31f243899051e2119f4ce72bafd5f3
  committed_date: "2010-07-19T13:44:43-07:00"
  authored_date: "2010-07-19T13:33:26-07:00"
  message: |-
    message
- parents: 
  - id: c3c349ec3e9a3990cac4d256c308b18fd35d9606
  author: 
    name: Other Guy
    login: oguy
    email: [email protected]
  url: another url
  id: 202fb79e8686ee127fe49497c979cfc9c9d985d5
  committed_date: "2010-07-19T13:44:11-07:00"
  authored_date: "2010-07-19T13:44:11-07:00"
  message: this is another message
HERE

pp YAML.load data

It prints:

{"commits"=>
  [{"author"=>{"name"=>"This guy", "login"=>"tguy", "email"=>"[email protected]"},
    "parents"=>[{"id"=>"202fb79e8686ee127fe49497c979cfc9c9d985d5"}],
    "url"=>"a url",
    "id"=>"e466354edb31f243899051e2119f4ce72bafd5f3",
    "committed_date"=>"2010-07-19T13:44:43-07:00",
    "authored_date"=>"2010-07-19T13:33:26-07:00",
    "message"=>"message"},
   {"author"=>
     {"name"=>"Other Guy", "login"=>"oguy", "email"=>"[email protected]"},
    "parents"=>[{"id"=>"c3c349ec3e9a3990cac4d256c308b18fd35d9606"}],
    "url"=>"another url",
    "id"=>"202fb79e8686ee127fe49497c979cfc9c9d985d5",
    "committed_date"=>"2010-07-19T13:44:11-07:00",
    "authored_date"=>"2010-07-19T13:44:11-07:00",
    "message"=>"this is another message"}]}
Chuck
This was great. My only question now is how I would print "This guy". How would I access that if I had assigned morestuff = YAML.load data?
Tom
@Tom: You'd just have to go through the tree to the value you want: `morestuff['commits'][0]['author']['name']` (and for "Other Guy" it would be `morestuff['commits'][1]['author']['name']`)
Chuck
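Going one step beyond indexing a single commit, here is a small sketch that walks every commit instead of hard-coding [0] and [1]. It uses a shortened copy of the question's data; the summaries variable and the output format are just illustrative choices.

```ruby
require 'yaml'
require 'time'

# Shortened copy of the question's data: two commits, a few fields each.
data = <<~YAML
  commits:
  - author:
      name: This guy
    id: e466354edb31f243899051e2119f4ce72bafd5f3
    committed_date: "2010-07-19T13:44:43-07:00"
  - author:
      name: Other Guy
    id: 202fb79e8686ee127fe49497c979cfc9c9d985d5
    committed_date: "2010-07-19T13:44:11-07:00"
YAML

morestuff = YAML.load(data)

summaries = morestuff['commits'].map do |commit|
  # The dates are quoted in the YAML, so they load as Strings;
  # Time.iso8601 turns them into Time objects.
  committed_at = Time.iso8601(commit['committed_date'])
  "#{commit['id'][0, 7]} #{commit['author']['name']} (#{committed_at.year})"
end

puts summaries
```

Because the parsed result is ordinary hashes and arrays, the usual Enumerable methods (each, map, select) apply directly.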