ansaurus

Question

Extracting the content attribute of a meta tag, given a specified value for its name attribute, with Nokogiri in Ruby?

Answer 1

+1 A:

The problem is not with the XPath: it seems the document is not being parsed completely. You can check this with puts doc; the output does not contain the full input. It looks like a problem with parsing comments (I suspect either invalid HTML or a bug in libxml2).

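In your case I would use a regular expression as a workaround. <meta> tags are simple enough that this might work, e.g. /<meta name="([^"]*)" content="([^"]*)"/ (note that this only matches tags where name comes directly before content and both values are double-quoted).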
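A minimal sketch of how that regular expression could be applied, using only the Ruby standard library. The helper name meta_contents and the sample HTML are made up for illustration and are not part of Nokogiri or the original question; the pattern is the one given above, with the same limitations (name must come directly before content, double quotes only).

```ruby
# Pattern from the answer: captures the name and content values of a <meta> tag.
# Assumption: name comes directly before content and both are double-quoted.
META_PATTERN = /<meta name="([^"]*)" content="([^"]*)"/

# Hypothetical helper (not a Nokogiri API): returns a Hash of name => content.
def meta_contents(html)
  # String#scan returns one [name, content] pair per match.
  html.scan(META_PATTERN).to_h
end

# Hypothetical sample input, standing in for the downloaded page.
sample_html = <<~HTML
  <html><head>
  <meta name="description" content="A sample page">
  <meta name="keywords" content="ruby, nokogiri">
  </head><body></body></html>
HTML

metas = meta_contents(sample_html)
puts metas["description"]  # prints "A sample page"
```

Looking up a name that does not occur in the page returns nil, so the caller can tell a missing tag apart from an empty content value.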

Adrian 2010-01-05 02:00:32

Answer 2

A:

You should change

doc = Nokogiri::HTML(open(url))

to

doc = Nokogiri::HTML(open(url).read)

Update: or maybe not :) Actually, your code works for me, using Ruby 1.8.7 / Nokogiri 1.4.0.

mykhal 2010-01-05 16:24:46
