Typically, one would simply require the module cgi, then use CGI::escape(str).
require 'cgi'
require 'open-uri'
escaped_page = CGI::escape("Thor_Industries,_Inc.")
url = "http://en.wikipedia.org/wiki/#{escaped_page}"
f = URI.open(url)  # Kernel#open no longer accepts URLs as of Ruby 3.0
However, this doesn't seem to work for your particular instance, and still returns a 403. I'll leave this here for reference, regardless.
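For reference, here is a small sketch of what the escaping step actually does to this title. It only uses the standard cgi library; the variable names are just for illustration. Note that the comma is percent-encoded while the dots and underscores are left alone, so the request goes to a differently spelled (but equivalent) path.

```ruby
require 'cgi'

title = "Thor_Industries,_Inc."

# CGI.escape percent-encodes characters outside its unreserved set;
# the comma becomes %2C, while letters, digits, "_" and "." are kept.
escaped = CGI.escape(title)
puts escaped

# CGI.unescape reverses the encoding and gives the original title back.
round_trip = CGI.unescape(escaped)
puts round_trip == title
```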
Edit: Wikipedia is refusing your requests because it suspects that you are a bot. It would seem that certain pages that are clearly content are granted to you, but those that don't match its "safe" pattern (e.g. those that contain dots or commas) are subject to its screening. If you actually output the content (I did this with Net::HTTP), you get the following:
Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice.
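open-uri raises on a 403 before you get to see the body, which is why the message above is easy to miss. A minimal sketch of reading it with Net::HTTP follows; fetch_status_and_body is a hypothetical helper name, not part of any library, and what the server returns today may well differ from what I saw.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: fetch a URL and return [status_code, body].
# Net::HTTP.get_response does not raise on 4xx/5xx responses, so the
# body of a refusal can still be inspected. It does not follow redirects.
def fetch_status_and_body(url)
  uri = URI.parse(url)
  response = Net::HTTP.get_response(uri)
  [response.code, response.body]
end

# Example (performs a real network request):
#   code, body = fetch_status_and_body("http://en.wikipedia.org/wiki/Thor_Industries,_Inc.")
#   puts code
#   puts body
```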
Providing a user-agent string, however, solves the issue:
URI.open("http://en.wikipedia.org/wiki/Thor_Industries,_Inc.",
         "User-Agent" => "Ruby/#{RUBY_VERSION}")
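Since the message asks for an informative User-Agent with contact information, something a bit more descriptive than the bare Ruby version is probably wiser for anything beyond a one-off. A rough sketch, assuming Ruby 2.5 or later for URI.open; the tool name, the contact address, and the default_user_agent and fetch_page helpers are all made up for illustration, so substitute your own.

```ruby
require 'open-uri'

# Hypothetical tool name and contact address; replace with your own.
def default_user_agent
  "ExampleWikiFetcher/0.1 (mailto:you@example.com) Ruby/#{RUBY_VERSION}"
end

# Hypothetical helper: returns the page body as a String, or nil if the
# server answered with an HTTP error status.
def fetch_page(url, user_agent: default_user_agent)
  URI.open(url, "User-Agent" => user_agent, &:read)
rescue OpenURI::HTTPError => e
  # e.io holds the error response, so the server's explanation is not lost.
  warn "Request failed: #{e.message}"
  warn e.io.read
  nil
end

# Example (performs a real network request):
#   html = fetch_page("http://en.wikipedia.org/wiki/Thor_Industries,_Inc.")
#   puts html.length if html
```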