Hey guys, I've got a couple of issues with my code.
- I suspect I'm building the plot data very inefficiently, since grouping the tweets by hour takes ages.
- The DB is very simple: it contains the tweet text, created date and username. It is fed by the Twitter gardenhose.
Thanks for your help!
require 'rubygems'
require 'sequel'
require 'gnuplot'
DB = Sequel.sqlite("volcano.sqlite")
tweets = DB[:tweets]
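# Count the tweets matching keyword, bucketed by hours since the first match.
def get_values(keyword, tweets)
  my_tweets = tweets.filter(:text.like("%#{keyword}%"))
  r = Hash.new(0) # hours not seen yet start at 0
  start = my_tweets.first[:created_at]
  my_tweets.each do |t|
    hour = ((t[:created_at] - start) / 3600).round
    r[hour] += 1
  end
  # Split the sorted [hour, count] pairs into x and y arrays for gnuplot.
  x = []
  y = []
  r.sort.each do |e|
    x << e[0]
    y << e[1]
  end
  [x, y]
end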
keywords = ["iceland", "island", "vulkan", "volcano"]
values = {}
keywords.each do |k|
  values[k] = get_values(k, tweets)
end
Gnuplot.open do |gp|
  Gnuplot::Plot.new(gp) do |plot|
    plot.terminal "png"
    plot.output "volcano.png"
    plot.data = []
    values.each do |k, v|
      plot.data << Gnuplot::DataSet.new([v[0], v[1]]) { |ds|
        ds.with = "linespoints"
        ds.title = k
      }
    end
  end
end
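For reference, here is the hour grouping on its own as a minimal sketch, so it can be timed separately from the DB query. The name `hourly_counts` is made up for illustration, and it assumes the timestamps are already loaded as Ruby `Time` objects. It buckets with `round` like the script above, but takes the earliest timestamp as the start instead of whatever row the query returns first.

```ruby
# Hypothetical helper (not part of the script above): count timestamps per
# hour since the earliest one and return [hours, counts] ready for plotting.
def hourly_counts(times)
  return [[], []] if times.empty?

  start = times.min           # earliest timestamp is hour 0
  counts = Hash.new(0)        # hours not seen yet start at 0
  times.each do |t|
    hour = ((t - start) / 3600).round
    counts[hour] += 1
  end
  counts.sort.transpose       # [[h0, h1, ...], [c0, c1, ...]]
end

times = [Time.at(0), Time.at(60), Time.at(3600), Time.at(7300)]
x, y = hourly_counts(times)
puts x.inspect
puts y.inspect
```

If this part turns out to be fast, the time is probably going into the `LIKE '%keyword%'` scan and into fetching whole rows, in which case fetching only `created_at` might already help.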