I'm having trouble diagnosing a problem on my Ubuntu Scalr/EC2 production environment.
The trouble is that, apparently at random, database queries and/or Memcache queries take MUCH longer than they should. I've seen a simple select statement take 130ms or a Memcache fetch take 65ms! It can happen a handful of times per request, causing some requests to take twice as long as they should.
To diagnose the problem, I wrote a very simple script that just connects to the MySQL server and runs a query.
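    require 'mysql'

    mysql = Mysql.init
    mysql.real_connect('', '', '', '')  # host, user, password, database

    # Run the same query 100 times and keep the slowest time (in seconds).
    max = 0
    100.times do
      start = Time.now
      mysql.query('select * from navigables limit 1')
      stop = Time.now
      total = stop - start
      max = total if total > max
    end

    puts "Max Time: #{max * 1000}"  # slowest query, in milliseconds
    mysql.close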
This script consistently returned a really high max time, so I ruled out Rails as the source of the problem. I also wrote the same thing in Python to rule out Ruby, and indeed the Python version took inordinate amounts of time as well!
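Since the max alone doesn't show how often the spikes happen, a variation of the script could record every sample and summarize the distribution. This is only a rough sketch: `time_calls`, `summarize`, and the 50 ms `slow_ms` threshold are names and values made up for illustration, it assumes Ruby 2.1+ for `Process.clock_gettime`, and the `1 + 1` block is a stand-in for the real `mysql.query` call.

```ruby
# Time a block n times with a monotonic clock; returns durations in milliseconds.
def time_calls(n)
  Array.new(n) do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
  end
end

# Summarize samples (ms): nearest-rank percentiles plus a count of slow outliers.
def summarize(samples_ms, slow_ms: 50.0)
  sorted = samples_ms.sort
  pct = lambda do |p|
    rank = (p / 100.0 * sorted.length).ceil
    sorted[[rank, 1].max - 1]
  end
  {
    min: sorted.first,
    median: pct.call(50),
    p95: pct.call(95),
    max: sorted.last,
    slow: sorted.count { |ms| ms > slow_ms }
  }
end

# Stand-in workload; in the real script the block would be the mysql.query call.
samples = time_calls(100) { 1 + 1 }
puts summarize(samples)
```

A monotonic clock is used instead of `Time.now` so that a wall-clock adjustment on the box can't show up as a fake spike. Comparing `median` against `max` and `slow` would show whether it's a few isolated outliers or a general slowdown.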
Both MySQL and Memcache are on their own boxes, so I considered network latency, but the output of ping and traceroute looks normal.
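One caveat is that ping measures ICMP round trips, not TCP connections to the actual service port, so a complementary check might be to time the TCP handshake itself. The following is a minimal sketch under some assumptions: `tcp_connect_ms` is a hypothetical helper, `db.internal` is a placeholder host, and 3306 is MySQL's default port (it may differ in this setup). Note that the timing includes name resolution when a hostname is passed, so using an IP address would isolate the connect time.

```ruby
require 'socket'

# Time one TCP connect (handshake only) to host:port; returns milliseconds.
def tcp_connect_ms(host, port, timeout: 5)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  # Socket.tcp closes the socket automatically when the block returns.
  Socket.tcp(host, port, connect_timeout: timeout) { |_sock| }
  (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
end

# Hypothetical usage; 'db.internal' is a placeholder, 3306 is MySQL's default port:
#   samples = Array.new(100) { tcp_connect_ms('db.internal', 3306) }
#   puts "max: #{samples.max.round(2)} ms"
```

If these connect times spike the same way the queries do, that would point at the network or host level rather than at MySQL or Memcache themselves.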
Also, running the queries/fetches locally on the respective machines returns the expected times, and I'm running the same gem versions on my staging machine without this issue.
I'm really stumped on this one... any thoughts on something I could try to diagnose this? Thanks