Hi

I have a GUI that interacts with a Postgres database using psycopg2. The db connection lives in a multiprocessing process; I send SQL via one multiprocessing queue and receive results via another.

The problem is that it is very slow. A simple SELECT * on a small table (30 rows) can take a tenth of a second, or over one second.

Does anyone have any clues as to why it's so slow?

New information: it works fine on WinXP with the exact same code, so the intermittent delay only happens on my Linux box (Ubuntu 9.10).

More info: having stubbed out the select, it appears the query itself is not the problem.

Here is the main part of the db class.

import multiprocessing


class DataBase(multiprocessing.Process):

    def __init__(self, conn_data, in_queue, out_queue):
        multiprocessing.Process.__init__(self)
        self.in_queue = in_queue      # GUI -> DB: (SQL, callback) jobs
        self.out_queue = out_queue    # DB -> GUI: (result, callback) replies
        self.conn_data = conn_data
        self.all_ok = True

    def run(self):
        proc_name = self.name
        self.conn = self.get_connection(self.conn_data)
        print("Running ", self.name)
        while True:
            next_job = self.in_queue.get()  # Blocks until a job arrives.
            print("Next Job: ", next_job)
            if next_job is None:
                # A None sentinel stops the process.
                break
            SQL = next_job[0]
            callback = next_job[1]
            result = self.execute(SQL)
            self.out_queue.put((result, callback))
        print("Closing connection ", self.name)
        self.conn.close()
        return

And in the GUI I have this:

def recieve_data(self):
    "Receive data from the queue.  Data is a tuple of the actual data and a callback name."
    if not self.recieve_queue.empty():
        data = self.recieve_queue.get()
        callback_name = data[1]
        try:
            callback = getattr(self, callback_name)
            callback(data[0])
        except AttributeError as e:
            util.error_ui(err=e)
        self.check_data_timeout = None
        return False  # Stop checking.
    return True  # Have the main loop keep checking for data.

def request_data(self, SQL, callback):
    self.send_queue.put((SQL, callback))
    # Poll the recieve_queue every 50 ms until data arrives.
    self.check_data_timeout = gobject.timeout_add(50, self.recieve_data)
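For completeness, the pieces are wired together roughly like this (queue names and connection parameters here are illustrative, not the exact app code):

import multiprocessing

# Illustrative setup; conn_data would hold the real psycopg2 parameters.
send_queue = multiprocessing.Queue()     # GUI -> DB process
recieve_queue = multiprocessing.Queue()  # DB process -> GUI
conn_data = {"host": "localhost", "dbname": "mydb"}  # placeholder values

db = DataBase(conn_data, send_queue, recieve_queue)
db.start()

# ... GUI main loop runs; queries go through request_data() ...

send_queue.put(None)  # None sentinel asks the DB process to shut down.
db.join()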
A: 

Try to isolate what is taking the time - is it the multiprocessing or the database? For example, try calling the database directly from the Python interactive shell - the IPython shell has 'time' and 'timeit' commands for measuring things like this. Alternatively, stub out DataBase.execute to return canned values and see what difference it makes.
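For example, a quick in-process timing harness along those lines might look like this (the table name and connection parameters are placeholders):

import timeit

import psycopg2

# Placeholder connection parameters; substitute the real ones.
conn = psycopg2.connect(host="localhost", dbname="mydb")

def run_select():
    # One cursor per query, mirroring the app's pattern.
    cur = conn.cursor()
    cur.execute("SELECT * FROM small_table")
    cur.fetchall()
    cur.close()

# Time 100 selects in-process, bypassing multiprocessing entirely.
print(timeit.timeit(run_select, number=100))
conn.close()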

What about gobject.timeout_add? What is that doing? It is possible that the delay is in there rather than the database or the multiprocessing code.

Dave Kirby
Yes, I have tried that. Just running 100 selects in a single process takes approx 0.3 seconds (including Python start-up time), as timed from the shell using "time python db_test.py". I will give the canned values a go when I return tonight (no Linux boxes at work). The timeout_add starts polling the queue once a request has been sent to the db; it's part of the GUI main loop, allowing the user to carry on interacting with the app.
Rob
A: 

Have you tried opening a new database connection for each of your processes? It seems to me that you are simply adding overhead by trying to reuse them across different processes.

Also, I'm not sure (your sample is too small to deduce), but it looks like you're opening a new DB connection for each query... Are you closing the connection with self.conn.close() after each query? You're supposed to have one long-lasting connection.
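As a sketch, the pattern I mean is one long-lived connection per process with a fresh cursor per query, along these lines (the execute method here mirrors the one referenced in the question; the details are assumed):

def execute(self, SQL):
    # One long-lived self.conn per process; a new cursor per query is cheap.
    cur = self.conn.cursor()
    cur.execute(SQL)
    result = cur.fetchall()
    cur.close()
    return result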

liori
I open a new connection for each process, but hold it open the whole time the app is running. This allows different objects to have their own individual transactions. I do create a new cursor for each query.
Rob
Try running your code under a profiler.
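For example, with the standard library's cProfile (the main() entry point here is a placeholder for whatever drives the DB process):

import cProfile
import pstats

# Profile a suspect entry point and print the most expensive calls
# sorted by cumulative time.
cProfile.run("main()", "profile.out")
pstats.Stats("profile.out").sort_stats("cumulative").print_stats(20)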
liori
A: 

It seems to be a problem or bug specific to Ubuntu 9.10.

All works fine on Ubuntu 9.04 and win32, even on win32 in a virtual machine hosted on Ubuntu 9.10.

Thanks for all the advice.

Rob