I have recently been exploring the Tornado web framework to serve a large number of persistent connections from many different clients.

I have a request handler that basically takes an RSA encrypted string and decrypts it. The decrypted text is an XML string that gets parsed by a SAX document handler that I have written. Everything works perfectly fine and the execution time (per HTTP request) was roughly 100 milliseconds (with decryption and parsing).

The XML contains the Username and Password hash of the user. I want to connect to a MySQL server to verify that the username matches the password hash supplied by the application.

When I add basically the following code:

conn = MySQLdb.connect(host="192.168.1.12",
                       user="<useraccount>",
                       passwd="<Password>",
                       db="<dbname>")
cursor = conn.cursor()

safe_username = MySQLdb.escape_string(XMLLoginMessage.username)
safe_pass_hash = MySQLdb.escape_string(XMLLoginMessage.pass_hash)

sql = "SELECT * FROM `mrad`.`users` WHERE `username` = '" + safe_username + "' AND `password` = '" + safe_pass_hash + "' LIMIT 1;"

cursor.execute(sql)

cursor.close()
conn.close()

The time it takes to execute the HTTP request shoots up to 4 to 5 seconds! I believe this time is spent connecting to the MySQL database server itself.

My question is how can I speed this up? Can I declare the MySQL connection in the global scope and access it in the request handlers by creating a new cursor, or will that run into concurrency issues because of the asynchronous design of Tornado?

Basically, how can I avoid incurring a new connection to the MySQL server on every HTTP request, so each request takes a fraction of a second instead of several seconds?

Also, please note that the SQL server is actually on the same physical machine as the Tornado web server instance.

Update

I just ran a simple MySQL query (the same code as above) through a profiler.

The call to the __init__ function in 'connections.py' took 4.944 seconds to execute on its own. That doesn't seem right, does it?

Update 2

I think that running with one connection (or even a few with a very simple DB conn pool) will be fast enough to handle the throughput I'm expecting per tornado web server instance.

If 1,000 clients need to run a query, with typical query times in the thousandths of a second, the unluckiest client would only have to wait about one second to retrieve the data.
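That estimate can be written out explicitly; both numbers below are the assumptions from the paragraph above, not measurements:

```python
clients = 1000          # concurrent clients queued on the shared connection (assumed)
query_time = 0.001      # seconds per query, i.e. about a millisecond (assumed)

# With one connection serving requests in turn, the unluckiest client
# waits behind everyone else: roughly one second in total.
worst_case_wait = clients * query_time
```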

+1  A: 

An SQL connection should not take 5 seconds. Try connecting without issuing a query and see if that improves your performance, which it should.

The MySQLdb module has a threadsafety level of 1, which means the module itself is thread safe, but connections cannot be shared between threads. You can implement a connection pool as an alternative.
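A minimal pool along those lines might look like this. The class is deliberately generic: `factory` stands in for a call like `MySQLdb.connect(host="192.168.1.12", user=..., passwd=..., db=...)` with the question's credentials, and only the pooling logic is shown.

```python
import queue

class ConnectionPool:
    """A minimal thread-safe connection pool (sketch).

    `factory` is any zero-argument callable that returns a new
    connection, e.g. a lambda wrapping MySQLdb.connect(...).
    """

    def __init__(self, factory, size=5):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())   # open all connections up front

    def acquire(self, timeout=None):
        # Blocks until a connection is free, so at most `size`
        # queries ever run concurrently.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        # Return the connection for the next request to reuse.
        self._pool.put(conn)
```

A handler would `acquire()` a connection, run its query, and `release()` it in a `finally` block so the connection goes back to the pool even if the query raises.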

Lastly, the DB-API has a parameter substitution form for queries, which avoids concatenating the query by hand and escaping the parameters yourself (MySQLdb's paramstyle uses %s placeholders):

cur.execute("SELECT * FROM blach WHERE x = %s AND y = %s", (x, y))
Yann Ramin
Well, I did not know about the auto-parameterization stuff; that'll be really helpful. I'll look into removing the query tomorrow and see where that leaves me. I think I'm going to end up writing a connection pool, just because there's the possibility that calls to the DB will block other threads. It'll be in testing whether I decide to do that or not.
Crowe T. Robot
A: 

Consider SQLAlchemy, which provides a nicer abstraction over DBAPI and also provides connection pooling, etc. (You can happily ignore its ORM and just use the SQL-toolkit)
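A sketch of that approach using only the SQL toolkit. An in-memory SQLite URL is used here so the snippet runs on its own; for the setup in the question the URL would look something like `mysql://user:password@192.168.1.12/dbname` (credentials assumed), and you could pass pooling arguments such as `pool_size` and `pool_recycle` to `create_engine()`.

```python
from sqlalchemy import create_engine, text

# create_engine() manages a connection pool internally; connect()
# borrows a pooled connection and the `with` block returns it.
engine = create_engine("sqlite://")  # mysql://... in the real setup

with engine.connect() as conn:
    conn.execute(text("CREATE TABLE users (username TEXT, password TEXT)"))
    conn.execute(text("INSERT INTO users VALUES ('alice', 'hash')"))
    # Bound parameters instead of string concatenation:
    row = conn.execute(
        text("SELECT * FROM users WHERE username = :u AND password = :p"),
        {"u": "alice", "p": "hash"},
    ).fetchone()
```

Because the engine, not the request handler, owns the connections, the expensive connect happens once per pooled connection rather than once per HTTP request.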

(Also, you're not doing blocking database calls in the asynchronous request handlers, are you?)

Alex Brasetvik
I know I'm not doing the blocking calls in the above code, if that's what you're asking; I just didn't include any in the example. Eventually there will be, but the connection time is what's getting me. I believe I'm going to give SQLAlchemy a try, so thanks for your answer!
Crowe T. Robot
A: 

Declare it in the base handler; it will be created once per application.
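A sketch of that pattern: the base handler caches one connection on the class, so it is created on first use and shared by every handler afterwards. In Tornado the class would also inherit `tornado.web.RequestHandler`, and `make_connection` would call `MySQLdb.connect` with the question's credentials; both are left out here so the sketch stands alone.

```python
class BaseHandler(object):
    """All request handlers inherit from this."""

    _db = None  # shared across every handler instance

    @property
    def db(self):
        # Connect lazily, the first time any handler needs the DB,
        # then reuse the same connection for later requests.
        if BaseHandler._db is None:
            BaseHandler._db = self.make_connection()
        return BaseHandler._db

    def make_connection(self):
        # In the question's setup this would be:
        #   return MySQLdb.connect(host="192.168.1.12", user=..., ...)
        raise NotImplementedError
```

Note that with a single shared connection, a slow query in one handler delays every other request; that is the trade-off the question's Update 2 accepts.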

Rick