I'm implementing a Python ontology class that uses a database backend to store and query the ontology. The database schema is fixed (specified in advance), but I don't know which database engine will be used. I can, however, rely on the engine's Python interface implementing the Python DB-API 2.0 (PEP 249). A straightforward idea is to let the user pass a PEP 249-compliant Connection object to the constructor of my ontology, which then runs various hardcoded SQL queries against the database:

class Ontology(object):
    def __init__(self, connection):
        # Any PEP 249-compliant Connection object
        self.connection = connection

    def get_term(self, term_id):
        cursor = self.connection.cursor()
        # %s is the 'format' paramstyle; other backends may expect ? or :1
        query = "SELECT * FROM term WHERE id = %s"
        cursor.execute(query, (term_id, ))
        [...]

My problem is that different database backends may support different parameter markers in queries, as declared by the paramstyle attribute of the backend module. For instance, paramstyle = 'qmark' means the question mark style (SELECT * FROM term WHERE id = ?); paramstyle = 'numeric' means the numeric, positional style (SELECT * FROM term WHERE id = :1); and paramstyle = 'format' means the ANSI C printf format style (SELECT * FROM term WHERE id = %s). If I want my class to handle different database backends, it seems I have to prepare for every parameter marker style. To me this defeats the whole purpose of a common DB API, since I can't use the same parameterised query with different backends.
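To make the problem concrete, here is a rough sketch of what "preparing for all styles" would look like for a single query. The table `QUERY_BY_PARAMSTYLE`, the helper `query_for`, and the parameter name `id` used for the 'named' and 'pyformat' variants are made up for illustration; `sqlite3` is only used as a convenient standard-library DB-API module.

```python
import sqlite3  # standard-library DB-API module, used only as an example

# The same lookup written once per paramstyle listed in PEP 249.
# The parameter name "id" in the named/pyformat variants is an assumption.
QUERY_BY_PARAMSTYLE = {
    "qmark": "SELECT * FROM term WHERE id = ?",
    "numeric": "SELECT * FROM term WHERE id = :1",
    "named": "SELECT * FROM term WHERE id = :id",
    "format": "SELECT * FROM term WHERE id = %s",
    "pyformat": "SELECT * FROM term WHERE id = %(id)s",
}


def query_for(module):
    """Pick the variant matching a DB-API module's paramstyle attribute."""
    return QUERY_BY_PARAMSTYLE[module.paramstyle]


# Show which variant the example module would need.
print(sqlite3.paramstyle, "->", query_for(sqlite3))
```

Maintaining such a table for every query in the class is exactly the duplication I would like to avoid.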

Is there a way around it, and if so, what is the best approach? The DB API does not specify the existence of a generic escaping function with which I can sanitize my values in the query, so doing the escaping manually is not an option. I don't want to add an extra dependency to the project either by using an even higher level of abstraction (SQLAlchemy, for instance).

+1  A: 

Strictly speaking, the problem is not caused by the DB API allowing this, but by the different databases which use different SQL syntaxes. The DB API module passes the exact query string to the database, along with the parameters. "Resolving" the parameter markers is done by the database itself, not by the DB API module.

That means that if you want to solve this, you have to introduce some higher level of abstraction. If you do not want to add extra dependencies, you will have to write it yourself. But rather than manually escaping and substituting values, you could dynamically replace the parameter markers in the query string with the markers the backend expects, based on the paramstyle of the backend module. Then pass the string, with its parameter markers intact, to the database along with the parameters. For example, you could use '%s' everywhere and use Python string substitution to replace each '%s' with ':1', ':2', etc. if the database uses the 'numeric' style, and so on.
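A minimal sketch of that marker substitution, assuming all queries are written in the 'format' style (`%s` markers, `%%` for a literal percent sign). The helper name `adapt_query` and the generated parameter names `p1`, `p2`, ... are hypothetical. The regular expression does not parse SQL, so a `%s` inside a quoted string literal would be rewritten as well.

```python
import itertools
import re

# Matches an escaped percent sign ("%%") or a "%s" parameter marker.
_MARKER = re.compile(r"%%|%s")

# Marker template per target paramstyle; {n} is the 1-based position.
# The "p" prefix for named parameters is an arbitrary choice.
_TEMPLATES = {
    "qmark": "?",
    "numeric": ":{n}",
    "named": ":p{n}",
    "pyformat": "%(p{n})s",
}


def adapt_query(query, params, paramstyle):
    """Rewrite a 'format'-style query for the given paramstyle.

    Returns a (query, params) pair to hand to cursor.execute().
    """
    params = tuple(params)
    if paramstyle == "format":
        # Already in the expected style; nothing to rewrite.
        return query, params
    if paramstyle not in _TEMPLATES:
        raise ValueError("unsupported paramstyle: %r" % (paramstyle,))

    template = _TEMPLATES[paramstyle]
    numbers = itertools.count(1)

    def replace(match):
        if match.group(0) == "%%":
            # Only the printf-like styles need the percent sign doubled.
            return "%%" if paramstyle == "pyformat" else "%"
        return template.format(n=next(numbers))

    new_query = _MARKER.sub(replace, query)
    if paramstyle in ("named", "pyformat"):
        # Named styles take a mapping instead of a sequence.
        named = {"p%d" % i: value for i, value in enumerate(params, 1)}
        return new_query, named
    return new_query, params
```

Since a PEP 249 Connection is not required to expose the module it came from, the caller would have to supply the paramstyle (for example `sqlite3.paramstyle`) alongside the connection, and each method would call something like `cursor.execute(*adapt_query(query, params, paramstyle))`.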

Steven
Hmm, I'm not 100% sure that the DB API module passes the exact query string to the database; for instance, `BaseCursor.execute` in the `MySQLdb` module uses `query = query % db.literal(args)` to format the query string explicitly before sending it to the DB engine. This might not be true for other DB engines, though. Anyway, I'm also leaning towards replacing `%s` with the other marker styles explicitly, on the fly, but I was wondering whether there's a simpler solution. If no one comes up with anything else, I'll happily accept your answer.
Tamás