I'm trying to optimize some Python code. The profiler tells me that SQLAlchemy's _get_col() is what's killing performance. The code looks something like this:
lots_of_rows = get_lots_of_rows()
for row in lots_of_rows:
    if row.x == row.y:
        print row.z
I was about to go through the code and make it more like this...
lots_of_rows = get_lots_of_rows()
for row in lots_of_rows:
    if row[0] == row[1]:
        print row[2]
...but I've found some documentation that seems to indicate that when you index into a row object array-style, you're still doing a dictionary-key lookup under the hood. In other words, the row object looks something like this:
'x': (x object)
'0': (x object)
'y': (y object)
'1': (y object)
'z': (z object)
'2': (z object)
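Before I rewrite everything, I'm planning to time the two access styles against each other to see whether that's really what's happening. Here's a rough sketch of the check (it reuses get_lots_of_rows() from above and assumes the result set has already been fetched into an in-memory list):

import time

lots_of_rows = get_lots_of_rows()  # same query as above, fetched into a list

# Time attribute-style access (row.x).
start = time.time()
for row in lots_of_rows:
    if row.x == row.y:
        pass
print 'by name: ', time.time() - start

# Time index-style access (row[0]) on the same rows.
start = time.time()
for row in lots_of_rows:
    if row[0] == row[1]:
        pass
print 'by index:', time.time() - start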
If that's the case, I doubt I'll see any performance improvement from accessing columns by number rather than by name. Is there any way to get SA to return results as a list of tuples or a list of lists, rather than a list of dictionary-like objects? Alternatively, can anyone suggest any other optimizations?
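For concreteness, this is roughly the shape of result handling I'm hoping for. It's only a sketch: rows are iterable, so tuple(row) should work, but I don't know whether building the tuples still goes through the same per-column lookup:

lots_of_rows = get_lots_of_rows()

# Flatten each row into a plain tuple up front.
# Assumes the query selects exactly the three columns x, y, z.
plain_rows = [tuple(row) for row in lots_of_rows]

for x, y, z in plain_rows:
    if x == y:
        print z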