views:

117

answers:

5

This example illustrates a mystery I encountered in an application I am building. The application needs to support an option allowing the user to exercise the code without actually committing changes to the DB. However, when I added this option, I discovered that changes were persisted to the DB even when I did not call the commit() method.

My specific question can be found in the code comments. The underlying goal is to have a clearer understanding of when and why SQLAlchemy will commit to the DB.

My broader question is whether my application should (a) use a global Session instance, or (b) use a global Session class, from which particular instances would be instantiated. Based on this example, I'm starting to think that the correct answer is (b). Is that right? Edit: this SQLAlchemy documentation suggests that (b) is recommended.

import sys

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'

    id   = Column(Integer, primary_key = True)
    name = Column(String)
    age  = Column(Integer)

    def __init__(self, name, age = 0):
        self.name = name
        self.age  = 0

    def __repr__(self):
        return "<User(name='{0}', age={1})>".format(self.name, self.age)

engine = create_engine('sqlite://', echo = False)
Base.metadata.create_all(engine)

Session = sessionmaker()
Session.configure(bind=engine)

global_session = Session() # A global Session instance.
commit_ages    = False     # Whether to commit in modify_ages().
use_global     = True      # If True, modify_ages() will commit, regardless
                           # of the value of commit_ages. Why?

def get_session():
    return global_session if use_global else Session()

def add_users(names):
    s = get_session()
    s.add_all(User(nm) for nm in names)
    s.commit()

def list_users():
    s = get_session()
    for u in s.query(User): print ' ', u

def modify_ages():
    s = get_session()
    n = 0
    for u in s.query(User):
        n += 10
        u.age = n
    if commit_ages: s.commit()

add_users(('A', 'B', 'C'))
print '\nBefore:'
list_users()
modify_ages()
print '\nAfter:'
list_users()
A: 

Note that the defaults of create_session() are the opposite of that of sessionmaker(): autoflush and expire_on_commit are False, autocommit is True.

Paulo Scardine
Thanks for the reply, but I'm not following how this addresses the question.
FM
Try sessionmaker() instead of create_session().
Paulo Scardine
As far as I can tell, I'm already using `sessionmaker()`, not `create_session()`. Am I missing something?
FM
A: 

global_session is already instantiated when you call modify_ages() and you've already committed to the database. If you re-instantiate global_session after you commit, it should start a new transaction.

My guess is since you've already committed and are re-using the same object, each additional modification is automatically committed.

Scott
+4  A: 

tl;dr - The updates are not actually committed to the database-- they are part of an uncommitted transaction in progress.


I made 2 separate changes to your call to create_engine(). (Other than this one line, I'm using your code exactly as posted.)

The first was

engine = create_engine('sqlite://', echo = True)

This provides some useful information. I'm not going to post the entire output here, but notice that no SQL update commands are issued until after the second call to list_users() is made:

...
After:
xxxx-xx-xx xx:xx:xx,xxx INFO sqlalchemy.engine.base.Engine.0x...d3d0 UPDATE users SET age=? WHERE users.id = ?
xxxx-xx-xx xx:xx:xx,xxx INFO sqlalchemy.engine.base.Engine.0x...d3d0 (10, 1)
...

This is a clue that the data is not persisted, but kept around in the session object.

The second change I made was to persist the database to a file with

engine = create_engine('sqlite:///db.sqlite', echo = True)

Running the script again provides the same output as before for the second call to list_users():

<User(name='A', age=10)>
<User(name='B', age=20)>
<User(name='C', age=30)>

However, if you now open the db we just created and query it's contents, you can see that the added users were persisted to the database, but the age modifications were not:

$ sqlite3 db.sqlite "select * from users"
1|A|0
2|B|0
3|C|0

So, the second call to list_users() is getting its values from the session object, not from the database, because there is a transaction in progress that hasn't been committed yet. To prove this, add the following lines to the end of your script:

s = get_session()
s.rollback()
print '\nAfter rollback:'
list_users()
ma3
Thanks a lot. This looks like the right answer. Unfortunately, that means that my example script isn't demonstrating the problem observed in my full application -- which really is committing changes to a MySQL DB even when I avoid calling `commit()`. Time to do some more investigating!
FM
I would say just using `echo=True` for the create_engine call while debugging would be one of the best tools to narrow down what's happening when.
ma3
+1  A: 

Since you state you are actually using MySQL on the system you are seeing the problem, check the engine type the table was created with. The default is MyISAM, which does not support ACID transactions. Make sure you are using the InnoDB engine, which does do ACID transactions.

You can see which engine a table is using with

show create table users;

You can change the db engine for a table with alter table:

alter table users engine="InnoDB";
Nathan Davis
+1  A: 

1. the example: Just to make sure that (or check if) the session does not commit the changes, it is enough to call expunge_all on the session object. This will most probably prove that the changes are not actually committed:

....
print '\nAfter:'
get_session().expunge_all()
list_users()

2. mysql: As you already mentioned, the sqlite example might not reflect what you actually see when using mysql. As documented in sqlalchemy - MySQL - Storage Engines, the most likely reason for your problem is the usage of non-transactional storage engines (like MyISAM), which results in an autocommit mode of execution.

3. session scope: Although having one global session sounds like a quest for a problem, using new session for every tiny little request is also not a great idea. You should think of a session as a transaction/unit-of-work. I find the usage of the contextual sessions the best of two worlds, where you do not have to pass the session object in the hierarchy of method calls, and at the same time you are given a pretty good safety in the multi-threaded environment. I do use the local session once in a while where I know I do not want to interact with the currently running transaction (session).

van
Thanks. This is very helpful.
FM