views:

519

answers:

1

We are using an XA JDBC driver in a case where it is not required (read-only work that doesn't participate in a distributed transaction).

Just wondering if there are any known performance gains to be had to switch to the Non-XA JDBC driver - if not it's probably not worth switching?

FWIW we are using MySQL 5.1

+2  A: 

As with all things performance related, the answer is: it depends. Specifically, it depends on exactly how you are using the driver.

The cost of interacting transactionally with a database is divided roughly into: code complexity overhead, communication overhead, sql processing and disk I/O.

Communication overhead differs somewhat between the XA and non-XA cases. All else being equal, an XA transaction carries a little more cost here as it requires more round trips to the db. For a non-XA transaction in manual commit mode, the cost is at least two calls: the sql operation(s) and the commit. In the XA case it's start, sql operation(s), end, prepare and commit. For your specific use case that will automatically optimize to start, sql operation(s), end, prepare. Not all the calls are of equal cost: the data moved in the result set will usually dominate. On a LAN the cost of the additional round trips is not usually significant.

Note however that there are some interesting gotchas lurking in wait for the unwary. For example, some drivers don't support prepared statement caching in XA mode, which means that XA usage carries the added overhead of re-parsing the SQL on every call, or requires you to use a separate statement pool on top of the driver. Whilst on the topic of pools, correctly pooling XA connections is a little more complex than pooling non-XA ones, so depending on the connection pool implementation you may see a slight hit there too. Some ORM frameworks are particularly vulnerable to connection pooling overhead if they use aggressive connection release and reacquire within transaction scope. If possible, configure to grab and hold a connection for the lifetime of the tx instead of hitting the pool multiple times.

With the caveat mentioned previously regarding the caching of prepared statements, there is no material difference in the cost of the sql handling between XA and non-XA tx. There is however a small difference to resource usage on the db server: in some cases it may be possible for the server to release resources sooner in the non-XA case. However, transactions are normally short enough that this is not a significant consideration.

Now we consider disk I/O overhead. Here we are concerned with I/O occasioned by the XA protocol rather than the SQL used for the business logic, as the latter is unchanged in either case. For read-only transactions the situation is simple: a sensible db and tx manager won't do any log writes, so there is no overhead. For write cases the same is true where the db is the only resource involved, due to XA's one phase commit optimization. For the 2PC case each db server or other resource manager needs two disk writes instead of the one used in non-XA cases, and the tx manager likewise needs two. Thanks to the slowness of disk storage this is the dominant source of performance overhead in XA vs. non-XA.

Several paragraphs back I mentioned code complexity. XA requires slightly more code execution than non-XA. In most cases the complexity is buried in the transaction manager, although you can of course drive XA directly if you prefer. The cost is mostly trivial, subject to the caveats already mentioned. Unless you are using a particularly poor transaction manager, in which case you may have a problem. The read-only case is particularly interesting - transaction manager providers usually put their optimization effort into the disk I/O code, whereas lock contention is a more significant issue for read-only use cases, particularly on highly concurrent systems.

Note also that code complexity in the tx manager is something of a red-herring in architectures featuring an app server or other standard transaction manager provider, as these usually use much the same code for XA and non-XA transaction coordination. In non-XA cases, to miss out the tx manager entirely you typically have to tell the app server / framework to treat the connection as non-transactional and then drive the commit directly using JDBC.

So the summary is: the cost of your sql queries is going to dominate the read-only transaction time regardless of the XA/non-XA choice, unless you mess up something in the configuration or do particularly trivial sql operations in each tx, the latter being a sign your business logic could probably use some restructuring to change the ratio of tx management overhead to business logic in each tx.

For read-only cases the usual transaction protocol agnostic advise therefore applies: consider a transaction aware level second level cache in an ORM solution rather than hitting the DB each time. Failing that, tune the sql, then increase the db's buffer cache until you see a 90%+ hit rate or you max out the server's RAM slots, whichever comes first. Only worry about XA vs. non-XA once you've done that and found things are still too slow.

Jonathan Halliday