views: 461
answers: 6

I am running a webapp inside WebSphere Application Server 6.1. This webapp has a rules kind of engine, where every rule obtains its very own connection from the WebSphere data source pool. So I see that when a use case is run, for 100 records of input, about 400-800 connections are obtained from the pool and released back to the pool. I have a feeling that if this engine goes to production, it might take too much time to complete processing.

Is it a bad practice to obtain connections from the pool that frequently? What are the overhead costs involved in obtaining connections from the pool? My guess is that the costs should be minimal, since the pool is nothing but a resource cache. Please correct me if I am wrong.

+3  A: 

The pool doesn't seem to be your problem. The real problem lies in the fact that your "rules engine" doesn't release connections back to the pool before completing the entire calculation, so it seems the engine doesn't scale well. If the number of database connections somehow depends on the number of records being processed, something is almost always very wrong!

If you manage to get your engine to release connections as soon as possible, you may find that you only need a few connections instead of a few hundred. Failing that, you could use a connection wrapper that re-uses the same connection every time the rules engine asks for one, though that somewhat negates the benefits of having a connection pool...

Not to mention that it introduces many multithreading and transaction isolation issues. If the connections are read-only, though, it might be an option; see the sketch below.
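
For illustration, a minimal sketch of such a wrapper using a dynamic proxy (the class name is made up, and this assumes single-threaded, read-only use for the reasons just given). It hands out the same physical connection but swallows close(), so the engine's per-rule "get and close" pattern never hits the pool:

    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Method;
    import java.lang.reflect.Proxy;
    import java.sql.Connection;

    // Wraps one real connection in a proxy whose close() is a no-op,
    // so code that "closes" its connection after every rule keeps
    // re-using the same physical connection underneath.
    final class CloseSuppressingConnections {
        static Connection wrap(final Connection real) {
            return (Connection) Proxy.newProxyInstance(
                    Connection.class.getClassLoader(),
                    new Class<?>[] { Connection.class },
                    new InvocationHandler() {
                        public Object invoke(Object proxy, Method method, Object[] args)
                                throws Throwable {
                            if ("close".equals(method.getName())) {
                                return null; // the caller doesn't own this connection
                            }
                            return method.invoke(real, args);
                        }
                    });
        }
    }

Note that someone still has to close the real connection once, after the whole run, or it never goes back to the pool.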

Gerco Dries
I once created a webapp that kept the connection alive all through the computation (passing it from function to function) so that each function wouldn't have to get its own connection from the pool. But it was a bad idea (as you point out) because there's a lot more going on in those functions than data retrieval. So yes, free your connection as soon as possible.
Cyril Gupta
+5  A: 

Connection pooling keeps connections to the database alive in anticipation: when another user connects, a ready connection is handed over and the database does not have to open a connection all over again.

This is actually a good idea, because opening a connection is not just a one-go thing. There are many trips to the server (authentication, retrieval, status, etc.), so if you've got a connection pool on your website, you're serving your customers faster.

Unless your website gets no visitors at all, you can't afford not to have a connection pool working for you.
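
A sketch of the difference (the JNDI name and JDBC URL below are placeholders, not anything from your setup):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import javax.naming.InitialContext;
    import javax.sql.DataSource;

    class PooledVsUnpooled {
        static void demo() throws Exception {
            // Borrowing from the container-managed pool: usually just
            // hands back an already-open physical connection.
            InitialContext ctx = new InitialContext();
            DataSource ds = (DataSource) ctx.lookup("jdbc/MyDataSource"); // placeholder name
            Connection pooled = ds.getConnection();
            pooled.close(); // "close" only returns it to the pool

            // Opening a raw connection: a new socket, handshake and
            // authentication every single time.
            Connection raw = DriverManager.getConnection(
                    "jdbc:db2://dbhost:50000/MYDB", "user", "password"); // placeholder URL
            raw.close(); // really closes the physical connection
        }
    }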

Cyril Gupta
+3  A: 

A connection pool is all about connection re-use.

If you are holding on to a connection at times when you don't need one, then you are preventing that connection from being re-used somewhere else. And if you have a lot of threads doing this, then you must also run with a larger pool of connections to prevent pool exhaustion. More connections take longer to create and establish, and they take more resources to maintain; there will be more reconnecting as the connections grow old, and your database server will also be impacted by the greater number of connections.

In other words: you want to run with the smallest possible pool without exhausting it. And the way to do that is to hold on to your connections as little as possible.

I have implemented a JDBC connection pool myself and, although many pool implementations out there probably could be faster, you are likely not going to notice because any slack going on in the pool is most likely dwarfed by the time it takes to execute queries on your database.

In short: connection pools just love it when you return their connections. Or they should anyway.
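
To make the re-use point concrete, here is a toy pool sketch built on a BlockingQueue. This is not how WebSphere's pool works internally (no validation, no stale-connection handling, not production code); it just shows why prompt returns let you run with a small pool:

    import java.sql.Connection;
    import java.util.Collection;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.TimeUnit;

    // Toy pool: a bounded queue of pre-opened connections.
    class ToyConnectionPool {
        private final BlockingQueue<Connection> idle;

        ToyConnectionPool(Collection<Connection> preOpened) {
            this.idle = new ArrayBlockingQueue<Connection>(
                    preOpened.size(), false, preOpened);
        }

        Connection borrow(long timeoutMillis) throws InterruptedException {
            Connection c = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (c == null) {
                throw new IllegalStateException(
                        "pool exhausted; is someone hoarding connections?");
            }
            return c;
        }

        void giveBack(Connection c) {
            idle.offer(c); // the sooner this happens, the smaller the pool can be
        }
    }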

Christian Vest Hansen
Don't reinvent the wheel. Use a good pool implementation instead - you will need it some day (oh, stale connections hang? etc.)
Thorbjørn Ravn Andersen
This can be a long discussion, but the wheel was not reinvented. Rather, a new wheel design was created with a performance profile distinct from all other existing wheels. Same reason we have multiple java.util.Map implementations ;-)
Christian Vest Hansen
+1  A: 

To really check whether your pool is a bottleneck, you should profile your program. If you find the pool is a problem, then you have a tuning problem. A simple pool should be able to handle 100K allocations per second or more, i.e. about 10 microseconds per allocation. However, as soon as you use a connection, it will take between 200 and 2,000 microseconds to do something useful.
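
A crude way to measure the round-trip if you don't have a profiler handy (just a sketch; pass in your pooled DataSource and whatever iteration count fits your setup):

    import java.sql.Connection;
    import javax.sql.DataSource;

    class PoolTiming {
        // Crude timing loop; a real profiler will tell you much more.
        static void timeBorrowReturn(DataSource ds, int iterations) throws Exception {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                Connection c = ds.getConnection();
                c.close(); // straight back to the pool
            }
            long elapsed = System.nanoTime() - start;
            System.out.println("avg microseconds per borrow/return: "
                    + (elapsed / 1000.0 / iterations));
        }
    }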

Peter Lawrey
+1  A: 

I think this is a poor design. Sounds like a Rete rules engine run amok.

If you assume 0.5-1.0 MB minimum per thread (e.g. for the stack), you'll be thrashing a lot of memory. Checking the connections in and out of the pool will be the least of your problems.

The best way to know is to do a performance test and measure memory, wall times for each operation, etc. But this doesn't sound like it'll end well.

Sometimes I see people throw all their rules into Blaze or ILOG or JRules or Drools simply because it's "standard" and high tech. It's a terrific resume item, but how many of those solutions would be better served by a simpler table-driven decision tree? Maybe your problem is one of those.
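
For example, here is a sketch of what "table-driven" can mean in practice; the discount bands are invented purely for illustration, and whether your rules fit this shape is the question to answer with data:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // A discount policy as a lookup table rather than a rules engine.
    // Entries are checked in insertion order; the first matching band wins.
    class DiscountTable {
        private static final Map<Integer, Double> DISCOUNT_BY_MIN_QUANTITY =
                new LinkedHashMap<Integer, Double>();
        static {
            DISCOUNT_BY_MIN_QUANTITY.put(100, 0.15); // 100+ units -> 15%
            DISCOUNT_BY_MIN_QUANTITY.put(50, 0.10);  //  50+ units -> 10%
            DISCOUNT_BY_MIN_QUANTITY.put(10, 0.05);  //  10+ units ->  5%
        }

        static double discountFor(int quantity) {
            for (Map.Entry<Integer, Double> band : DISCOUNT_BY_MIN_QUANTITY.entrySet()) {
                if (quantity >= band.getKey()) {
                    return band.getValue();
                }
            }
            return 0.0; // no discount
        }
    }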

I'd recommend that you get some data, see if there's a problem, and be prepared to redesign if the data tells you it's necessary.

duffymo
A: 

Could you provide more details on what your rules engine does, exactly? If each rule "firing" performs data updates, you may want to verify that the connection is being properly released (release it in a finally block to ensure the connections really do go back to the pool).
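
For example, a sketch of that pattern (ds stands in for your WebSphere DataSource; the rule's actual work is elided):

    import java.sql.Connection;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    class RuleFiring {
        static void fireRuleWithConnection(DataSource ds) throws SQLException {
            Connection conn = null;
            try {
                conn = ds.getConnection();
                // ... perform this rule's reads/updates with conn ...
            } finally {
                if (conn != null) {
                    try {
                        conn.close(); // returns the connection to the pool
                    } catch (SQLException ignored) {
                        // nothing sensible to do here
                    }
                }
            }
        }
    }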

If possible, you may want to consider capturing your data updates in a memory buffer and writing to the database only at the end of the rule session/invocation.

If the database operations are read-only, consider caching the information.

As bad as you think obtaining and releasing 400-800 pooled connections is, I suspect it would be much, much worse if you had to create and close 400-800 unpooled connections.

Jack Leow