



My software uses multiple threads to do its work. There is a pipeline that looks something like this:

+------------+     ||+-----------------+     +------------+
|            |     |||                 |     |            |
| Get and    |     ||| Worker Threads  |     |  Save      |
| feed work  |--->>|||                 |--->>|     Output |
|            |     |||   Do Work       |     |            |
+------------+     +||                 |     +------------+
                    +|                 |

Each box represents a separate thread. The arrows between them are thread-safe queues through which "work" objects flow. The "Get and feed work" thread pulls waiting work from the database and feeds it to a pool of worker threads. Each worker thread does its work, updating a status flag on the work object (and storing it to the DB), and produces some output objects. The output objects flow to the "Save Output" thread, where they are saved and/or updated in the database.
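To make the shape of the pipeline concrete, here is a minimal sketch of it using only java.util.concurrent. It is not my actual code: PipelineSketch, the STOP marker and the use of plain strings as "work" and "output" are all stand-ins invented for illustration, and the database reads and writes are replaced by in-memory lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the pipeline; all names here are illustrative.
final class PipelineSketch {
    // Sentinel ("poison pill") that tells a stage to shut down.
    static final String STOP = "__STOP__";

    // Runs feeder -> workerCount workers -> saver and returns what the saver collected.
    static List<String> run(List<String> pending, int workerCount) {
        BlockingQueue<String> workQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> outputQueue = new LinkedBlockingQueue<>();
        List<String> saved = new ArrayList<>();

        // "Get and feed work": the real thread would read pending work from the DB.
        Thread feeder = new Thread(() -> {
            for (String w : pending) workQueue.add(w);
            for (int i = 0; i < workerCount; i++) workQueue.add(STOP); // one pill per worker
        });

        // Worker threads: take work, produce an output object, pass it down the line.
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            workers.add(new Thread(() -> {
                for (String w = takeFrom(workQueue); !w.equals(STOP); w = takeFrom(workQueue)) {
                    outputQueue.add("output-of-" + w);
                }
            }));
        }

        // "Save Output": the only thread that touches `saved` while the pipeline runs.
        Thread saver = new Thread(() -> {
            for (String o = takeFrom(outputQueue); !o.equals(STOP); o = takeFrom(outputQueue)) {
                saved.add(o);
            }
        });

        feeder.start();
        workers.forEach(Thread::start);
        saver.start();

        joinQuietly(feeder);
        workers.forEach(PipelineSketch::joinQuietly);
        outputQueue.add(STOP); // all workers are done, so no more output can arrive
        joinQuietly(saver);
        return saved;
    }

    private static String takeFrom(BlockingQueue<String> q) {
        try {
            return q.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return STOP; // treat interruption as a request to stop
        }
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for " + t.getName(), e);
        }
    }
}
```

The queues are the only shared state between stages, so each object is handed from one thread to the next rather than being touched by two threads at once.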

The purpose of this architecture is mainly one of queuing efficiency.

I need the "Get and Feed Work" thread to have its own DB session/connection so it can continuously read from the DB, unblocked, and feed that data to the worker threads.

Each worker thread also needs its own DB connection/session, mainly to update its progress in the DB on whatever task it is working on. These updates are always low-impact and infrequent. Each worker thread also produces something significant that requires an insert into the DB. Instead of each worker thread doing its own insert, it pushes that responsibility down the line to the "Save Output" thread.

The "Save Output" thread gains efficiency writing to the DB by doing it in batches - inserts are much faster in batches than one at a time.
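The batching step could look roughly like the sketch below. BatchDrainSketch, saveNextBatch and the sink callback are hypothetical names, and the sink stands in for whatever actually writes the batch (for example one transaction around a batched insert); the only real API relied on is BlockingQueue.take() and BlockingQueue.drainTo(Collection, int).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

// Hypothetical helper for the "Save Output" thread; names are illustrative.
final class BatchDrainSketch {
    // Blocks until at least one item is available, then takes whatever else is
    // already queued (up to maxBatch items in total) and hands the whole batch
    // to `sink` in a single call. Returns the batch size, or 0 if interrupted.
    static <T> int saveNextBatch(BlockingQueue<T> queue, int maxBatch, Consumer<List<T>> sink) {
        List<T> batch = new ArrayList<>(maxBatch);
        try {
            batch.add(queue.take()); // wait for the first item
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 0;
        }
        queue.drainTo(batch, maxBatch - 1); // non-blocking: grab what is already there
        sink.accept(batch);                 // assumption: the sink performs one batched write
        return batch.size();
    }
}
```

Called in a loop, this writes single items when the queue is quiet and full batches when the workers are producing faster than the saver can write.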

I'm beginning to think Hibernate may not be appropriate for this architecture.

I'm finding myself running into lots of issues dealing with Hibernate sessions, evicting, merging, clearing, flushing, oh my.

My architecture seems stable currently but it also seems extremely inefficient. Would it be better to abandon Hibernate and use straight JDBC?


You can either detach the objects passed between sessions, or pass only the object ID.

Maurice Perry
Are there Pros/Cons to each method? My natural inclination would be to detach the object as it leaves one thread and merge it when it gets to the new thread.
JR Lawhorne

Hibernate sessions, and objects loaded from them, are not thread-safe, so they cannot be accessed by different threads simultaneously. It is OK, however, to access them from different threads sequentially. From the description of your process, there does not seem to be any overlap between the threads, so you should be fine passing both the objects and the session.

Make sure not to use thread-bound sessions, though: don't use SessionFactory.getCurrentSession().

I'm using thread local sessions so it sounds like I'll have to detach.
JR Lawhorne
You can write your own thread-local utility class and bind the session to the thread when switching to the worker thread. Using detached objects and merging them can generate a lot of queries, because Hibernate needs to load the current database state for each merge. Another drawback is that you lose the ability to run the whole unit of work in one transaction: since one unit of work involves 3 sessions (get data, work, save data), concurrent updates might slip in between.
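A thread-local utility along those lines might look like the sketch below. It is only an illustration under assumptions: BoundSession, callWith and current are made-up names, and the type parameter S stands in for the Hibernate session type so that the example has no Hibernate dependency.

```java
import java.util.function.Supplier;

// Hypothetical utility; S would be the Hibernate session type in real code.
final class BoundSession<S> {
    private final ThreadLocal<S> holder = new ThreadLocal<>();

    // Binds `session` to the calling thread while `body` runs, then restores
    // whatever was bound before (or clears the binding if nothing was).
    <R> R callWith(S session, Supplier<R> body) {
        S previous = holder.get();
        holder.set(session);
        try {
            return body.get();
        } finally {
            if (previous == null) {
                holder.remove();
            } else {
                holder.set(previous);
            }
        }
    }

    // Returns the session bound to the calling thread, or fails loudly if none is.
    S current() {
        S s = holder.get();
        if (s == null) {
            throw new IllegalStateException("no session bound to this thread");
        }
        return s;
    }
}
```

The thread that takes a work object off the queue would call callWith with the session that travelled with it, so code further down can call current() instead of opening a new session.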