views:

576

answers:

7

When Using a thread-local database connection, closure of the connection is required when the thread exists.

This I can only do if I can override the run() method of the calling thread. Even that is not a great solution, since at time of exit, I don't know if a connection has ever been opened by that thread.

The problem is in fact more general: How to force a thread to call some finalisation method of a thread-local object when it exits.

I looked at the sources of java 1.5 and found that the thread local map is set to null, which will eventually cause the garbage collection to call finalize(), but I don't want to count on the garbage collector.

The following override seems inevitable to make sure that a database connection is closed:

@Override 
public void remove() {
    get().release(); 
    super.remove(); 
}

where release() closes the database connection, if it has been opened. But we don't know if the thread has ever used this thread-local. If get() has never been called by this thread, then there's quite a waste of effort here: ThreadLocal.initialValue() will be called, a map will be created on this thread, etc.

-

Further clarification and example as per Thorbjørn's comment:

java.lang.ThreadLocal is a type of factory for an object that is bound to a thread. This type has a getter for the object and a factory method (typically written by the user). When the getter is called it calls the factory method only if it has never been called before by this thread.

Using ThreadLocal allows a developer to bind a resource to a thread even if the thread code was written by a third party.

Example: Say we have a resource Type called MyType and we want to have one and only one of it per thread.

Define in the using class: private static ThreadLocal resourceFactory = new ThreadLocal(){ @override protected MyType initialValue(){ return new MyType(); } }

Use in local context in this class: public void someMethod(){ MyType resource = resourceFactory.get(); resource.useResource(); }

get() can call initialValue() only once in the life cycle of the calling thread. At that point an instance of MyType gets instantiated and bound to this thread. Subsequent calls to get() by this thread refer again to this object.

The classic usage example is when MyType is some thread-unsafe text/date/xml formatter.

But such formatters usually don't need to be released or closed, database connections do and I am using java.lang.ThreadLocal to have a one database connection per thread.

The way I see it, java.lang.ThreadLocal is almost perfect for that. Almost because there's no way to guarantee closure of the resource if the calling thread belongs to a third party application.

I need your brains squires: By extending java.lang.ThreadLocal I managed to bind one database connection for every thread, for it's exclusive usage - including threads that I can not modify or override. I managed to make sure that the connections get closed in case the thread dies on uncaught exception.

In case of normal thread exit, the garbage collector closes the connection (because MyType overrides the finalize()). In actual fact it happens quite quickly, but this is not ideal.

If I had my way, there would have been another method on the java.lang.ThreadLocal:

protected void release() throws Throwable {}

If this method existed on java.lang.ThreadLocal, called by JVM upon any thread exit/death, then in my own override of it I could close my connection (and the redeemer would have come to Zion).

In the absence of such method, I'm looking for another way to confirm closure. A way that won't rely on the JVM garbage collection.

Thanks

+4  A: 

Wrap your Runnable with a new Runnable with a

try {
  wrappedRunnable.run();
} finally {
  doMandatoryStuff();
}

construction, and let THAT be executed instead.

You may also want to consider using an Executor instead. Makes it much easier to manage Runable and Callables.

Thorbjørn Ravn Andersen
Thanks. Indeed I wrap the run where I can. There are still two problems here:1. Some threads that use the thread-local object are third party threads and I have no way to wrap their run()2. The finalisation (what you call mandatoryStuff) requires accessing the thread-local object. Alas if it has never been accessed by this thread, then the object has to be be instantiated and initialised. In this case, a database connection will be opened... only so it can be closed.(I have found a way around it - quite ugly though)Is there a way to check whether a thread-local has been initialised?
Joel
I am not certain I fully understand what you describe. Can you create a minimalistic, working example?
Thorbjørn Ravn Andersen
+1  A: 

You had to open the connection once, so you also have to handle the closure at the same place. Depending on your environment the threads may be reused and you cannot expect the thread to be garbage collected before the shutdown of the application.

pgras
+1  A: 

I think in the general case there is no good solution for this, other than the classic: Code that gets the resource has to be responsible for closing it.

In the specific case, if you are calling threads to this degree you could pass in the connection to your methods at the start of the thread, either with a method that takes a custom parameter or via some form of Dependency Injection. Then since you have the code that supplies the connection, you have the code that removes it.

A Dependency Injection based on annotations might work here, as code that doesn't require the connection won't get one and therefore wouldn't need to be closed, but it sounds like your design is too far along to retrofit something like that.

Yishai
אחי, if you're familiar with java.lang.ThreadLocal you know that it allows you to define the instantiation of your object but not the finalisation of it. There is of course java.lang.Object.finalize() and I do use this, but for what I know, it is not called by the exiting thread but only later, after it's death, by the garbage collector.
Joel
@Joel, I think we are saying the same thing. The creator (the first caller of the get method) has to close the resource. All I am saying is that you can decouple the creator from the rest of the code, although it may not make sense to do so.
Yishai
+3  A: 

The normal JDBC practice is that you close the Connection (and Statement and ResultSet) in the very same method block as you'd acquired it.

In code:

Connection connection = null;

try {
    connection = getConnectionSomehow();
    // Do stuff.
} finally {
    if (connection != null) {
        try {
            connection.close();
        } catch (SQLException e) {
            ignoreOrLogItButDontThrowIt(e);
        }
    }
}

Keeping this in mind, your question makes me think that there's something wrong in your design. Acquiring and closing those kind of expensive and external resources in the shortest scope will save the application from potential resource leaks and crashes.

If your original intent was to improve connecting performance, then you need to take a look for connection pooling. You can use for example the C3P0 API for this. Or if it's a webapplication, use the appserver's builtin connection pooling facilities in flavor of a DataSource. Consult the appserver specific documentation for details.

BalusC
The example shown above - opening a connection on the fly - is something I wish to avoid. It is only feasible if you can trust a pool of connections. The intention of my design is to have a connection bound to a thread - so every thread can have it's one and only exclusive connection, if it needs to have one. Now, I do not create all the threads that could call my queries. Therefore I can not assign a database connection to a thread unless I use the java.lang.ThreadLocal type. It turned out to be a good solution. Only problem is to close when a (third party) thread exits
Joel
Clear but imho still risky to resource leaks. A thread may deadlock or run longer than the connection is allowed to stay open.
BalusC
+1  A: 

Override the get() method in ThreaedLocal so that it sets a List property on the subclass. This property can easily be queried to determine if the get() method had been called for a particular thread. You could then access the ThreadLocal to clean it up in that case.

Updated in response to comment

Martin OConnor
Thanks. From your answer it's not clear to me how the new property can tell which thread has called get() and which one hasn't yet.
Joel
In that case you could use a List<Thread> to store all the Threads that access the ThreadLocal
Martin OConnor
+3  A: 

I don't really understand why you are not using traditional connection pooling. But I shall assume you have your reasons.

How much freedom do you have? As some DI frameworks do support object life cycles and thread-scoped variables (all nicely proxied). Could you use one of them? I think Spring would do this all out of the box, while Guice would need a 3rd party lib to handle life cycles and thread scopes.

Next how much control do you have on either the creating of the ThreadLocal variable or the creation of threads? I am guessing you have complete control on the ThreadLocal but limited to none on the creation of threads?

Could you use Aspect-oriented programming to monitor for new Runnable or Threads extending the run() method to include the clean up? You will also need to extend the ThreadLocal so it can register itself.

mlk
+4  A: 

If you are of a sensitive disposition, look away now.

I wouldn't expect this to scale very well; it effectively doubles the number of threads in the system. There may be some use cases where it is acceptable.

public class Estragon {
  public static class Vladimir {
    Vladimir() { System.out.println("Open"); }
    public void close() { System.out.println("Close");}
  }

  private static ThreadLocal<Vladimir> HOLDER = new ThreadLocal<Vladimir>() {
    @Override protected Vladimir initialValue() {
      return createResource();
    }
  };

  private static Vladimir createResource() {
    final Vladimir resource = new Vladimir();
    final Thread godot = Thread.currentThread();
    new Thread() {
      @Override public void run() {
        try {
          godot.join();
        } catch (InterruptedException e) {
          // thread dying; ignore
        } finally {
          resource.close();
        }
      }
    }.start();
    return resource;
  }

  public static Vladimir getResource() {
    return HOLDER.get();
  }
}

Better error handling and so on is left as an exercise for the implementer.

You could also have a look at tracking the threads/resources in a ConcurrentHashMap with another thread polling isAlive. But that solution is the last resort of the desperate - objects will probably end up being checked too often or too seldom.

I can't think of anything else that doesn't involve instrumentation. AOP might work.

Connection pooling would be my favoured option.

McDowell
Good idea! It would double the number of threads but I believe we can manage this.Polling the threads is out of the question. If I keep a map of threads I might as well drop ThreadLocal altogether.Thanks
Joel
I could argue about connection pooling. However, the database connection is just an example. The more general issue is that the ThreadLocal system of Java is crippled by it's inability to guarantee an immediate release of resources when a thread exits. Because of that it's less than perfect for resources that need to be released.
Joel
I understand, but leaving resources open until the thread dies is also less than optimal in (arguably) most situations. For example, on a Java EE server, it is possible that threads will be pooled and reused for many work units (servlet requests, EJB transactions, etc.) that may be seldom or incidentally used by your code path. So, the VM may accumulate many open, idle connections. This solution should be used rarely, and only when you know or control thread creation/usage details.
McDowell
You're right yet I believe that ThreadLocal is not designed for the type of thread usage that you're describing. On the other hand, it is also not meant for the opposite case, where you have guaranteed full command over all thread creation (then you could assign a variable directly to the thread). Here, most threads are generated by our home-made local library, either directly or via calls to third party API so I'm happy to have a DB connection bound to every thread who needs it - this I believe is the case for which ThreadLocal was made. But it's less than perfect when it comes to releasing
Joel