I have been looking at the differences in Lucene 2.9, in particular the redone TokenStream API, and it strikes me as particularly ugly compared to the old approach: just return a new Token, or repopulate the given one with values if you're reusing said Token.
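To show what I mean, here is a rough sketch of the two styles from memory, not copied from Lucene's source; readNextWord() is a made-up stand-in for whatever actually feeds the stream:

    import java.io.IOException;

    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.tokenattributes.TermAttribute;

    // Old style (pre-2.9): return a new Token, or refill the one you were given.
    class OldStyleStream extends TokenStream {
        public Token next(Token reusableToken) throws IOException {
            String text = readNextWord();        // made-up input source
            if (text == null) return null;       // end of stream
            reusableToken.clear();
            reusableToken.setTermBuffer(text);   // repopulate the given Token
            return reusableToken;
        }
        private String readNextWord() { return null; } // stub for the sketch
    }

    // New 2.9 style: no Token at all; you register attributes up front
    // and fill them in on each incrementToken() call.
    class NewStyleStream extends TokenStream {
        // 2.9 is still pre-generics, hence the cast
        private final TermAttribute termAtt =
            (TermAttribute) addAttribute(TermAttribute.class);

        public boolean incrementToken() throws IOException {
            String text = readNextWord();        // made-up input source
            if (text == null) return false;
            clearAttributes();
            termAtt.setTermBuffer(text);         // the term goes through the attribute
            return true;
        }
        private String readNextWord() { return null; } // stub for the sketch
    }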
I have not done any profiling, but it seems that using a Map to store attributes is not that efficient, and it would be easier to just create a new value type holding the values. The TokenStream and Attribute machinery looks like object pooling, which is pretty much never necessary these days for simple value types like a Token of text.
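By "a new value type" I mean something as plain as this (hypothetical, not anything in Lucene): allocate one per token and let the generational GC deal with it, which is exactly the kind of short-lived object modern collectors are cheap at:

    // Hypothetical plain value type: one allocation per token,
    // no pooling, no attribute map.
    final class SimpleToken {
        final String term;
        final int startOffset;
        final int endOffset;

        SimpleToken(String term, int startOffset, int endOffset) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }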