First of all I assume I've missed something major when thinking about this, but I still wanted to post about it to see if I really didn't miss anything, on with it...
I have a pretty write heavy binary tree (about 50/50 between writes and reads), and on the way home today I was thinking about ways to optimize this, especially make the writes faster - this is what I came up with.
Considering that the operation add(T, x) to add x to tree T first consists of find(T, x) to see if x already exists, and in that case it doesn't return the parent so we can add it instead of one of the parents empty leaves.
What if we add a hash table as an intermediate cache to the add operation, so when we call add(T, x) what really happens is that x is hashed and inserted into the hash map M. And that's it. The optimization takes place when we somewhere else asks to find(T, x), now when we search the tree we will come to a leaf node, since x isn't inserted the tree yet (it only exists in the hash map M), we hash x and compare it to the keys in M to see if it is supposed to be in the tree. If it's found in M then we add it to the tree and remove it from M.
This would eliminate the find(T, x) operation on add(T, x) and reduce it to add(M, x) which is O(1). And then (ab)-use the find(T, x) operation that is performed when we look up the node the first time to insert it.