Hello,

We need to delete roughly 200K rows from the database every day. Our application is Java/JEE based and uses an Oracle database with Hibernate as the ORM.

We have explored several options:

  1. Hibernate batch processing
  2. Stored procedure
  3. Database partitioning

Our DBA suggests that database partitioning is the best way to go, because we could simply drop and recreate the partitioned table every day. The issue is that we have two kinds of data: one that we want to delete every day and one that we want to keep. Suppose this data is stored in a table "Trade". With partitioning, we would have two "Trade" tables. We already have a Hibernate-based DAO layer that fetches trades from and stores trades to the DB. Once we partition the database, how can we control, through Hibernate, which of the two tables a trade goes into? Basically, I want the trades that must be deleted by the end of the day to go into the partitioned table, and the trades I want to keep to go into the main table. Please suggest how this can be done with Hibernate. We could add a column to identify the trades to be deleted, but how can we ensure that those trades go to the partitioned trade table using Hibernate?

I would appreciate it if someone could suggest a better approach in case we are on the wrong path.

+1  A: 

When we decide to partition the database, how can we control the trades to go in which of the two tables through hibernate.

That's what Hibernate Shards is for.

Pascal Thivent
Is it still in development? Last time I looked at it, they were stuck on some really hard problems... (and the last commit on sf.net was made on Jan. 21st, 2009)
Thierry
@Thierry AFAIK, the project *is* in the Hibernate portfolio. Do you have any particular Jira issue in mind? What additional features are you missing that would require development?
Pascal Thivent
Nope, I've just read the reference documentation to see what it could do. I've seen many limitations (for example, 'shard' queries do not support distinct or sorting). Since the project has received no updates in more than a year, I was just wondering if it was still alive.
Thierry
@Thierry Well, I guess it's a kind of virtuous circle (or vicious circle, if you prefer): more users, more needs, more activity, more features, more users, etc.
Pascal Thivent
OK, so what is the message for users? I can see it is still in beta. Is it ready for production use? What is the timeline for a GA release?
Alex
Pascal Thivent
A: 

You could use a Hibernate inheritance strategy.

If you know at object creation time that a trade will be deleted by the end of the day, you can create a VolatileTrade that is a subclass of Trade (with no additional attributes). Use the 'table per concrete class' strategy (section 9.1.5 of the Hibernate 3.3 reference documentation) for the mapping.

(I think I would make Trade an abstract superclass with two concrete subclasses, PersistentTrade and VolatileTrade, so that if some other classes should reference only PersistentTrade (or only VolatileTrade), you can enforce that in your code. If you use the Trade superclass itself as the persistent trade, you won't be able to enforce that.)

The volatile trades will go in one table and the 'persistent' trades will go in another.
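To make the class layout concrete, here is a minimal sketch in plain Java, without any Hibernate dependency. The fields `symbol` and `amount`, the `TradeFactory` helper, and the `deleteAtEndOfDay` flag are hypothetical names chosen for illustration; they are not from the original application. The mapping itself (for example `<union-subclass>` elements in the hbm file) is not shown, and a real mapped entity would also need an identifier and a no-argument constructor.

```java
import java.math.BigDecimal;

// All attributes stay in the abstract superclass, so existing logic
// that only needs a Trade keeps working unchanged.
abstract class Trade {
    private final String symbol;      // hypothetical attribute
    private final BigDecimal amount;  // hypothetical attribute

    protected Trade(String symbol, BigDecimal amount) {
        this.symbol = symbol;
        this.amount = amount;
    }

    String getSymbol() { return symbol; }
    BigDecimal getAmount() { return amount; }
}

// Would be mapped to the table that is kept.
class PersistentTrade extends Trade {
    PersistentTrade(String symbol, BigDecimal amount) { super(symbol, amount); }
}

// Would be mapped to the table that is cleared every day.
class VolatileTrade extends Trade {
    VolatileTrade(String symbol, BigDecimal amount) { super(symbol, amount); }
}

// Hypothetical helper: the single place that decides which subclass
// (and therefore which table) a new trade belongs to.
class TradeFactory {
    private TradeFactory() {}

    static Trade create(String symbol, BigDecimal amount, boolean deleteAtEndOfDay) {
        return deleteAtEndOfDay
                ? new VolatileTrade(symbol, amount)
                : new PersistentTrade(symbol, amount);
    }
}
```

Code that must only ever deal with trades that are kept can declare its parameters as `PersistentTrade`, which is the compile-time constraint mentioned above.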

Be aware that you won't be able to define a foreign key constraint from another table in the DB that references any Trade (persistent or volatile).

Then you just have to clear the table whenever you want.

Be careful to define a locking mechanism so that no other thread tries to write data to the table during the drop and the create (if you go that way). That won't be an easy task, and doing it correctly might impact the performance of every operation that inserts data into the table (as each one will need to acquire the lock).
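As a rough illustration of such a lock, here is a sketch using `ReentrantReadWriteLock` from the JDK. The `VolatileTradeStore` class is hypothetical and keeps its rows in memory as a stand-in for the real table; a real DAO would run SQL inside the locked sections. It also only coordinates threads inside a single JVM, so it does not help if several application instances write to the same database.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical in-memory stand-in for the volatile trade table.
class VolatileTradeStore {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Queue<String> rows = new ConcurrentLinkedQueue<>();

    // Inserts take the shared lock: they can run concurrently with
    // each other, but they block while a cleanup is in progress.
    void insert(String trade) {
        lock.readLock().lock();
        try {
            rows.add(trade);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The cleanup takes the exclusive lock, so no insert can run
    // between counting the rows and clearing them.
    int clearAll() {
        lock.writeLock().lock();
        try {
            int removed = rows.size();
            rows.clear();
            return removed;
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        return rows.size();
    }
}
```

Every insert now pays for a lock acquisition, which is the performance cost mentioned above.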

Wouldn't it be easier to truncate the table?

Thierry
A: 

Thanks, Thierry, for proposing this solution. Since the trade creation logic is going to be the same (the end user won't know whether s/he is creating a PersistentTrade or a VolatileTrade), what advantage does this refactoring have over horizontal partitioning?

Secondly, all application logic, such as report generation, is going to be the same whether it is a PersistentTrade or a VolatileTrade. So for all practical purposes both kinds of trade are the same; the only difference is that VolatileTrades need to be deleted by the end of the day. This solution would probably require more changes to the existing code. It could be a good solution when building a new application with this kind of requirement.

I agree with your point on deletion. I don't think we should be deleting the data from the application layer. It could be a simple cron job that runs at the end of the day, when the application is not in use, and drops all the data.

Alex
If you just leave all attributes in the Trade superclass (that is the class you currently have?) and create one (volatile) or two (volatile and persistent) subclasses with no attributes of their own, you won't have to refactor anything but the hbm mapping/SQL/stored procedures/triggers, will you? Hopefully you don't have too many of them. Your current HQL/criteria queries should work.
Thierry
And I would also advise against keeping application logic outside the application (after all, you're modifying the application so that the cleanup can be efficient). Having all of your application's behaviour in one place will prevent a new worker on the project from missing the fact that there is a cleanup once a day.
Thierry
Could you elaborate on your point that "you won't have to refactor anything"? We need hbm mappings for sure. I need to refactor existing code to create two types of trades (PersistentTrade or VolatileTrade) and persist them in different tables, and I also need to modify queries to retrieve trades from the respective tables. Is that not so?
Alex
Yep, so instead of new Trade(), you create one or the other depending on the situation. For the query, try something like "from Trade" in criteria or HQL, and you will see Hibernate automatically doing a union between the two tables so that you get all Trades (Persistent and Volatile).
Thierry