tags:
views: 155
answers: 6

I have a query that deletes all rows that have been marked for deletion. One of my tables has a column named IsDeleted, a boolean data type. If it is true, the row is supposed to be deleted, along with all related rows in other tables.

If an article row is marked for deletion, then the article's comments and votes are also supposed to be deleted. Which ORM can handle this efficiently?

Edit

I need this for C# .NET

A: 
  1. Which ORMs support "criteria"-based deletes? Both of the ones I've worked with (Propel, Doctrine) do. I would think that nearly all do unless they are early in development, as it's a pretty basic feature. But what language are you working in?

  2. As for your deletion cascade, this is best implemented at the database level with foreign keys. Most RDBMSs support this. If you're using one that doesn't, some ORMs implement it as well when database support isn't available. But my advice would be to just use an RDBMS that does support it. It will mean fewer headaches in the long run.
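
A sketch of that foreign-key approach, using hypothetical table and column names based on the question's article/comment scenario: the cascade is declared once in the schema, and the database then removes child rows automatically.

```sql
-- Hypothetical schema: deleting an Articles row automatically
-- deletes its Comments and Votes via ON DELETE CASCADE.
CREATE TABLE Articles (
    Id        INT PRIMARY KEY,
    IsDeleted BIT NOT NULL DEFAULT 0
);

CREATE TABLE Comments (
    Id        INT PRIMARY KEY,
    ArticleId INT NOT NULL
        REFERENCES Articles (Id) ON DELETE CASCADE
);

CREATE TABLE Votes (
    Id        INT PRIMARY KEY,
    ArticleId INT NOT NULL
        REFERENCES Articles (Id) ON DELETE CASCADE
);

-- A single statement then purges flagged articles and all related rows:
DELETE FROM Articles WHERE IsDeleted = 1;
```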

prodigitalson
If you want to delete all the logs that are too old (> 5 yrs old) then a FK would be of minimal use. This is probably a similar case, I expect.
James Black
+1  A: 

NHibernate supports HQL (the object oriented Hibernate Query Language) updates and deletes.

There are some examples in this Blog Post by Fabio Maulo and this Blog Post by Ayende Rahien.

It would probably look like this:

using (var session = OpenSession())
using (var tx = session.BeginTransaction())
{
  session
    .CreateQuery("delete from Whatever where IsDeleted = true")
    .ExecuteUpdate();
  tx.Commit();
}

Note: this is not SQL. This is HQL containing class names and property names and it translates to (almost) any database.
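
If the cascade is not enforced by the database, the same pattern can remove the related rows first. This is only a sketch: the Comment, Vote, and Article class names are assumptions based on the question, and HQL bulk statements don't allow joins, so subqueries are used instead.

```csharp
using (var session = OpenSession())
using (var tx = session.BeginTransaction())
{
    // Delete children before parents so no FK constraint is violated.
    session.CreateQuery(
        @"delete from Comment c where c.Article in
          (select a from Article a where a.IsDeleted = true)")
        .ExecuteUpdate();
    session.CreateQuery(
        @"delete from Vote v where v.Article in
          (select a from Article a where a.IsDeleted = true)")
        .ExecuteUpdate();
    session.CreateQuery("delete from Article a where a.IsDeleted = true")
        .ExecuteUpdate();
    tx.Commit();
}
```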

Stefan Steinegger
A: 

I am using LLBLGen, which can do cascading deletes. You might want to try it; it's very good. Example: delete all users in usernames[] from all roles in rolenames[]:

string[] usernames;
string[] rolenames;

UserRoleCollection userRoles = new UserRoleCollection();
PredicateExpression filter = new PredicateExpression();
filter.Add(new FieldCompareRangePredicate(UserFields.logincode, usernames));
filter.AddWithAnd(new FieldCompareRangePredicate(RoleFields.Name, rolenames));
userRoles.DeleteMulti(filter);
edosoft
Can I do multiple row updates and deletes with LLBLgen?
Luke101
Can I do something like this? userRoles.DeleteMulti('query goes here')
Luke101
+1  A: 

Typically, if you are already using an IsDeleted flag paradigm, the flagged items are simply ignored by the application object model. This is efficient and reliable: no referential integrity needs to be checked (no cascade), and no data is permanently destroyed.

If you want IsDeleted rows purged on a regular basis, it is far more efficient to schedule these as batch jobs in the RDBMS using native SQL, as long as you remove rows in the right order so that referential integrity is not compromised. If you do not enforce referential integrity at the DB level, then the order doesn't matter.
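
A purge job along those lines might look like the following sketch (the table names are assumptions based on the question; child rows go first so FK constraints are never violated):

```sql
-- Hypothetical scheduled purge: delete child rows first, parent rows last.
DELETE FROM Votes
WHERE ArticleId IN (SELECT Id FROM Articles WHERE IsDeleted = 1);

DELETE FROM Comments
WHERE ArticleId IN (SELECT Id FROM Articles WHERE IsDeleted = 1);

DELETE FROM Articles
WHERE IsDeleted = 1;
```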

Even with strong referential integrity and constraints in all my database designs over the years, I have never used cascading RI - it has never been desirable in my designs.

Cade Roux
A: 

Most ORMs will allow you to either give SQL hints, or execute SQL within their framework.

For example, you can use the ExecuteQuery method in DLINQ to do what you want. Here is a brief tutorial on using custom SQL with DLINQ:

http://weblogs.asp.net/scottgu/archive/2007/08/27/linq-to-sql-part-8-executing-custom-sql-expressions.aspx
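
For a delete specifically, LINQ to SQL's DataContext.ExecuteCommand method (also covered in that post) is the non-query counterpart of ExecuteQuery. A sketch under assumed names: the ArticlesDataContext class and the table names are hypothetical, not from the original answer.

```csharp
// Hypothetical LINQ to SQL data context for the question's schema.
using (var db = new ArticlesDataContext())
{
    // Children first, then the parent, to respect FK constraints.
    db.ExecuteCommand(
        "DELETE FROM Comments WHERE ArticleId IN " +
        "(SELECT Id FROM Articles WHERE IsDeleted = 1)");
    db.ExecuteCommand(
        "DELETE FROM Votes WHERE ArticleId IN " +
        "(SELECT Id FROM Articles WHERE IsDeleted = 1)");
    db.ExecuteCommand("DELETE FROM Articles WHERE IsDeleted = 1");
}
```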

I would expect that the Entity Framework would also allow it; I have never used it, but you can look into it.

Basically, find an ORM that has the features you need, and then you could ask how to do this query in your selected ORM. I think picking an ORM for this one feature is risky as there are many other factors that should go into the selection.

James Black
+2  A: 

DataObjects.Net offers an intermediate solution:

  • Currently it can't perform server-side deletion of entities selected by a query. This will be implemented some day, but for now there is another solution.
  • On the other hand, it supports so-called generalized batching: the queries it sends are grouped into batches of up to 25 statements when this is possible. "Possible" means "the query result won't be needed right away". This is almost always the case for creates, updates and deletes. Since such queries always lead to a single seek operation (or a few, if there is inheritance), they're pretty cheap. If they're sent in batches, SQL Server can cache plans for the whole batches, not just for the individual queries in them.

So this is very fast, although not yet ideal:

  • For now DO4 doesn't use IN (...) to optimize such deletions.
  • So far it doesn't support asynchronous batch execution. When this is done (I hope within a month or so), its speed on CUD (a subset of CRUD) operations will be nearly the same as that of SqlBulkCopy (roughly 1.5 to 2 times faster than now).

So in the case of DO, bulk deletion looks as follows:

var customersToRemove = 
  from customer in Query<Customer>.All
  where customer.IsDeleted
  select customer;

foreach (var customer in customersToRemove)
  customer.Remove(); // This will be automatically batched

I can name a benefit of this approach: any such object is able to react to its deletion, and Session event subscribers will be notified about each deletion as well. So any common logic related to deletions will work as expected. This is impossible if the operation is executed on the server.

The code for a soft delete would look like:

var customersToRemove = 
  from customer in Query<Customer>.All
  where ...
  select customer;

foreach (var customer in customersToRemove)
  customer.IsRemoved = true; // This will be automatically batched

Obviously, such an approach is slower than a bulk server-side update. By our estimates, what we have now is about 5 times slower than true server-side deletion in the worst case (a [bigint Id, bigint Value] table with a clustered primary index and no other indexes); in real-life cases (more columns, more indexes, more data) it should deliver comparable performance right now (i.e. be 2-3 times slower). Asynchronous batch execution will improve this further.

By the way, we shared tests for bulk CUD operations with entities for various ORM frameworks at ORMBattle.NET. Note that the tests there don't use bulk server-side updates (in fact, such a test would measure database performance rather than the ORM); instead they test whether the ORM is capable of optimizing this. Anyway, the info provided there plus the test code might be helpful if you're evaluating multiple ORM tools.

Alex Yakunin