views:

392

answers:

2

I need to do paging with the sort order based on a calculation. The calculation is similar to something like reddit's hotness algorithm in that its dependant on time - time since post creation.

I'm wondering what the best practice for this would be. Whether to have this sort as a SQL function, or to run an update once an hour to calculate the whole table.

The table has hundreds of thousands of rows. And I'm using nhibernate, so this could cause problems for the scheduled full calcution.

Any advice?

A: 

If you can perform the calculation using SQL, try use Hibernate to load the sorted collection by executing a SQLQuery, where your query includes a 'ORDER BY' expression.

Nuno G
+2  A: 

It most likely will depend a lot on the load on your server. A few assumptions for my answer:

  1. Your calculation is most likely not simple, but will take into account a variety of factors, including time elapsed since post

  2. You are expecting at least reasonable growth in your site, meaning new data will be added to your table.

I would suggest your best bet would be to calculate and store your ranking value, and as Nuno G mentioned retrieve using an ordered clause. As you note there are likely to be some implications, two of which would be:

  1. Scheduling Updates
  2. Ensuring access to the table

As far as scheduling goes you may be able to look at some ways of intelligently recalculating your value. For example, you may be able to identify when a calculation is likely to be altered (for example, if a dependant record is updated you might fire a trigger, adding the ID of your table to a queue for recalculation). You may also do the update in ranges, rather then in the full table.

You will also want to minimise any locking of your table whilst you are recalculating. There are a number of ways to do this, including setting your isolation levels (using MS SQL terminonlogy). If you are really worried you could even perform your calculation externally (eg. in a temp table) and then simply run an update of the values to your main table.

As a final note I would recommend looking into the paging options available to you - if you are talking about thousands of records make sure that your mechanism determines the page you need on the SQL server so that you are not returning the thousands of rows to your application, as this will slow things down for you.

Chris