Say you've got a database like this:

books
-----
id
name

And you want the total number of books in the database. The easiest possible SQL:

"select count(id) from books"

But now you want to get the total number of books last month...

Edit: but some of the books have been deleted from the table since last month

Well, obviously you can't get a total for a month that's already past: the "books" table is always current, and some of the records have already been deleted.

My approach was to run a cron job (or scheduled task) at the end of the month and store the total in another table, called report_data, but this seems clunky. Any better ideas?

A: 

The only way to do what you want is to add a "date_added" column to the books table. Then you could run a query like

select count(id) from books where date_added <= '2009-06-30';
Chris Kaminski
I should have mentioned that the books can be deleted from the table. Sorry to send some of you down the wrong track.
dittonamed
+3  A: 

Add a column with a default value of GETDATE() and call it "DateAdded". Then you can query between any two dates to find out how many books were added during that period, or specify a single date to find out how many books had been added before it (all the way back into history).

Per comment: You should not delete, you should soft delete.
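
A minimal sketch of the default-date column plus a soft-delete flag. It uses SQLite through Python's sqlite3 only so that it runs as-is; the is_deleted flag name and the sample rows are assumptions for illustration. SQLite spells the default as CURRENT_TIMESTAMP where SQL Server uses GETDATE(), so the DDL would need adapting to your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE books (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        date_added TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,  -- filled in automatically
        is_deleted INTEGER NOT NULL DEFAULT 0                -- assumed soft-delete flag
    )
""")

# Explicit dates are used here only to make the example deterministic.
conn.executemany(
    "INSERT INTO books (name, date_added) VALUES (?, ?)",
    [
        ("Book A", "2009-05-10 09:00:00"),
        ("Book B", "2009-06-15 12:30:00"),
        ("Book C", "2009-07-02 08:45:00"),
    ],
)
# A row inserted without date_added picks up the column default (now).
conn.execute("INSERT INTO books (name) VALUES ('Book D')")

# Soft delete: flag the row instead of removing it, so history survives.
conn.execute("UPDATE books SET is_deleted = 1 WHERE name = 'Book A'")


def added_before(cutoff):
    """Count every book added before the cutoff, soft-deleted or not."""
    return conn.execute(
        "SELECT COUNT(id) FROM books WHERE date_added < ?", (cutoff,)
    ).fetchone()[0]


def added_between(start, end):
    """Count books added in the half-open range [start, end)."""
    return conn.execute(
        "SELECT COUNT(id) FROM books WHERE date_added >= ? AND date_added < ?",
        (start, end),
    ).fetchone()[0]


def active_now():
    """Count books that are not flagged as deleted."""
    return conn.execute(
        "SELECT COUNT(id) FROM books WHERE is_deleted = 0"
    ).fetchone()[0]
```

With these rows, added_before("2009-07-01") still counts Book A even though it has since been soft deleted, which is the point of not removing the row.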

JP Alioto
Shouldn't delete, you say? We're getting close here. Btw, I usually do delete when it's something simple like this.
dittonamed
But it's not simple now b/c you're trying to report on the historical state of your database. :) So your requirements mean that a hard delete is not the right answer.
JP Alioto
True, true. I never liked doing this because of the extra SQL (where is_deleted <> 1). Any MySQL shortcuts or tricks out there for handling this?
dittonamed
+1  A: 

I agree with JP, do a soft delete/logical delete. For the cost of one extra AND condition per query, it makes everything a lot easier. Plus, you never lose data.

Granted, if extreme size becomes an issue, then yeah, you'll potentially have to start physically moving/removing rows.

brack
+1  A: 

"My approach was to run a cron job (or scheduled task) at the end of the month and store the total in another table, called report_data, but this seems clunky."

I have used this method to collect and store historical data. It was simpler than a soft-delete solution because:

  • The "report_data" table is very easy to generate reports/graphs from
  • You don't have to implement special soft-delete code for anything that needs to delete a book
  • You don't have to add "and active = 1" to the end of every query that selects from the books table

Because the code to do the historical reporting is isolated from everything else that uses books, this was actually the less clunky solution.
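
A rough sketch of what the end-of-month job could look like, again using SQLite through Python's sqlite3 so it runs as-is. The report_data column names (period, book_count) and the snapshot_book_count helper are made up for illustration; the real job would be whatever your cron entry or scheduler calls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Assumed layout: one row per reporting period.
    CREATE TABLE report_data (
        period     TEXT PRIMARY KEY,   -- e.g. '2009-06'
        book_count INTEGER NOT NULL
    );
""")


def snapshot_book_count(conn, period):
    """Store the current number of books under the given period label.

    INSERT OR REPLACE makes the job safe to re-run for the same period.
    """
    conn.execute(
        "INSERT OR REPLACE INTO report_data (period, book_count) "
        "SELECT ?, COUNT(id) FROM books",
        (period,),
    )
    conn.commit()


# Simulate two month ends with a hard delete in between.
conn.executemany(
    "INSERT INTO books (name) VALUES (?)",
    [("Book A",), ("Book B",), ("Book C",)],
)
snapshot_book_count(conn, "2009-06")

conn.execute("DELETE FROM books WHERE name = 'Book A'")
snapshot_book_count(conn, "2009-07")

rows = conn.execute(
    "SELECT period, book_count FROM report_data ORDER BY period"
).fetchall()
```

The June total stays available in report_data even though Book A no longer exists in books, and nothing that reads or deletes from books had to change.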

Joe Daley
One negative here is that you can't ever see that data again if you need to see how the number was generated.
dittonamed
+1  A: 

If you needed data from the previous month then you should not have deleted the old data. Instead you can have a "logical delete."

I would add a status field and some dates to the table.

books
_____
id
bookname
date_added
date_deleted
status (active/deleted)

From there you would be able to query:

SELECT count(id) FROM books WHERE date_added <= '2009-06-30' AND (date_deleted IS NULL OR date_deleted > '2009-06-30')

NOTE: It may not be the best schema, but you get the idea... ;)
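
An as-of count on this schema can be sketched as follows, using SQLite through Python's sqlite3 so it runs as-is; the sample rows and the books_as_of helper are invented for illustration. The filter looks at date_deleted rather than the current status, because a book deleted after the cutoff still existed on that date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE books (
        id           INTEGER PRIMARY KEY,
        bookname     TEXT NOT NULL,
        date_added   TEXT NOT NULL,
        date_deleted TEXT            -- NULL while the book is still active
    )
""")
conn.executemany(
    "INSERT INTO books (bookname, date_added, date_deleted) VALUES (?, ?, ?)",
    [
        ("Kept",            "2009-05-01", None),
        ("Deleted in June", "2009-05-20", "2009-06-10"),
        ("Deleted in July", "2009-06-05", "2009-07-15"),
        ("Added in July",   "2009-07-03", None),
    ],
)


def books_as_of(conn, day):
    """Count books that had been added, and not yet deleted, on the given day."""
    return conn.execute(
        "SELECT COUNT(id) FROM books "
        "WHERE date_added <= ? "
        "AND (date_deleted IS NULL OR date_deleted > ?)",
        (day, day),
    ).fetchone()[0]
```

For example, books_as_of(conn, "2009-06-30") counts "Kept" and "Deleted in July" but not "Deleted in June", since that one was already gone by the end of June.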

wenbert
+1  A: 

If changing the schema of the tables is too much work, I would add triggers to track the changes. With this approach you can track all kinds of things, like date added, date deleted, etc.
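
A minimal sketch of the trigger idea, using SQLite through Python's sqlite3 so it runs as-is. The books_audit table, the trigger names, and the books_as_of helper are all hypothetical, and trigger syntax differs somewhat between databases, so treat this as an outline rather than something to paste into MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Hypothetical audit table: one row per insert or delete on books.
    CREATE TABLE books_audit (
        book_id    INTEGER NOT NULL,
        action     TEXT NOT NULL,      -- 'added' or 'deleted'
        changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    CREATE TRIGGER books_after_insert AFTER INSERT ON books
    BEGIN
        INSERT INTO books_audit (book_id, action) VALUES (NEW.id, 'added');
    END;

    CREATE TRIGGER books_after_delete AFTER DELETE ON books
    BEGIN
        INSERT INTO books_audit (book_id, action) VALUES (OLD.id, 'deleted');
    END;
""")

# The books table itself is unchanged, and hard deletes keep working.
conn.execute("INSERT INTO books (name) VALUES ('Book A')")
conn.execute("INSERT INTO books (name) VALUES ('Book B')")
conn.execute("DELETE FROM books WHERE name = 'Book A'")

audit = conn.execute(
    "SELECT book_id, action FROM books_audit ORDER BY rowid"
).fetchall()


def books_as_of(conn, cutoff):
    """Replay the audit log up to the cutoff: +1 per add, -1 per delete."""
    return conn.execute(
        "SELECT COALESCE(SUM(CASE action WHEN 'added' THEN 1 ELSE -1 END), 0) "
        "FROM books_audit WHERE changed_at <= ?",
        (cutoff,),
    ).fetchone()[0]
```

Because the triggers do the bookkeeping, none of the existing code that inserts or deletes books has to know the audit table exists.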

ElGringoGrande
+1  A: 

Looking at your problem and your reluctance to change the schema and the code, I would suggest you go with your idea of counting the books at the end of each month and storing the count for the month in another table. You can use the database scheduler to invoke a stored procedure to do this.

Bhushan
+1  A: 

You have just taken a baby step down the road of history databases or data warehousing.

A data warehouse typically stores data about the way things were, in a format such that later data is added to current data instead of superseding it. There is a lot to learn about data warehousing. If you are headed down that road in a serious way, I suggest a book by Ralph Kimball or Bill Inmon. I prefer Kimball.

Here are the websites: http://www.ralphkimball.com/

http://www.inmoncif.com/home/

If, on the other hand, your first step into this territory is the only step you plan to take, your proposed solution is good enough.

Walter Mitty