I have a few data values that I need to store in my Rails app, and I'd like to know whether there are any alternatives to creating a database table just for this simple task.

Background: I'm writing some analytics and dashboard tools for my Ruby on Rails app, and I'm hoping to speed up the dashboard by caching results that will never change. Right now I pull all users for the last 30 days and rearrange them so I can see the number of new users per day. It works great but takes quite a long time. In reality I should only need to calculate the most recent day and store the rest of the array somewhere else.

Where is the best place to store this array?

Creating a database table seems like overkill, and I'm not sure that global variables are the correct answer. Is there a best practice for persisting data like this?

If anyone has done anything like this before, let me know what you did and how it turned out.

+1  A: 

Using a lightweight database like SQLite shouldn't feel like overkill. Alternatively, you can use a key-value store such as Tokyo Cabinet, or even store the array in a flat file manually, but I really don't see any overkill in using SQLite.

Evgeny Shadchnev
I guess the overkill I feel would come from writing the schema, migrating the database, dealing with two different adapters in my Rails project (currently using MySQL for everything else), and then writing the SQL queries (because these items are not tied to a model), when at the end of the day all I want back is [1,2,3,4,5]. I don't mind doing all of this; I'm just curious to see how others have approached the same scenario.
ThinkBohemian
+4  A: 

Ruby has a built-in Hash-based key-value store named PStore. It provides simple file-based, transactional persistence.
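
As a rough illustration of what that could look like for the array in the question, here is a minimal sketch. The file path and the :new_users_per_day key are made up for the example, and on recent Ruby versions pstore may need to be installed as a gem rather than coming with the standard library.

```ruby
require "pstore"
require "tmpdir"

# Hypothetical file location; in a Rails app this might live under tmp/ or db/.
store_path = File.join(Dir.tmpdir, "dashboard_stats.pstore")
store = PStore.new(store_path)

# Write: everything inside the block is committed together when it finishes.
store.transaction do
  store[:new_users_per_day] = [1, 2, 3, 4, 5]
end

# Read: passing true opens a read-only transaction.
counts = store.transaction(true) { store[:new_users_per_day] }
```

The stored value is serialized with Marshal, so any plain Ruby array or hash of numbers should round-trip without extra work.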

John Topley
I like this a lot; I didn't know it existed. Based on the problem description and your experience, would you recommend this method over the other suggestions of "just use the DB"?
ThinkBohemian
If your use-case is just serializing an array, which it sounds like it is, then why not? If it doesn't work out for you then it's easy to change to another solution down the line.
John Topley
+1  A: 

If you've got a database already, it's really not a big deal to create a separate table for tracking this sort of thing. When doing reporting, it's often to your advantage to create derivative summary tables exactly like what you're describing. You can update these as required using a simple SQL statement and there's no worry that your temporary store will somehow go away.

That being said, the type of report you're trying to generate is actually something that can be done in real time except on extravagantly large data sets. The key is to have indexes that match the exact grouping operation you're trying to do. For instance, if you're grouping by calendar date, you can create a "created_date" column and sync it to the "created_at" time as required. An index on this date column will make a GROUP BY created_date very quick:

SELECT created_date AS on_date, COUNT(id) AS new_users FROM users GROUP BY created_date
tadman
Unfortunately I'm not doing this just for users, but for a few other elements as well, such as the number of emails sent per day, which is in the thousands (per day), so pulling the last 30 days' worth of model data takes forever. Then once I get the objects I have to take created_at (a date/time object) and iterate through them to group the objects. Maybe there is a better way, but I haven't been able to hit the nail on the head yet.
ThinkBohemian
Adding an indexable column, where it's a date and not a date-time, will help when generating the reports in the first place. Adding one day's worth of data at a time is also fairly efficient, even for large volumes, but the set-up time for adding all the historical data can be considerable. Inserting a grouped count should take only a few seconds, and would only have to be done once per day at most, which is easily done as a background job or cron task. Don't actually load the models if all you want to do is count them. Just use SQL directly.
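
The "one day at a time" idea can be sketched in plain Ruby as below. This is only an illustration under assumptions: fill_missing_days is a made-up helper, the Hash stands in for whatever summary table or store holds the per-day counts, and the block stands in for a per-day SQL COUNT query.

```ruby
require "date"

# cache: Hash mapping Date => count for days that were already computed.
# The block is called only for days missing from the cache; in a real app
# it would run something like a COUNT query restricted to that day.
def fill_missing_days(cache, from:, to:)
  (from..to).each do |day|
    next if cache.key?(day)   # already summarized, skip the expensive query
    cache[day] = yield(day)
  end
  cache
end

cache = { Date.new(2010, 1, 1) => 12 }   # one day already summarized
queried = []                             # records which days were recomputed

fill_missing_days(cache, from: Date.new(2010, 1, 1), to: Date.new(2010, 1, 3)) do |day|
  queried << day
  day.day * 10   # placeholder value instead of a real count
end
```

Here only the two missing days reach the block. For a live dashboard, the current (still incomplete) day would presumably need to be recomputed on each run rather than cached.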
tadman