Here is the issue.

I've recently taken over a site that tracks the "miles" you ran in a day. A user can log into the site and add that they ran 5 miles, and this is then added to the database.

At the end of the day, around 1am, a service runs which calculates all the miles that all the users ran that day and outputs a text file to App_Data. That text file is then displayed in Flash on the home page.

I think this is kind of ridiculous. I was told they had to do this due to massive performance issues. They won't tell me exactly how they were doing it before or what the major performance issue was.

So what approach would you guys take? The first thing that popped into my mind was a web service which gets the data via an AJAX call. Perhaps every time a new "mile" entry is added, a trigger is fired and updates the "GlobalMiles" table.

I'd appreciate any info or tips on this.

Thanks so much!

+1  A: 

If they are truly having performance issues due to too many hits on the database, then I suggest that you take all the input and put it into a message queue (MSMQ). Then you can have a service on the other end that picks up the messages and does a bulk insert of the data. This way you have fewer db hits. You can also output to the text file on each update.
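A minimal sketch of the queue-then-bulk-insert idea, with loud assumptions: Python's in-process `queue.Queue` stands in for MSMQ, an in-memory SQLite database stands in for the site's real database, and the `log_miles` and `flush` names are hypothetical. It only illustrates the shape of the approach, not a production setup.

```python
import queue
import sqlite3

# Assumption: queue.Queue plays the role of MSMQ here.
pending = queue.Queue()

def log_miles(user_id, miles):
    """Called from the web request: enqueue only, no database hit."""
    pending.put((user_id, miles))

def flush(conn, max_batch=1000):
    """Called by the background service: drain the queue, then bulk insert."""
    batch = []
    while len(batch) < max_batch:
        try:
            batch.append(pending.get_nowait())
        except queue.Empty:
            break
    if batch:
        with conn:  # one transaction for the whole batch
            conn.executemany(
                "INSERT INTO Miles (UserId, [Count]) VALUES (?, ?)", batch
            )
    return len(batch)

# Assumption: an in-memory SQLite table stands in for the real Miles table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Miles (UserId INTEGER, [Count] REAL)")

log_miles(1, 5)
log_miles(2, 3.5)
log_miles(1, 2)
written = flush(conn)
total = conn.execute("SELECT SUM([Count]) FROM Miles").fetchone()[0]
```

The trade-off is that entries only show up in the totals after the next flush, so the flush interval decides how "real time" the counter is.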

Andrew Siemer
+1  A: 

I would create a summary table that's rolled up once/hour or nightly which calculates total miles run. For individual requests you could pull from the nightly summary table plus any additional logged miles for the period between the last rollup calculation and when the user views the page to get the total for that user.
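Roughly what I mean, as a sketch rather than a finished design. SQLite is used here only so the example runs on its own; the `MilesSummary` table, the `RolledUpTo` column, the `EntryDate` column, and the function names are all made up for illustration, and the timestamps are sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Miles (UserId INTEGER, [Count] REAL, EntryDate TEXT);
CREATE TABLE MilesSummary (
    UserId INTEGER PRIMARY KEY, TotalMiles REAL, RolledUpTo TEXT);
""")

def rollup(conn, now):
    """Scheduled job: rebuild per-user totals for entries up to `now`."""
    with conn:
        conn.execute("DELETE FROM MilesSummary")
        conn.execute(
            "INSERT INTO MilesSummary (UserId, TotalMiles, RolledUpTo) "
            "SELECT UserId, SUM([Count]), ? FROM Miles "
            "WHERE EntryDate <= ? GROUP BY UserId",
            (now, now),
        )

def total_for_user(conn, user_id):
    """Read path: stored summary plus anything logged after the rollup."""
    row = conn.execute(
        "SELECT TotalMiles, RolledUpTo FROM MilesSummary WHERE UserId = ?",
        (user_id,),
    ).fetchone()
    base, since = row if row else (0.0, "")
    delta = conn.execute(
        "SELECT COALESCE(SUM([Count]), 0) FROM Miles "
        "WHERE UserId = ? AND EntryDate > ?",
        (user_id, since),
    ).fetchone()[0]
    return base + delta

# Sample data: two entries before the nightly rollup, one after it.
conn.executemany(
    "INSERT INTO Miles VALUES (?, ?, ?)",
    [(1, 5, "2024-01-01T10:00:00"), (1, 3, "2024-01-01T18:00:00")],
)
rollup(conn, "2024-01-02T01:00:00")
conn.execute("INSERT INTO Miles VALUES (1, 2, '2024-01-02T07:30:00')")
user_total = total_for_user(conn, 1)
```

Only the rows newer than the last rollup are summed at request time, which is what keeps the per-request query small.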

How many users are you talking about and how many log records per day?

jn29098
I don't know how many concurrent users we will have at this point, but I do know the site has over 1 million members. If I get it from the nightly summary, that isn't real time, is it? I don't know if they want the counter to actively change as you view it, but I think they want it to be as up to date as possible when viewers load the page.
Jack Marchetti
Adding the nightly summary to any additional log records they've created since the summary was generated would make the calculation quicker. You could keep the "current" logs in a separate table or SQL Server table partition from those that have already been calculated into the summary values. This would keep the number of rows that have to be dynamically calculated to a minimum on any given day.
jn29098
+1  A: 

Answering this question is a bit difficult since we don't know all of your requirements or what didn't work before. So here are some different ideas.

First, revisit your assumptions. Generating a static report once a day is a perfectly valid solution if all you need is daily reports. Why hit the database multiple times throughout the day if all that's needed is a snapshot? (For instance, lots of blog software used to write html files when a blog was posted rather than serving up the entry from the database each time -- many still do as an optimization.) Is the "real-time" feature something you are adding?

I wouldn't jump to AJAX right away. Use the same input method, just move the report from static to dynamic. Doing too much at once is a good way to get yourself buried. When changing existing code I try to find areas that I can change in isolation with the least amount of impact to the rest of the application. Once you have the dynamic report, you can add AJAX (and please use progressive enhancement).

As for the dynamic report itself you have a few options.

Of course you can just SELECT SUM(), but it sounds like that would cause the performance problems if each user has a large number of entries.

If your database supports it, I would look at using an indexed view (sometimes called a materialized view). The database maintains the stored sums as rows change, so it should allow fast reads of real-time sum data:

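-- SCHEMABINDING requires two-part (schema-qualified) object names.
CREATE VIEW dbo.vw_Miles WITH SCHEMABINDING AS 
SELECT UserId,
SUM([Count]) AS TotalMiles, 
COUNT_BIG(*) AS [EntryCount]
FROM dbo.Miles
GROUP BY UserId
GO
-- The unique clustered index is what materializes the view.
CREATE UNIQUE CLUSTERED INDEX ix_Miles ON dbo.vw_Miles(UserId)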

If the overhead of that is too much, @jn29098's solution is a good one. Roll it up using a scheduled task. If there are a lot of entries for each user, you could add only the delta since the last time the task was run.

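-- COALESCE keeps the total intact when there are no new entries
-- (the subquery would otherwise return NULL and NULL out TotalMiles).
UPDATE GlobalMiles SET [TotalMiles] = [TotalMiles] + 
  COALESCE((SELECT SUM([Count]) 
    FROM Miles 
    WHERE UserId = @id 
      AND EntryDate > @lastTaskRun), 0)
WHERE UserId = @id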

If you don't care about storing the individual entries but only the total you can update the count on the fly:

UPDATE Miles SET [Count] = [Count] + @newCount WHERE UserId = @id

You could use this method in conjunction with the SPROC that adds the entry and have both worlds.
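Here is a sketch of the "both worlds" version: the entry is stored and the running total is bumped in the same transaction, which is roughly what the SPROC would do. The assumptions are mine: SQLite stands in for SQL Server, `add_entry` is a hypothetical name, and `GlobalMiles` is assumed to hold one row per user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Miles (UserId INTEGER, [Count] REAL);
CREATE TABLE GlobalMiles (
    UserId INTEGER PRIMARY KEY, TotalMiles REAL NOT NULL);
""")

def add_entry(conn, user_id, miles):
    """Store the individual entry and update the running total atomically."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "INSERT INTO Miles (UserId, [Count]) VALUES (?, ?)",
            (user_id, miles),
        )
        cur = conn.execute(
            "UPDATE GlobalMiles SET TotalMiles = TotalMiles + ? "
            "WHERE UserId = ?",
            (miles, user_id),
        )
        if cur.rowcount == 0:  # first entry for this user
            conn.execute(
                "INSERT INTO GlobalMiles (UserId, TotalMiles) VALUES (?, ?)",
                (user_id, miles),
            )

add_entry(conn, 1, 5)
add_entry(conn, 1, 2)
add_entry(conn, 2, 3)
site_total = conn.execute(
    "SELECT SUM(TotalMiles) FROM GlobalMiles"
).fetchone()[0]
```

The home page then only reads the small totals table instead of summing every entry.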

Finally, your trigger method would work as well. It's an alternative to the indexed view where you do the update yourself on a table instead of SQL doing it automatically. It's also similar to the previous option, except that you move the global update out of the sproc and into a trigger.

The last three options make it more difficult to handle the situation when an entry is removed, although if that's not a feature of your application then you may not need to worry about that.
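To make the trigger idea and the removal case concrete, here is a sketch with one trigger for inserts and one for deletes. It is written against SQLite so it runs standalone; SQL Server trigger syntax differs (it uses the `inserted` and `deleted` pseudo-tables instead of `NEW` and `OLD`), and the trigger names and per-user `GlobalMiles` layout are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Miles (Id INTEGER PRIMARY KEY, UserId INTEGER, [Count] REAL);
CREATE TABLE GlobalMiles (
    UserId INTEGER PRIMARY KEY, TotalMiles REAL NOT NULL);

-- Add each new entry to the user's running total.
CREATE TRIGGER trg_Miles_Insert AFTER INSERT ON Miles
BEGIN
    INSERT OR IGNORE INTO GlobalMiles (UserId, TotalMiles)
        VALUES (NEW.UserId, 0);
    UPDATE GlobalMiles SET TotalMiles = TotalMiles + NEW.[Count]
        WHERE UserId = NEW.UserId;
END;

-- Subtract a removed entry so the total stays in sync.
CREATE TRIGGER trg_Miles_Delete AFTER DELETE ON Miles
BEGIN
    UPDATE GlobalMiles SET TotalMiles = TotalMiles - OLD.[Count]
        WHERE UserId = OLD.UserId;
END;
""")

with conn:
    conn.executemany(
        "INSERT INTO Miles (UserId, [Count]) VALUES (?, ?)",
        [(1, 5), (1, 2), (2, 3)],
    )
total_after_insert = conn.execute(
    "SELECT SUM(TotalMiles) FROM GlobalMiles"
).fetchone()[0]

with conn:
    conn.execute("DELETE FROM Miles WHERE UserId = 1 AND [Count] = 2")
total_after_delete = conn.execute(
    "SELECT SUM(TotalMiles) FROM GlobalMiles"
).fetchone()[0]
```

Edits to an existing entry would need a third (update) trigger along the same lines.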

Now that you've got materialized, real-time data in your database, you can dynamically generate your report. Then you can add fancy with AJAX.

Talljoe
Yeah, we want a real-time count on the website home page. Basically there is a small area of the site that says: "500,401 miles ran so far." If a user enters 5 miles, then that counter should immediately read: "500,406 miles ran so far." Right now it does a nightly sum, and the client wants it to be a real-time assessment. I haven't figured out why they couldn't do it, or why they ran into so many performance issues. Then again, their architecture is god awful.
Jack Marchetti