views:

257

answers:

3

I have a table like this that stores messages coming through a system:

Message
-------
ID (bigint)
CreateDate (datetime)
Data (varchar(255))

I've been asked to calculate the messages saved per second at peak load. The only data I really have to work with is the CreateDate. The load on the system is not constant, there are times when we get a ton of traffic, and times when we get little traffic. I'm thinking there are two parts to this problem: 1. Determine ranges of time that are considered peak load, 2. Calculate the average messages per second during these times.

Is this the right approach? Are there things in SQL that can help with this? Any tips would be greatly appreciated.

+2  A: 

I agree, you have to figure out what Peak Load is first before you can start to create reports on it.

The first thing I would do is figure out how I am going to define peak load. Ex. Am I going to look at an hour by hour breakdown.

Next I would do a group by on the CreateDate formated in seconds (no milleseconds). As part of the group by I would do an avg based on number of records.

JD
+1  A: 

warning, these will run slow!

this will group your data into "second" buckets and list them from the most activity to least:

SELECT
    CONVERT(char(19),CreateDate,120) AS CreateDateBucket,COUNT(*) AS CountOf
    FROM Message
    GROUP BY CONVERT(Char(19),CreateDate,120)
    ORDER BY 2 Desc

this will group your data into "minute" buckets and list them from the most activity to least:

SELECT
    LEFT(CONVERT(char(19),CreateDate,120),16) AS CreateDateBucket,COUNT(*) AS CountOf
    FROM Message
    GROUP BY LEFT(CONVERT(char(19),CreateDate,120),16)
    ORDER BY 2 Desc

I'd take those values and calculate what they want

KM
Cool, this is similar to what I have so far. Thanks for the answer.
Andy White
+1  A: 

I don't think you'd need to know the peak hours; you can generate them with SQL, wrapping a the full query and selecting the top 20 entries, for example:

select top 20 *
from (
     [...load query here...]
) qry
order by LoadPerSecond desc

This answer had a good lesson about averages. You can calculate the load per second by looking at the load per hour, and dividing by 3600.

To get a first glimpse of the load for the last week, you could try (Sql Server syntax):

select datepart(dy,createdate) as DayOfYear,
       hour(createdate) as Hour, 
       count(*)/3600.0 as LoadPerSecond
from message
where CreateDate > dateadd(week,-7,getdate())
group by datepart(dy,createdate), hour(createdate)

To find the peak load per minute:

select max(MessagesPerMinute)
from (
  select count(*) as MessagesPerMinute
  from message
  where CreateDate > dateadd(days,-7,getdate())
  group by datepart(dy,createdate),hour(createdate),minute(createdate)
)

Grouping by datepart(dy,...) is an easy way to distinguish between days without worrying about month borders. It works until you select more that a year back, but that would be unusual for performance queries.

Andomar