Hey all,

I'm experiencing massive slowness when accessing one of my tables and I need some refactoring advice. Sorry if this is not the correct area for this sort of thing.

I'm working on a project that aims to report on server performance statistics for our internal servers. I'm processing Windows performance logs every night (12 servers, 10 performance counters, logging every 15 seconds). I'm storing the data in a table as follows:

CREATE TABLE [dbo].[log](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [logfile_id] [int] NOT NULL,
    [test_id] [int] NOT NULL,
    [timestamp] [datetime] NOT NULL,
    [value] [float] NOT NULL,
CONSTRAINT [PK_log] PRIMARY KEY CLUSTERED 
(
    [id] ASC
)WITH FILLFACTOR = 90 ON [PRIMARY]
) ON [PRIMARY]

There are currently 16,529,131 rows and the table will keep on growing.

I access the data to produce reports and create graphs from ColdFusion like so:

SET NOCOUNT ON

CREATE TABLE ##RowNumber ( RowNumber int IDENTITY (1, 1), log_id char(9) )

INSERT ##RowNumber (log_id)
SELECT l.id
FROM log l, logfile lf
WHERE lf.server_id = #arguments.server_id#
and l.test_id = #arguments.test_id#
and l.timestamp >= #arguments.report_from#
and l.timestamp < #arguments.report_to#
and l.logfile_id = lf.id
order by l.timestamp asc

select rn.RowNumber, l.value, l.timestamp
from log l, logfile lf, ##RowNumber rn
where lf.server_id = #arguments.server_id#
and l.test_id = #arguments.test_id#
and l.logfile_id = lf.id
and rn.log_id = l.id
and ((rn.rownumber % #modu# = 0) or (rn.rownumber = 1)) 
order by l.timestamp asc

DROP TABLE ##RowNumber

SET NOCOUNT OFF

(for non-CF devs: #value# inserts a variable's value and ## maps to a literal #)

I basically create a temporary table so that I can use the row number to select every xth row. That way I'm only selecting the number of rows I can actually display. This helps, but it's still very slow.

SQL Server Management Studio tells me my indexes are as follows (I have pretty much no knowledge of using indexes properly):

IX_logfile_id (Non-Unique, Non-Clustered)
IX_test_id (Non-Unique, Non-Clustered)
IX_timestamp (Non-Unique, Non-Clustered)
PK_log (Clustered)

I would be very grateful to anyone who could give some advice that could help me speed things up a bit. I don't mind re-organising things and I have complete control of the project (perhaps not over the server hardware though).

Cheers (sorry for the long post)

+1  A: 

A couple things come to mind.

  1. Do you need to keep that much data? If not, consider creating an archive table if you want to keep it (but don't create it just to join it with the primary table every time you run a query).

  2. I would avoid using a temp table with so much data. See this article on temp table performance and how to avoid using them.

http://www.sql-server-performance.com/articles/per/derived%5Ftemp%5Ftables%5Fp1.aspx
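
For what it's worth, if the server is on SQL Server 2005 or later (the mention of Management Studio suggests it may be), a derived table with ROW_NUMBER() can replace the ##RowNumber temp table in a single pass. The sketch below is only an example against the table and column names from the question, with the ColdFusion placeholders left in place:

-- Sketch only (assumes SQL Server 2005+): one query instead of the temp table.
-- ColdFusion #arguments.*# / #modu# placeholders kept as in the question.
SELECT t.RowNumber, t.value, t.timestamp
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY l.timestamp ASC) AS RowNumber,
           l.value,
           l.timestamp
    FROM log l
    INNER JOIN logfile lf ON l.logfile_id = lf.id
    WHERE lf.server_id = #arguments.server_id#
      AND l.test_id = #arguments.test_id#
      AND l.timestamp >= #arguments.report_from#
      AND l.timestamp < #arguments.report_to#
) t
WHERE (t.RowNumber % #modu# = 0) OR (t.RowNumber = 1)
ORDER BY t.timestamp ASC

That avoids both the write to tempdb and the second join back to log.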

  3. It looks like you are missing an index on the server_id field. I would consider creating a covering index using this field and others. Here is an article on that as well.

http://www.sql-server-performance.com/tips/covering%5Findexes%5Fp1.aspx
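
As a rough illustration of the server_id point (the full definition of the logfile table isn't shown in the question, so the column list here is an assumption based on the joins above):

-- Sketch only: supports the lf.server_id filter and the join back on lf.id.
CREATE NONCLUSTERED INDEX IX_logfile_server_id
    ON dbo.logfile (server_id, id)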

Edit

  4. With that many rows added to the table over such a short time frame, I would also check the indexes for fragmentation, which may be a cause of the slowness. In SQL Server 2000 you can use the DBCC SHOWCONTIG command.

See this link for info http://technet.microsoft.com/en-us/library/cc966523.aspx
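
For example, against the table from the question (these are the SQL 2000-era commands; SQL 2005 replaces them with sys.dm_db_index_physical_stats and ALTER INDEX):

-- Report fragmentation for the clustered index on the log table
DBCC SHOWCONTIG ('log', 'PK_log')

-- Defragment it online...
DBCC INDEXDEFRAG (0, 'log', 'PK_log')

-- ...or rebuild all of the table's indexes with a 90% fill factor (takes locks)
DBCC DBREINDEX ('log', '', 90)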


Irwin M. Fletcher
Great links! 1. So far I have 4 months' worth of data; I would ideally like to keep 6 months in the current table, then I could think about archiving. 2. I will run some tests using that method. 3. server_id is part of another table (called logfile): each server has multiple logfiles and each logfile has multiple logs. Can I still include this in the index?
OrganicPanda
No, you could not include this field in a covered index if it is in a different table. I have updated my answer with one other point.
Irwin M. Fletcher
Having run those checks, I see I've got a logical scan fragmentation of 2.3% and an extent scan fragmentation of 0.11%. My page density is at 90%, which, according to that document, is a healthy amount. These numbers look good to me, but I will make sure to add defragmentation as that document recommends.
OrganicPanda
A: 

Once, when still working with SQL Server 2000, I needed to do some paging, and I came across a method of paging that really blew my mind. Have a look at this method.

DECLARE @Table TABLE(
     TimeVal DATETIME
)

DECLARE @StartVal INT
DECLARE @EndVal INT

SELECT  @StartVal = 51, @EndVal = 100

-- Note: TOP with a variable or expression requires SQL Server 2005 or later;
-- on SQL Server 2000, TOP only accepts a literal (or use SET ROWCOUNT instead).
SELECT  *
FROM    (
         -- keep only the last (@EndVal - @StartVal + 1) rows of the reversed set,
         -- i.e. rows @StartVal..@EndVal of the original ordering
         SELECT TOP (@EndVal - @StartVal + 1)
           *
         FROM (
            -- select up to the end row number
            SELECT TOP (@EndVal)
              *
            FROM @Table
            ORDER BY TimeVal ASC
           ) PageReversed
         ORDER BY TimeVal DESC
        ) PageVals
ORDER BY TimeVal ASC

As an example

SELECT  *
FROM    (
      SELECT TOP (@EndVal - @StartVal + 1)
        *
      FROM (
         SELECT TOP (@EndVal)
           l.id,
           l.timestamp
         FROM log l, logfile lf
         WHERE lf.server_id = #arguments.server_id#
         and l.test_id = #arguments.test_id#
         and l.timestamp >= #arguments.report_from#
         and l.timestamp < #arguments.report_to#
         and l.logfile_id = lf.id
         order by l.timestamp asc
        ) PageReversed ORDER BY timestamp DESC
     ) PageVals
ORDER BY timestamp ASC
astander
That looks like a very good way to emulate MySQL's LIMIT functionality that I miss so much! The problem is that I don't know where the data I want is: I have to find data between dates, and the data isn't always in the right order, so I don't think I can use this right now, but I will definitely save it for later.
OrganicPanda
See the edit for an example of how I would try it.
astander
That works surprisingly fast! I'll see if I can use this. Thanks very much!
OrganicPanda
+4  A: 

Your problem is that you chose a bad clustered key. Nobody is ever interested in retrieving one particular log value by ID. If your system is like anything else I've seen, then all queries are going to ask for:

  • all counters for all servers over a range of dates
  • specific counter values over all servers for a range of dates
  • all counters for one server over a range of dates
  • specific counter for specific server over a range of dates

Given the size of the table, all your non-clustered indexes are useless. They are all going to hit the index tipping point, guaranteed, so they might just as well not exist. I assume all your non-clustered indexes are defined as a simple index over the field in the name, with no included fields.

I'm going to pretend I actually know your requirements. You must forget common sense about storage and actually duplicate all your data in every non-clustered index. Here is my advice:

  • Drop the clustered index on [id]; it is as useless as it gets.
  • Organize the table with a clustered index on (logfile_id, test_id, timestamp).
  • Non-clustered index on (test_id, logfile_id, timestamp) include (value)
  • NC index on (logfile_id, timestamp) include (value)
  • NC index on (test_id, timestamp) include (value)
  • NC index on (timestamp) include (value)
  • Add maintenance tasks to reorganize all indexes periodically as they are prone to fragmentation

The clustered index covers the query 'history of a specific counter value at a specific machine'. The non-clustered indexes cover the various other possible queries (all counters at a machine over time, a specific counter across all machines over time, etc.).
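
In DDL terms, the example structure above would look roughly like the sketch below. It is an illustration rather than a prescription: INCLUDE needs SQL 2005 or later, and if anything references log.id you would re-add the primary key as non-clustered.

-- Sketch of the example structure above; adjust names and key order to your real queries.
ALTER TABLE dbo.log DROP CONSTRAINT PK_log   -- removes the clustered index on id

CREATE CLUSTERED INDEX IX_log_logfile_test_time
    ON dbo.log (logfile_id, test_id, [timestamp])

CREATE NONCLUSTERED INDEX IX_log_test_logfile_time
    ON dbo.log (test_id, logfile_id, [timestamp]) INCLUDE (value)

CREATE NONCLUSTERED INDEX IX_log_logfile_time
    ON dbo.log (logfile_id, [timestamp]) INCLUDE (value)

CREATE NONCLUSTERED INDEX IX_log_test_time
    ON dbo.log (test_id, [timestamp]) INCLUDE (value)

CREATE NONCLUSTERED INDEX IX_log_time
    ON dbo.log ([timestamp]) INCLUDE (value)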

You'll notice I did not comment on your query script at all. That is because there isn't anything in the world you can do to make the queries run faster over the table structure you have.

Now, one thing you shouldn't do is actually implement my advice as-is. I said I'm going to pretend I know your requirements, but I actually don't; I just gave an example of a possible structure. What you really should do is study the topic and figure out the correct index structure for your requirements.

Also, a Google search for 'covering index' will bring up a lot of good articles.

And of course, at the end of the day storage is not free, so you'll have to balance the requirement to have a non-clustered index on every possible combination against the need to keep the size of the database in check. Luckily, you have a very small and narrow table, so duplicating it over many non-clustered indexes is no big deal. I also wouldn't be concerned about insert performance: 120 counters at 15-second intervals means 8-9 inserts per second, which is nothing.

Remus Rusanu
Excellent answer! Thank you so much for your advice and links. I will study up on the subject further but you have given me a great start point to go from.
OrganicPanda
Sorry, one last thing. I'm just trying to digest your advice: when you say `Non-clustered index on (test_id, logfile_id, timestamp) include (value)`, do you mean having the NC index on `test_id, logfile_id, timestamp, value`, or is there a special mechanism to `include` a column? Sorry, I'm just a bit confused because it wasn't in the list with the other columns. Thanks again.
OrganicPanda
There is a special mechanism to include columns, but only since SQL 2005. In SQL 2000 you have to add the 'included' column to the index key itself.
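
To make the difference concrete (illustrative index name only):

-- SQL Server 2005 and later: value rides along at the leaf level, outside the key
CREATE NONCLUSTERED INDEX IX_log_test_logfile_time
    ON dbo.log (test_id, logfile_id, [timestamp]) INCLUDE (value)

-- SQL Server 2000: no INCLUDE clause, so value has to go into the key itself
CREATE NONCLUSTERED INDEX IX_log_test_logfile_time
    ON dbo.log (test_id, logfile_id, [timestamp], value)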
Remus Rusanu