tags:

views:

46

answers:

3

I'm using SQL Server 2005.

I have a table whose row size should be 124 bytes. It's all ints or floats, no NULL columns (so everything is fixed width).

There is only one index, clustered. The fill factor is 0.

Here's the table def:

create table OHLC_Bar_Trl
(
    obt_obh_id int NOT NULL REFERENCES OHLC_Bar_Hdr (obh_id),
    obt_bar_start_ms int NOT NULL,
    obt_bar_end_ms int NOT NULL,
    obt_last_price float NOT NULL,
    obt_last_ms int NOT NULL,
    obt_bid_price float NOT NULL,
    obt_bid_size int NOT NULL,
    obt_bid_ms int NOT NULL,
    obt_bid_pexch_price float NOT NULL,
    obt_ask_price float NOT NULL,
    obt_ask_size int NOT NULL,
    obt_ask_ms int NOT NULL,
    obt_ask_pexch_price float NOT NULL,
    obt_open_price float NOT NULL,
    obt_open_ms INT NOT NULL,
    obt_high_price float NOT NULL,
    obt_high_ms INT NOT NULL,
    obt_low_price float NOT NULL,
    obt_low_ms INT NOT NULL,
    obt_volume float NOT NULL,
    obt_vwap float NOT NULL
)
go

create unique clustered index idx on OHLC_Bar_Trl (obt_obh_id,obt_bar_end_ms)

After inserting a ton of data, sp_spaceused returns the following

name            rows        reserved           data               index_size         unused
OHLC_Bar_Trl    117076054   29807664 KB        29711624 KB        92344 KB           3696 KB

which shows a rowsize of approx (29807664*1024)/117076054 = 260 bytes/row.

Where's the rest of the space?

Is there some DBCC command I need to run to tighten up this table (I could not insert the rows in correct index order, so maybe it's just internal fragmentation)?

+1  A: 

To update "space used" statistics, use the 2nd parameter @updateusage of sp_spaceused:

EXEC sp_spaceused 'OHLC_Bar_Trl', 'true'

However, I'd also run ALTER INDEX ALL ON OHLC_Bar_Trl WITH REBUILD first to defrag the data.

gbn
Is that ALTER INDEX going to require enough space for a complete copy f the data? If so, I think I'm toast. In that case I might have to bcp the data out (to a network disk), then bcp it back in. Which, in the olden days, would often achieve my objective.
Eric H.
+1  A: 

You can use sys.dm_db_index_physical_stats to get pretty detailed information on how data is stored in a given table. It's not the clearest thing to use, here's the template I built up over time for my first pass on troubleshooting:

--  SQL 2005 - fragmentation & air bubbles
 SELECT
   ob.name [Table], ind.name [Index], ind.type_desc IndexType
  ,xx.partition_number      PartitionNo
  ,xx.alloc_unit_type_desc  AllocationTyp
  ,xx.index_level
  ,xx.page_count        Pages
  ,xx.page_count / 128  Pages_MB
  ,xx.avg_fragmentation_in_percent  AvgPctFrag
  ,xx.fragment_count
  ,xx.avg_fragment_size_in_pages  AvgFragSize
  ,xx.record_count      [Rows]
  ,xx.forwarded_record_count  [ForwardedRows]
  ,xx.min_record_size_in_bytes        MinRowBytes
  ,xx.avg_record_size_in_bytes        AvgRowBytes
  ,xx.max_record_size_in_bytes        MaxRowBytes
  ,case xx.page_count
     when 0 then 0.0
     else xx.record_count / xx.page_count
   end AvgRowsPerPage
  ,xx.avg_page_space_used_in_percent  AvgPctUsed
  ,xx.ghost_record_count
  ,xx.version_ghost_record_count
 from sys.dm_db_index_physical_stats
   (
     db_id('MyDatabase')
    ,object_id('MyTable')
    ,null
    ,null
    ,'Detailed'
   ) xx
  inner join sys.objects ob
   on ob.object_id = xx.object_id
  inner join sys.indexes ind
   on ind.object_id = xx.object_id
    and ind.index_id = xx.index_id

Use this to check if SQL thinks the row is as long as you think it is, or if there's extra space being used/wasted somewhere.

Philip Kelley
Well, I started to run that, and my machine thrashed almost to a halt. I'm still running the importing job, I'll wait until that finishes and give this a whirl. I assume it reads most of the data pages (at least, that's what I assume given that starting it spiked my SQL Server activity). Looks intesting though, given output from a smaller table.
Eric H.
Read up on sys.dm_db_index_physical_stats in BOL while you're loading. I did't pick out all the information it provides, and there may be stuff in there that applies to your environment.
Philip Kelley
Just finished with a ALTER INDEX REORGANIZE, and the table is about the size I was expecting (i.e. got cut in half).
Eric H.
A: 

For your table, yes, 124 bytes does appear to be the correct row size, and since your clustered index is unique, you shouldn't be wasting space on a uniqueifier. So let's consider how it fits together:

  • Page size = 8 KB (8192 bytes)
    • Header = 96 bytes
    • Available for data = 8096 bytes
  • Row size (fixed data) = 124 bytes
    • Header = 4 bytes
    • Null bitmap = 5 bytes (for 21 columns)
    • Variable data size = 2 (for 0 variable columns)
    • Total = 135 bytes
  • Rows per page = (8096 / 137) = 59
  • Total rows = 117076054
  • Total pages = 117076054 / 59 = 1984440
  • Actual size = 1984440 * 8 KB = 15875520 KB

(Note: calculations are derived from Estimating the Size of a Clustered Index)

So you can see from this that the absolute minimum ratio you'd be able to achieve (using the more simplistic math of total data size / max row size) is approximately 139 bytes per row.

Of course, you say that you're seeing these statistics immediately after inserting a bunch of data - data for which the clustering key is not on an auto-incrementing (IDENTITY or NEWSEQUENTIALID) column and will therefore may not be inserted in a truly sequential fashion. If that's the case, you are probably suffering from a huge number of page splits and need to defragment the clustered index:

ALTER INDEX idx
ON OHLC_Bar_Trl
REORGANIZE -- or REBUILD

Note: I'm not sure if this command is available on SQL Server 2005. The older syntax is:

DBCC INDEXDEFRAG('MyDB', 'OHLC_Bar_Trl', indexnum)

You may also need to shrink the database to reclaim all of the lost space (although most people will recommend against shrinking the data, unless you have a very good reason to do so).

Aaronaught
Yeah, those calcs look right.... do I always need the 5 byte null array (there are no nulls in the table)? Not like 5 bytes is going to break me, either way. I asked in the other answer, I'll ask here too: do you know if the REORG/REBUILD requires enough space to copy the table, or is it done in place? I might have to either BCP out my data and read it back in, or change my inserter program to cache results until it has enough to write them out in order, otherwise. I don't have enough space to do 35Gb table with copies, and god only knows what goes into the log ......
Eric H.
@Eric: That 5 bytes is "reserved space" - if you actually start adding a lot of nullable columns then the number will grow. As for your second question, `ALTER INDEX REORGANIZE` requires some extra space but not the entire size of the table, `REBUILD` probably requires the full unfragmented size of the index. If this data isn't in production already, though, then you can just drop and recreate the index, that's equivalent to rebuilding but won't require any extra space AFAIK.
Aaronaught
ALTER INDEX started with SQL Server 2005
gbn