Jul 02, 2011
 

A customer noticed a sudden and sharp increase in database disk consumption. The alarm that went off was the low disk-space monitor: the database in question had left only a few spare GBs on disk. The concerned customer opened a ticket asking two questions: is the database growth normal and expected, and what is the average physical storage requirement per row?

The answer to the first question had to do with their particular actions and was case-specific. To answer the second, however, one either has to keep track of each table's schema (adding up the typical or maximum size of each field, then accounting for indexes and their sizes), or one could simply do the math on a typical dataset using SQL code. The latter is the simpler and preferred approach.

Google returns quite a number of results (1, 2, 3, 4 and 5). For MS SQL, it seems that virtually all of them rely on the sp_spaceused stored proc. SQL Server has an undocumented sproc, sp_msforeachtable, which runs over each table in the database and executes a command, substituting each table's name for the ? placeholder. While it isn't at all difficult to do this manually (looping over sys.tables is hardly a feat), calling this one-liner is still very convenient. So it's no surprise that virtually all samples online do just that.
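For reference, the manual loop mentioned above might look roughly like the following. This is only a sketch, assuming SQL Server 2005 or later (for sys.tables); the cursor and variable names are made up for illustration.

```sql
-- Sketch: call sp_spaceused once per user table without sp_msforeachtable.
-- @name and table_cursor are illustrative names, not part of any API.
DECLARE @name NVARCHAR(517);

DECLARE table_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT QUOTENAME(SCHEMA_NAME(schema_id)) + N'.' + QUOTENAME(name)
    FROM sys.tables
    WHERE is_ms_shipped = 0;

OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @name;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- One result set per table, same shape as the sp_msforeachtable call.
    EXEC sp_spaceused @objname = @name;
    FETCH NEXT FROM table_cursor INTO @name;
END

CLOSE table_cursor;
DEALLOCATE table_cursor;
```

Because it relies only on documented objects, a loop like this keeps working even if the undocumented sproc changes or disappears in a future version.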

Here is an sproc that prints the total database size, reserved size, data size, index size and unused sizes. In addition, the sproc prints the same numbers for each table with the total number of rows in all tables at the end.

My prime interest wasn’t just to learn about the database size, which can be achieved using sp_spaceused without any params, nor to just learn about each table’s share, which can be done by passing the table name in question to sp_spaceused. My main purpose was to get a breakdown of the average row-size per table.
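As a quick illustration of those two calls, here is a minimal sketch. The table name dbo.Orders is hypothetical; substitute one of your own.

```sql
-- Whole database: database size and unallocated space, followed by a second
-- result set with reserved, data, index_size and unused.
EXEC sp_spaceused;

-- Single table: 'dbo.Orders' is a hypothetical name.
EXEC sp_spaceused @objname = N'dbo.Orders';

-- Optionally refresh the usage counts first (can be slow on large tables).
EXEC sp_spaceused @objname = N'dbo.Orders', @updateusage = N'TRUE';
```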

So, here is a similar script to do exactly that. The script first updates the page and row counts for the whole database (which may take a long time, so disable that step on production databases). It then collects the per-table figures, calculates the totals and averages of each data-point across all tables, and calculates the average data size (data + index) and wasted bytes (reserved + unused) per row for each table. All the information for the tables is returned by a single join statement as a single rowset with all the relevant data.

-- Copyright (c) 2011, Ashod Nakashian
-- All rights reserved.
-- 
-- Redistribution and use in source and binary forms, with or without modification,
-- are permitted provided that the following conditions are met:
-- 
-- o Redistributions of source code must retain the above copyright notice, 
-- this list of conditions and the following disclaimer.
-- o Redistributions in binary form must reproduce the above copyright notice, 
-- this list of conditions and the following disclaimer in the documentation and/or
-- other materials provided with the distribution.
-- o Neither the name of the author nor the names of its contributors may be used to endorse
-- or promote products derived from this software without specific prior written permission.
--
-- THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
-- EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
-- OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT 
-- SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 
-- INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
-- PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
-- INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
-- LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-- OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--
-- Show physical size statistics for each table in the database.
--
SET NOCOUNT ON

-- Update all page and count stats.
-- Comment this out for large tables on production; it may take a long time!
DBCC UPDATEUSAGE(0) 

-- Total DB size.
EXEC sp_spaceused

-- Per-table statistics.
DECLARE @t TABLE
( 
    [name] NVARCHAR(128),
    [rows] BIGINT, 
    [reserved] VARCHAR(18), 
    [data] VARCHAR(18), 
    [index_size] VARCHAR(18),
    [unused] VARCHAR(18)
)

-- Collect per-table data in @t.
INSERT @t EXEC sp_msForEachTable 'EXEC sp_spaceused ''?'''

-- Calculate the averages and totals.
-- BIGINT avoids an arithmetic overflow in SUM once the total passes ~2 TB (in KB).
INSERT INTO @t
SELECT 'Average', AVG(rows),
    CONVERT(varchar(18), AVG(CAST(SUBSTRING([reserved], 0, LEN([reserved]) - 1) AS bigint))) + ' KB',
    CONVERT(varchar(18), AVG(CAST(SUBSTRING([data], 0, LEN([data]) - 1) AS bigint))) + ' KB',
    CONVERT(varchar(18), AVG(CAST(SUBSTRING([index_size], 0, LEN([index_size]) - 1) AS bigint))) + ' KB',
    CONVERT(varchar(18), AVG(CAST(SUBSTRING([unused], 0, LEN([unused]) - 1) AS bigint))) + ' KB'
FROM   @t
UNION ALL
SELECT 'Total', SUM(rows),
    CONVERT(varchar(18), SUM(CAST(SUBSTRING([reserved], 0, LEN([reserved]) - 1) AS bigint))) + ' KB',
    CONVERT(varchar(18), SUM(CAST(SUBSTRING([data], 0, LEN([data]) - 1) AS bigint))) + ' KB',
    CONVERT(varchar(18), SUM(CAST(SUBSTRING([index_size], 0, LEN([index_size]) - 1) AS bigint))) + ' KB',
    CONVERT(varchar(18), SUM(CAST(SUBSTRING([unused], 0, LEN([unused]) - 1) AS bigint))) + ' KB'
FROM   @t

-- Holds per-row average kbytes.
DECLARE @avg TABLE
( 
    [name] NVARCHAR(128),
    [data_per_row] VARCHAR(18),
    [waste_per_row] VARCHAR(18)
)

-- Calculate the per-row average data in kbytes.
INSERT INTO @avg
SELECT t.name, 
    CONVERT(varchar(18),
        CONVERT(decimal(20, 2),
            (CAST(SUBSTRING(t.[data], 0, LEN(t.[data]) - 1) AS float) +
             CAST(SUBSTRING(t.[index_size], 0, LEN(t.[index_size]) - 1) AS float)) 
            / NULLIF([rows], 0))) + ' KB', 
    CONVERT(varchar(18),
        CONVERT(decimal(20, 2),
            (CAST(SUBSTRING(t.[reserved], 0, LEN(t.[reserved]) - 1) AS float) +
             CAST(SUBSTRING(t.[unused], 0, LEN(t.[unused]) - 1) AS float))
            / NULLIF([rows], 0))) + ' KB'
FROM @t t

-- Join the two tables using the table names.
SELECT t.name, t.rows, t.reserved, t.data, t.index_size, t.unused, a.data_per_row, a.waste_per_row
FROM @t t
INNER JOIN @avg a ON t.name = a.name

There is quite a bit of data conversion and casting that isn’t necessarily very performant, but here there isn’t much choice and optimizing further is probably unnecessary. But since there are so many different ways to get the same output, I’ll leave any variations up to the readers. Suggestions and improvements are more than welcome. Please use comments to share your alternatives.
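One possible variation, sketched here under the assumption that the sys.dm_db_partition_stats view is available (SQL Server 2005 and later) and that the caller has permission to view database state, reads the page counts directly as numbers and so avoids the string parsing entirely. It is not a drop-in replacement: it reports used pages rather than separate data and index figures, it keeps the schema name so tables with the same name in different schemas stay distinct, and its numbers may differ slightly from sp_spaceused, which also accounts for some internal objects. The column aliases are my own.

```sql
-- Sketch: per-table size and average KB per row from page counts (8 KB pages).
SELECT
    s.name AS schema_name,
    t.name AS table_name,
    -- Count rows only once per table: from the heap (0) or clustered index (1).
    SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count,
    SUM(ps.reserved_page_count) * 8 AS reserved_kb,
    SUM(ps.used_page_count) * 8 AS used_kb,
    CONVERT(decimal(20, 2),
        SUM(ps.used_page_count) * 8.0
        / NULLIF(SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END), 0)
    ) AS used_kb_per_row
FROM sys.dm_db_partition_stats AS ps
INNER JOIN sys.tables AS t ON t.object_id = ps.object_id
INNER JOIN sys.schemas AS s ON s.schema_id = t.schema_id
GROUP BY s.name, t.name
ORDER BY reserved_kb DESC;
```

Empty tables yield NULL for used_kb_per_row, mirroring the NULLIF guard in the script above.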

This may easily be wrapped in an sproc for convenience. I hope you find it useful and handy.
