In my SQL table:

Period| Brand A small Bags| Brand A big bags| Brand D Shoes| ...|  Brand X Shoes
2010  |   10              | 20              | 30           | ...| 200           

How do I sum the columns whose names contain a certain word (e.g. shoes)?

Expected results:

Period | Sum of Bags | Sum of Shoes | ..
2010   | 30          | 230          | ..
+10  A: 

You can't do this without either hard-coding the column names or using dynamic SQL that queries the INFORMATION_SCHEMA views. Can you restructure your data, perhaps to something like the following?

Brands (BrandId, BrandName)
ItemTypes (ItemId, ItemName)
Stock(BrandId, ItemId, Period, Quantity)
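As a rough sketch of what that could look like in T-SQL: the column types, keys and the ItemName values below are my own assumptions (only the table and column names come from the suggestion above), and the multi-row VALUES syntax assumes SQL Server 2008 or later. The sample quantities are the ones from the question.

```sql
-- Hypothetical types and constraints; only the table/column names come from the answer.
CREATE TABLE Brands    (BrandId int PRIMARY KEY, BrandName nvarchar(50) NOT NULL);
CREATE TABLE ItemTypes (ItemId  int PRIMARY KEY, ItemName  nvarchar(50) NOT NULL);
CREATE TABLE Stock (
    BrandId  int NOT NULL REFERENCES Brands(BrandId),
    ItemId   int NOT NULL REFERENCES ItemTypes(ItemId),
    Period   int NOT NULL,
    Quantity int NOT NULL,
    PRIMARY KEY (BrandId, ItemId, Period)
);

-- The 2010 row from the question, one Stock row per former column.
INSERT INTO Brands    VALUES (1, N'Brand A'), (2, N'Brand D'), (3, N'Brand X');
INSERT INTO ItemTypes VALUES (1, N'Small Bags'), (2, N'Big Bags'), (3, N'Shoes');
INSERT INTO Stock     VALUES (1, 1, 2010, 10), (1, 2, 2010, 20),
                             (2, 3, 2010, 30), (3, 3, 2010, 200);

-- The wildcard now runs against data (ItemName), not against column names.
SELECT s.Period,
       SUM(CASE WHEN i.ItemName LIKE N'%Bags%'  THEN s.Quantity ELSE 0 END) AS [Sum of Bags],
       SUM(CASE WHEN i.ItemName LIKE N'%Shoes%' THEN s.Quantity ELSE 0 END) AS [Sum of Shoes]
FROM Stock AS s
JOIN ItemTypes AS i ON i.ItemId = s.ItemId
GROUP BY s.Period;
```

With that sample data the query returns 2010, 30, 230, and adding a new brand means inserting rows rather than changing the query.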

Edit

There are probably plenty more, but here are some reasons why your current table structure is problematic.

  • SQL Server allows at most 1,024 columns in an ordinary (non-wide) table, so you can't keep adding a column for every brand/item permutation that ever occurs.

  • If a brand makes bags and shoes you will be repeating the information about the brand multiple times. Any time such information is repeated there is a possibility that slight differences and anomalies creep in. Additionally if a brand changes their name you need to update the column and all code you have that references it.

  • You can't do simple calculations such as summing all bags without hard-coding all the permutations and remembering to update that code when you add a new brand.

  • You can't do some queries at all (or at least not without great difficulty), such as returning the names of all brands that supply shoes.
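To illustrate the last point, here is a minimal sketch of the "brands that supply shoes" query against the restructured tables. The inline CTEs are only stand-ins for the Brands, ItemTypes and Stock tables so the snippet runs on its own; the ItemName values are assumptions, the quantities are taken from the question, and the VALUES row constructor assumes SQL Server 2008 or later.

```sql
-- Stand-in data for the suggested tables (hypothetical ItemName values).
WITH Brands AS (
    SELECT BrandId, BrandName
    FROM (VALUES (1, 'Brand A'), (2, 'Brand D'), (3, 'Brand X')) AS v (BrandId, BrandName)
),
ItemTypes AS (
    SELECT ItemId, ItemName
    FROM (VALUES (1, 'Small Bags'), (2, 'Big Bags'), (3, 'Shoes')) AS v (ItemId, ItemName)
),
Stock AS (
    SELECT BrandId, ItemId, Period, Quantity
    FROM (VALUES (1, 1, 2010, 10), (1, 2, 2010, 20),
                 (2, 3, 2010, 30), (3, 3, 2010, 200)) AS v (BrandId, ItemId, Period, Quantity)
)
-- Every brand that has at least one stock row for shoes.
SELECT DISTINCT b.BrandName
FROM Brands AS b
JOIN Stock AS s     ON s.BrandId = b.BrandId
JOIN ItemTypes AS i ON i.ItemId  = s.ItemId
WHERE i.ItemName = 'Shoes';
```

This returns Brand D and Brand X. With the one-column-per-brand layout the same answer can only be obtained by reading the column names themselves.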

Martin Smith
How do I do it with dynamic SQL? Thank you :)
marilyn
@marilyn - Whilst it's possible with dynamic SQL it is not advisable! The best approach would be to restructure your data as suggested above rather than trying to extract this data by parsing the column names.
Martin Smith
"You can't do some queries at all (or at least not without great difficulty)" -- nail on the head: if you can't write simple SQL DML such as 'sum for shoes in period 2010' then your SQL DDL is wrong.
onedaywhen
A: 
SELECT PERIOD, [Brand A small Bags] + [Brand A big bags] [Sum of Bags] ... etc
rdkleine
+2  A: 

You can't use wildcards on column names (only on content).

The number of columns shouldn't change, so you can construct a query along the lines of:

select period, sum([Brand A small bags]) + sum([Brand A big bags]) as [sum of bags],
   sum([Brand D shoes]) + sum([Brand X shoes]) as [sum of shoes]
from yourtable
group by period

But it would be better to restructure your data, as Martin Smith suggests.

Hans Kesting
+1  A: 

Well, I wanted to try to solve this as asked as an exercise for myself because I'm learning SQL in more depth. I think the problems with the given schema are well-enough documented for me to not say anything else about that.

For this code to work, the columns in the source data must be physical columns (i.e., not a table variable or columns in a view). You could abstract this code into functions and whatnot if there are more types than just bags and shoes. I just wanted to get the algorithmic stuff down. There are many other caveats for this to work properly, but as has been mentioned, denormalized data has a full set of caveats as well.

So here we go:

EDIT: Version 2. Thanks to Martin for helping with this. That is a really neat trick, although it's probably a rare thing to have to use.

I'm assuming the column names will match the search pattern (it's hard-coded enough, so why bother checking).

DECLARE @sql nvarchar(max)
SET @sql = N'SELECT Period, '


-- Build column sum for bags.
-- QUOTENAME brackets each name and escapes any ']' inside it.
DECLARE @bagsColumns nvarchar(max)

SELECT
    @bagsColumns = COALESCE(@bagsColumns + N' + ', N'') + QUOTENAME(COLUMN_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = 'SumTest' AND COLUMN_NAME LIKE '%bags%'

SET @sql = @sql + @bagsColumns + N' AS [Sum of Bags], '


-- Build column sum for shoes
DECLARE @shoesColumns nvarchar(max)

SELECT
    @shoesColumns = COALESCE(@shoesColumns + N' + ', N'') + QUOTENAME(COLUMN_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = 'SumTest' AND COLUMN_NAME LIKE '%shoes%'

SET @sql = @sql + @shoesColumns + N' AS [Sum of Shoes]'


-- If either pattern matched no columns, @sql is NULL at this point.
SET @sql = @sql + N' FROM SumTest'

EXEC(@sql)
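
To try the script above, a test table along these lines should do. This is only a sketch: the int column types are my assumption, the column names and values are the ones from the question, and it assumes a case-insensitive collation so that '%bags%' also matches "Bags".

```sql
-- Hypothetical test table matching the layout in the question.
CREATE TABLE SumTest (
    Period               int NOT NULL,
    [Brand A small Bags] int NOT NULL,
    [Brand A big bags]   int NOT NULL,
    [Brand D Shoes]      int NOT NULL,
    [Brand X Shoes]      int NOT NULL
);

-- The single 2010 row shown in the question.
INSERT INTO SumTest VALUES (2010, 10, 20, 30, 200);
```

Against that table the dynamic statement adds the two bag columns and the two shoe columns, giving 30 and 230 for 2010. The order in which the columns appear inside the generated expression is not guaranteed, which does not matter for addition.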
Jon Seigel
You can avoid a cursor `DECLARE @BagSum nvarchar(max); SELECT @BagSum = COALESCE(@BagSum + '+','SELECT Period,') + '[' + COLUMN_NAME + ']' FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'BagsAndShoes' AND COLUMN_NAME LIKE '%bags%'; IF (@@ROWCOUNT > 0) BEGIN SET @BagSum = @BagSum + ' AS SumOfBags FROM BagsAndShoes' ; EXEC sp_executesql @BagSum END`
Martin Smith
@Martin: Thank you very much for the tip! As I said, I'm still learning. I updated my answer with version 2.
Jon Seigel
Looks good. It would be possible to do it with one query of `INFORMATION_SCHEMA.COLUMNS` (below) but it may be clearer as you have it.
Martin Smith
`DECLARE @Bg nvarchar(max); DECLARE @Sh nvarchar(max); DECLARE @Dyn nvarchar(max); SELECT @Bg = CASE WHEN COLUMN_NAME LIKE '%Bags%' THEN ISNULL(@Bg + '+','') + '[' + COLUMN_NAME + ']' ELSE @Bg END, @Sh = CASE WHEN COLUMN_NAME LIKE '%Shoes%' THEN ISNULL(@Sh + '+','') + '[' + COLUMN_NAME + ']' ELSE @Sh END FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'SumTest' AND (COLUMN_NAME LIKE '%Bags%' OR COLUMN_NAME LIKE '%Shoes%'); SET @Dyn = 'SELECT Period' + ISNULL(',' + @Bg + ' AS SumOfBags', '') + ISNULL(',' + @Sh + ' AS SumOfShoes', '') + ' FROM SumTest'; EXEC sp_executesql @Dyn`
Martin Smith
@Martin: Yeah, I think the 2nd version is clearer, and easier to abstract into a function taking (tablename, column match string), though the 3rd version is more compact. I'd go for clarity here because performance is most likely a non-issue.
Jon Seigel
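For what it's worth, the "function taking (tablename, column match string)" idea could be sketched roughly as follows. The name dbo.BuildColumnSum and its parameters are hypothetical, it ignores the table's schema just as the code above does, and it relies on the same variable-concatenation trick. GO is the batch separator used by SSMS and sqlcmd, and the inline DECLARE initialiser assumes SQL Server 2008 or later.

```sql
-- Hypothetical helper: returns e.g. [col1] + [col2] for all matching columns.
CREATE FUNCTION dbo.BuildColumnSum (@tableName sysname, @pattern nvarchar(128))
RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @expr nvarchar(max);

    -- Append each matching column name, bracketed and escaped by QUOTENAME.
    SELECT @expr = COALESCE(@expr + N' + ', N'') + QUOTENAME(COLUMN_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = @tableName AND COLUMN_NAME LIKE @pattern;

    RETURN @expr;  -- NULL when no column matches; callers should check for that
END;
GO

-- Usage against the SumTest table from the answer.
DECLARE @sql nvarchar(max) =
      N'SELECT Period, '
    + dbo.BuildColumnSum(N'SumTest', N'%bags%')  + N' AS [Sum of Bags], '
    + dbo.BuildColumnSum(N'SumTest', N'%shoes%') + N' AS [Sum of Shoes] FROM SumTest';

EXEC sp_executesql @sql;
```

Adding another item type is then one more call to the helper instead of another copy of the SELECT block.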