First of all, there's already a problem here because the data is not normalized. Creating any sort of index on a bunch of text columns is something that should be avoided whenever possible. Even if these columns aren't text (and I suspect that they are), it still doesn't make sense to have artist, album and song in the same table. A much better design for this would be:
Artists (
ArtistID int NOT NULL IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
ArtistName varchar(100) NOT NULL)
Albums (
AlbumID int NOT NULL IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
ArtistID int NOT NULL,
AlbumName varchar(100) NOT NULL,
CONSTRAINT FK_Albums_Artists FOREIGN KEY (ArtistID)
REFERENCES Artists (ArtistID))
Songs (
SongID int NOT NULL IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
AlbumID int NOT NULL,
SongName varchar(100) NOT NULL,
NumberOfListens int NOT NULL DEFAULT 0
CONSTRAINT FK_Songs_Albums FOREIGN KEY (AlbumID)
REFERENCES Albums (AlbumID))
Once you have this design, you have the ability to search for individual albums and artists as well as songs. You can also add covering indexes to speed up queries, and the indexes will be much smaller and therefore faster than the original design.
If you don't need to do range queries (which you probably don't), then you could replace the IDENTITY
key with a ROWGUID
if that suits your design better; it doesn't really matter much in this case, I would stick with the simple IDENTITY
.
You have to be careful with clustering keys. If you cluster on a key that is completely not even remotely sequential (and an artist, album, and song name definitely qualify as non-sequential), then you end up with page splits and other nastiness. You don't want this. And as Marc says, a copy of this key gets added to every index, and you definitely don't want this when your key is 300 or 600 bytes long.
If you want to be able to quickly query for the number of listens for a specific song by the artist, album, and song name, it's actually quite simple with the above design, you just need to index properly:
CREATE UNIQUE INDEX IX_Artists_Name ON Artists (ArtistName)
CREATE UNIQUE INDEX IX_Albums_Artist_Name ON Albums (ArtistID, AlbumName)
CREATE UNIQUE INDEX IX_Songs_Album_Name ON Songs (AlbumID, SongName)
INCLUDE (NumberOfListens)
Now this query will be fast:
SELECT ArtistName, AlbumName, SongName, NumberOfListens
FROM Artists ar
INNER JOIN Albums al
ON al.ArtistID = ar.ArtistID
INNER JOIN Songs s
ON s.AlbumID = al.AlbumID
WHERE ar.ArtistName = @ArtistName
AND al.AlbumName = @AlbumName
AND s.SongName = @SongName
If you check out the execution plan you'll see 3 index seeks - it's as fast as you can get it. We've guaranteed the exact same uniqueness as in the original design and optimized for speed. More importantly, it's normalized, so both an artist and an album have their own specific identity, which makes this a great deal easier to manage over the long term. It's much easier to search for "all albums by artist X." It's much much easier and faster to search for "all songs on album Y."
When designing a database, normalization should be your first concern, indexing should be your second. And you're likely to find that once you have a normalized design, the best indexing strategy becomes kind of obvious.