views:

35

answers:

2

Here's a simplified example.

UserDetails (UserID INT UserGUID UNIQUEIDENTIFIER Name VARCHAR(50) Age INT)
UserRatings (UserID INT Rating INT Date DATETIME)
UserVotes   (UserID INT Votes INT Date DATETIME)

The UserGUID only exists in the main table.

In the application layer only GUID's are parsed around, never INTS. This is to prevent competitors from guessing at some key numbers on my site like (e.g. user count).

In the SPROCS relating to tables UserRatings and UserVotes, I have to DECLARE UserID and SET it based on the UserGUID that is getting parsed back at the start of every procedure.

This occurs not just for the Users table but for every other object that has a main table and branched tables so there are tons of these DECLARE/SET type procedures all over the place.

Is this a good design?

+1  A: 

I would remove the uniqueidentifier UserGUID column and just use a the plain int UserID. If you want to make the IDs random to confuse/obfuscate, then just remove the identity and use some function/procedure that randomly generates a plain int with a value between 1 and 2,147,483,647 every time you INSERT a new user.

KM
Won't that mess with the indexes assuming I have clustered index on the UserID column
Nai
@Nai: yes, if you suddenly have random values for your UserID INT, you have the same problems as if you use a GUID as your Primary/Clustering key: due to the randomness, the table and index fragmentation will go through the roof....
marc_s
@marc_s: So I would need a RandomUserID column together with a UserID column to make it work? One for display and one for processing?
Nai
+3  A: 

The UserGUID only exists in the main table. In the application layer only GUID's are parsed around, never INTS. This is to prevent competitors from guessing at some key numbers on my site like (e.g. user count).

Seems quite reasonable to me.

In the SPROCS relating to tables UserRatings and UserVotes, I have to DECLARE UserID and SET it based on the UserGUID that is getting parsed back at the start of every procedure.

You could also define a view spanning UserDetails+UserRatings, as well as UserDetails+UserVotes (or even all three tables at once), which include the GUID from the UserDetails table. This would make querying the tables easier and you wouldn't have to first extract the ID from the GUID and use that to query the table in question.

Update: if you e.g. need to query UserRatings frequently, you could create a view like this:

CREATE VIEW dbo.UserRatingsWithDetails
AS 
    SELECT 
       ud.UserGuid, ud.Name, ud.Age,
       ur.UserID, ur.Rating, ur.Date
    FROM
       dbo.UserDetails ud
    INNER JOIN
       dbo.UserRatings ur ON ur.UserID = ud.UserID

and then you can select from that view without running a separate SELECT first:

SELECT UserID, Name, Rating, Date
  FROM dbo.UserRatingsWithDetails
 WHERE UserGuid = @SomeGuidValue

With the right indices (on the foreign key field in UserRatings), this is a very highly performant JOIN - no worries here!

marc_s
Im not worried about maintainability but rather performance/speed. I really dont like the fact that I need to run a SELECT statement each time to get the UserID each time but it seems unavoidable?
Nai
@Nai: see my update to my answer - no extra SELECT needed - just a nicely created VIEW - works wonders!
marc_s
@marc_s: okay, just to play devil's advocate right. So we have the table UserRatingsWithDetails, if I were to do my DECLARE/SET jiggy again, the WHERE clause will be searching on an indexed column as opposed to a GUID which makes for bad indexing. I guess this is where testing takes over?
Nai
@marc_s: basically, would the extra DECLARE/SET which is able to make use of an INT datatype in the WHERE clause outperform a straight WHERE on a GUID datatype column.
Nai
@Nai: impossible to answer - you'll need to check that by testing the two approaches against each other and see which one is faster in your case.
marc_s
@Nai: you can always put an index on your GUID column - yes, it will be fragmented, but as long it's not the **clustered** index, that's not all that bad, really. With this, searching directly for your UserRatings and specifying the GUID might be quite good on performance.
marc_s
@marc_s: thats what I was thinking as well. thanks!
Nai