I'm not fully getting all i want from Google analytics so I'm making my own simple tracking system to fill in some of the gaps.
I have a session key that I send to the client as a cookie. This is a GUID. I also have a surrogate IDENTITY int column.
I will frequently have to access the session row to make updates to it during the life of the client. Finding this session row to make updates is where my concern lies.
I only send the GUID to the client browser:
a) i dont want my technical 'hacker' users being able to guage what 'user id' they are - i.e. know how many visitors we have had to the site in total
b) i want to make sure noone messes with data maliciously - nobody can guess a GUID
I know GUID indexes are inefficnent, but I'm not sure exactly how inefficient. I'm also not clear how to maximize the efficiency of multiple updates to the same row.
I don't know which of the following I should do :
- Index the GUID column and always use that to find the row
- Do a table scan to find the row based on the GUID (assuming recent sessions are easy to find). Do this by reverse date order (if thats even possible!)
- Avoid a GUID index and keep a hashtable in my application tier of active sessions :
IDictionary<GUID, int>
to allow the 'secret' IDENTITY surrogate key to be found from the 'non secret' GUID key.
There may be several thousand sessions a day.
PS. I am just trying to better understand the SQL aspects of this. I know I can do other clever thigns like only write to the table on session expiration etc., but please keep answers SQL/index related.