I have an application that sends data based on user interaction (not user input). Each value sent could be an Integer, String, Date, or Boolean. There are 140 keys, and we may get anywhere from 1 key-value pair to all 140 at a time.
We want to store everything, but the application itself will only use 20 of the 140 keys. The remaining keys will be used for an audit trail later on, so we still need to store them.
The application uses this data to decide where the user needs to go, so it must look up records by student id and pull the 20 or so values within milliseconds. There could be billions of rows (this is an upgrade to an existing application with over 20,000 users), so performance is critical. Each user generates a new row every time they access the application.
EXAMPLE DATA:
Score:1
ID:3212
IsLast:False
Action:Completed
I have 2 ideas on how to do this and am looking for some help on which is best, or whether a third option would be a better choice.
OPTION 1:
My first idea is to store every value as a string in a single column, with a look-up table of possible data types that tells the application how to cast the value when it is needed.
value | dataType
-----------------------
"1" | int
"Completed" | string
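To make the casting step concrete, here is a minimal sketch of what I have in mind. It is written in Python only to keep it short (the real application is C#), and the `CASTERS` map, the `cast_value` helper, the `date` and `bool` type names, and the ISO date format are all assumptions for illustration, not existing code.

```python
from datetime import datetime

# Hypothetical map of dataType name -> converter. "int" and "string" mirror
# the table above; "date" and "bool" are assumed names for the other two types.
CASTERS = {
    "int": int,
    "string": str,
    # Assumes dates are stored as YYYY-MM-DD text.
    "date": lambda s: datetime.strptime(s, "%Y-%m-%d").date(),
    # Compare the text explicitly: bool("False") would be True in Python.
    "bool": lambda s: s.strip().lower() == "true",
}

def cast_value(value, data_type):
    """Convert a stored string back to the type named in data_type."""
    try:
        caster = CASTERS[data_type]
    except KeyError:
        raise ValueError(f"unknown dataType: {data_type!r}") from None
    return caster(value)

# Rows as (value, dataType) pairs, taken from the example data.
rows = [("1", "int"), ("Completed", "string"), ("False", "bool")]
typed = [cast_value(v, t) for v, t in rows]
print(typed)  # [1, 'Completed', False]
```

Every read of a value pays for this conversion, and a bad string only fails at read time rather than at insert time.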
Although the data being sent is not user generated, I know there must be a gotcha somewhere in this method. The only reasons for doing it this way are that we don't know which key:value pairs will be sent (apart from date and id) and that we are trying to avoid more than a few columns.
The SO question "How to Handle Unknown Data Type in one Table" uses a similar idea.
OPTION 2:
The other solution is to have 140 columns, one for each key. However, the amount of data generated is very large (billions of rows), so I don't think querying it will be fast enough.
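For reference, this is roughly the shape Option 2 would take. The sketch uses SQLite through Python purely so that it runs as-is; the real target is SQL Server. The table name, column names, index, and the "Started" value are placeholders, only three of the 140 key columns are shown, and I am assuming the ID in the example data is the student id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One typed, nullable column per key (only three of the 140 shown).
conn.execute("""
    CREATE TABLE interaction (
        row_id     INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL,
        created    TEXT    NOT NULL,
        score      INTEGER,
        is_last    INTEGER,
        action     TEXT
    )
""")
# Hypothetical index supporting "latest row for a student" look-ups.
conn.execute(
    "CREATE INDEX ix_interaction_student ON interaction (student_id, created)"
)

# A visit that sent three keys, then a later visit that sent only one.
conn.execute(
    "INSERT INTO interaction (student_id, created, score, is_last, action) "
    "VALUES (?, ?, ?, ?, ?)",
    (3212, "2012-01-01T10:00:00", 1, 0, "Completed"),
)
conn.execute(
    "INSERT INTO interaction (student_id, created, action) VALUES (?, ?, ?)",
    (3212, "2012-01-02T09:30:00", "Started"),
)

# Pull only the columns the application needs from the student's newest row.
latest = conn.execute(
    "SELECT score, is_last, action FROM interaction "
    "WHERE student_id = ? ORDER BY created DESC LIMIT 1",
    (3212,),
).fetchone()
print(latest)  # (None, None, 'Started')
```

Keys that were not sent simply stay NULL, and no casting is needed on read because each column already has its own type.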
Technical Details: This is using SQL Server 2008 (not R2) with .NET (C#) and Reporting Services.
Am I missing something here? What is the best way to create this table for performance?