views:

85

answers:

2

need help with logging all activities on a site as well as database changes.

requirements: * should be in database * should be easily searchable by initiator (user name / session id), event (activity type) and event parameters

i can think of a database design but either it involves a lot of tables (one per event) so i can log each of the parameters of an event in a separate field OR it involves one table with generic fields (7 int numeric and 7 text types) and log everything in one table with event type field determining what parameter got written where (and hoping that i don't need more than 7 fields of a certain type, or 8 or 9 or whatever number i choose)...

example of entries (the usual things):

[username] login failed @datetime
[username] login successful @datetime
[username] changed password @datetime, estimated security of password [low/ok/high/perfect]  @datetime
[username] clicked result [result number] [result id] after searching for [search string] and got [number of results] @datetime
[username] clicked result [result number] [result id] after searching for [search string] and got [number of results]  @datetime
[username] changed profile name from [old name] to [new name]  @datetime
[username] verified name with  [credit card type] credit card  @datetime
datbase table [table name] purged of old entries @datetime via automated process

etc...

so anyone dealt with this before? any best practices / links you can share?

i've seen it done with the generic solution mentioned above, but somehow that goes against what i learned from database design, but as you can see the sheer number of events that need to be trackable (each user will be able to see this info) is giving me headaches, BUT i do LOVE the one event per table solution more than the generic one.

any thoughts?

edit: also, is there maybe an authoritative list of such (likely) events somewhere?

thnx

stack overflow says: the question you're asking appears subjective and is likely to be closed.
my answer: probably is subjective, but it is directly related to my issue i have with designing a database / writing my code, so i'd welcome any help. also i tried narrowing down the ideas to 2 so hopefully one of these will prevail, unless there already is an established solution for these kinds of things.

+1  A: 
  1. Logging database changes as far as inserts/deletes/updates, as far as best practices go, is usually done by a trigger on the main table writing entries into a audit table (one audit table per real table, with identical columsn + when/what/who columns).

  2. The list of events as a generic list doesn't exist. It's really a function of your application/framework/environment/business needs. As far as best practices, it's a good idea to decide if your event type list is 100% flat, a 2-level hierarchy (type/subtype - this is usually the best approach) or an N-level hierarchy (much harder/less efficient to implement but incredibly flexible and offers very nice possibilities for proper enterprise event management - I had participated in implementation of all 3 schemes, so I speak from practice BTW).

  3. You don't need 7 generic int fields in 1 table to store event details. Instead go for tag-value-pair table:

 EVENT_TYPES: (event_type, event_subtype, description, subtype_attr1, ...)
 EVENTS: (event_id, event_type, event_subtype, timestamp, attrib1, ...)
 EVENT_DETAILS: (event_id, tag, int_value, varchar_value, float_value).
 

EVENT_DETAILS can be normalized into EVENT_DETAILS_INT, EVENT_DETAILS_VARCHAR, EVENT_DETAILS_FLOAT, ... if you wish but not really required.

attrib1-atttribN in EVENTS table are generic attributes that apply to all/most events, such as userid, hostname, pid, etc...

EVENT_TYPES is a table describing assorted event types/subtypes.

Depending on how you decided bullet point #2, this table may store flat list of types, a list of type/subtype mappings as in my example, or a hierarchy of parent type/child type (you will need 2 tables for that, one for parent/child mapping of types and one for type attributes of each type).

You may want to have another auxiliary table EVENT_TYPE_ATTRIBUTES mapping event types to valid tags for EVENT_DETAILS.


EXAMPLE:

event: [username] clicked result [result number] [result id] after searching for [search string] and got [number of results] @datetime

This would result in data similar to this (not actual SQL syntax, sue me :):

EVENT_TYPES: (USER_ACTION, USER_CLICK, "User clicked something")
EVENTS: (12345, "USER_ACTION","USER_CLICK", @datetime, "[username]", 
         "app_name", "pid"...) 
EVENT_DETAILS: several rows:
 (12345, "result_number", 33, NULL, NULL) // Or go into EVENT_DETAILS_INT without NULLs? 
 (12345, "result_id", 919292, NULL, NULL)  
 (12345, "search_string", NULL, "how do I log events in DB", NULL)
DVK
i'm sorry, **i think i get your idea somewhat and i do like it**, just am not sure with the table 0 and table 1. for example if i had an event: **[username] clicked result [result number] [result id] after searching for [search string] and got [number of results] @datetime** could you maybe update your answer for this one? i do not get the event type and event subtype especially what would go to type and what to subtype and therefore what goes to table 0 what to table 1. thank you for what seems like a really elegant solution.
b0x0rz
oh and i'll be also using the point 1 that you suggested!
b0x0rz
TYPE would be possibly USER_ACTION and SUBTYPE might be USER_CLICK. But this is REALLY REALLY dependent on your problem domain and only you can really determine.
DVK
yes, i like that. thank you again. accepted the answer.
b0x0rz
The EVENTS table would get an entry (12345, "USER_ACTION","USER_CLICK", "[username]", "app_name", "pid"...)and EVENT_DETAILS would have rows such as (12345, "result_number", 33), (12345, "result_id", 919292), etc.... - hope
DVK
THNX for such a detailed info!!
b0x0rz
A: 

Did you have a look at MongoDB yet? It's very fast on inserts as it doesn't have a few features that relation databases like MySQL have, and of course you can search on any fields you have in your data. Actually most companies start to use it for logging and analytics first before handling more complex applications with it.

Thomas R. Koll
i am using mssql.
b0x0rz