I'm working on a .NET data logging application that needs to accept data from a large number of clients and store it in a database.

The client sends a start event to the server and then sends heartbeat events, causing the last activity time of the data to be updated. I can't use an end event because the client app may be closed with no chance to send such an event.

A simple approach would be to do a db insert on the start event and then a db update on each heartbeat, but that would be very db-intensive with a heartbeat every few seconds from each of a large number of clients. The updates would also become expensive as the database table gets large.

Thus, I am looking at caching the data in memory and then flushing it to the database when a client has stopped sending heartbeats.

So I need a suitable data structure and strategy for:

  • Creating a session object when a client sends a start event
  • Efficiently updating the objects when heartbeat events are received
  • Identifying sessions that have timed out and saving them to the database

I am thinking something along the lines of a hashtable in memory that is periodically iterated by a timer triggered event to check for timed out sessions.
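A minimal sketch of that idea, written in Python for brevity rather than .NET. Every name in it (`Session`, `SessionCache`, `sweep`, the `save` callback) is hypothetical, and the timeout value and injected clock are assumptions made only for illustration. It keeps one entry per client in a dictionary, updates it in place on each heartbeat, and hands timed-out sessions to a save callback during a periodic sweep.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Session:
    client_id: str
    started_at: float
    last_seen: float


class SessionCache:
    """In-memory session table keyed by client id (hypothetical names throughout)."""

    def __init__(self, timeout_seconds: float,
                 save: Callable[[Session], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout_seconds
        self._save = save      # stand-in for the real database insert
        self._clock = clock    # injectable so the timeout logic can be tested
        self._lock = threading.Lock()
        self._sessions: Dict[str, Session] = {}

    def start(self, client_id: str) -> None:
        """Create a session; flush any earlier session for the same client."""
        now = self._clock()
        with self._lock:
            previous = self._sessions.get(client_id)
            self._sessions[client_id] = Session(client_id, now, now)
        if previous is not None:
            self._save(previous)

    def heartbeat(self, client_id: str) -> None:
        """Update the last activity time with a single dictionary lookup."""
        now = self._clock()
        with self._lock:
            session = self._sessions.get(client_id)
            if session is None:
                # Assumption: a heartbeat with no known start opens a new session.
                session = Session(client_id, now, now)
                self._sessions[client_id] = session
            session.last_seen = now

    def sweep(self) -> List[Session]:
        """Remove timed-out sessions and save them; call this from a timer."""
        now = self._clock()
        with self._lock:
            expired = [s for s in self._sessions.values()
                       if now - s.last_seen > self._timeout]
            for s in expired:
                del self._sessions[s.client_id]
        # Save outside the lock so slow database writes do not block heartbeats.
        for s in expired:
            self._save(s)
        return expired
```

In .NET the same shape would probably be a `ConcurrentDictionary` plus a `System.Threading.Timer` that calls the sweep. The sweep is O(n) in the number of live sessions, which should be acceptable if it runs every few seconds rather than on every heartbeat.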

Does that make sense or is there a better approach to this kind of problem?

A: 

Depending on the nature of the logs, the standard approach is to write them to disk as fast as possible. Most logging applications need to support scenarios where keeping the logs is very important or required by law. Bonus points for an independent system that is write-only from the perspective of the application environment being logged.

The logs are then processed, aggregated or analysed at a lower priority.
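As a rough illustration of that split (a sketch only, in Python, with a hypothetical JSON-lines file format and made-up function names), the write path appends each event and forces it to disk, while a separate lower-priority pass reduces the raw log to the last activity time per client:

```python
import json
import os
from typing import Dict


def append_event(path: str, client_id: str, kind: str, timestamp: float) -> None:
    """Append one event as a JSON line and push it to disk before returning."""
    record = json.dumps({"client": client_id, "kind": kind, "ts": timestamp})
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")
        f.flush()
        # fsync trades throughput for durability; drop it if losing the
        # last few events on a crash is acceptable.
        os.fsync(f.fileno())


def last_activity(path: str) -> Dict[str, float]:
    """Lower-priority pass: latest timestamp seen for each client."""
    latest: Dict[str, float] = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            event = json.loads(line)
            client, ts = event["client"], event["ts"]
            latest[client] = max(latest.get(client, ts), ts)
    return latest
```

The write path never reads or updates existing data, which is what keeps it cheap; the cost is that results are only as fresh as the last processing pass.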

Of course your use case can be different.

Unreason
I had considered logging to flat log files and processing later but we need to provide almost immediate access to the data, so I ruled that out. Thanks for the suggestion though!
DanK
Also, to confirm, the data in question does not have legal issues that require it to be preserved.
DanK
@DanK, OK then, I think your approach is fine. One semantic suggestion: don't call the data logs. If you need it ASAP for further processing, it is not really a log by nature.
Unreason