views:

156

answers:

9

I'm running an ASP.NET app in which I have added an insert/update query to the [global] Page_Load. So, each time the user hits any page on the site, it updates the database with their activity (session ID, time, page they hit). I haven't implemented it yet, but this was the only suggestion given to me as to how to keep track of how many people are currently on my site.

Is this going to kill my database and/or IIS in the long run? We figure that the site averages between 30,000 and 50,000 users at one time. I can't have my site constantly locking up over a database hit with every single page hit for every single user. I'm concerned that's what will happen, however this is the first time I have attempted a solution like this so I may just be overly paranoid.

+2  A: 

Do it Async.

Create a dll that handles the update, and in the page load do a fire and forget with parameters.

chris
A vanilla "fire and forget", such as with BeginInvoke or ThreadPool.QueueUserWorkItem(), still requires a thread from the ASP.NET thread pool, and will still adversely effect performance on a loaded system.
RickNZ
A: 

I think updating the database in the begin session and end session will do the job. that will reduce the count of statements dramatically.

I think it makes no difference if you track hits or begin/end session. with hits you'll also need additional logic to subtract inactive users

EDIT: session end is not fired always. I would suggest to call an update statement/stored procedure in another session begin event (in addition to the other insert statement) that will fix invalid sessions.

I don't think that calling this "fix routine" is necessary in every page load event because I think you cant exactly count "current no. of visitors".

Arthur
Well that's how I got to this point. We were doing that in the first place but Session_End was not always firing. My user counts were going up and up but never coming down until we restarted IIS. Lots of searching showed me that Session_End isn't reliable. That's why I set it up in the page load; it adds or updates the user and then searches for users inactive in the last 20 minutes and deletes them.
mlb
A: 

What's wrong with IIS Logs?

2009-05-01 12:30:31 207.219.27.35 GET /assocadmin/ibb-reg.asp - usernameremoved 544.566.570.575 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+SLCC1;+.NET+CLR+2.0.50727;+Media+Center+PC+5.0;+.NET+CLR+3.5.30729;+.NET+CLR+3.0.30618) 200 0 0 40058

EDIT: I'd like to close this answer, but I want the comments to stay. Consider this answer withdrawn.

chris
It's requested that it be shown on the site at all times...like "3498 Users Currently Online".
mlb
That's a really important requirement. How about posting a question asking how to do that? Someone may have implemented this already, and have a really good idea for it.
chris
Hypothetically, would anyone notice if you just made up the number?
erikkallen
chris - I actually have posted this on three forums and the only answer I can get is to use Session_Start/Session_End, and like I said that wasn't working for us. I can certainly try again though.
mlb
+2  A: 

Insert-Based designs have less locking than Update-Based designs.

So if a user logged-in and then logged-out, in an Insert-Based design you would have multiple rows with a SessionID in each, one for each activity whereas in an Update-Based design, you would have a SessionId, LoginTime and a LogoutTime column and you would update the LogoutTime based on the SessionId.

I have seen many more locking and contention problems caused by Update activity more than Insert activity.

Activities such as counting and linking logins to logouts etc take more complex queries and a little more resources.

It goes without saying that your queries, especially the ones that run on every page, should be as fast as possible so that the site doesn't appear slow to users.

Raj More
While it's true that an insert-only approach might result in less DB contention, it also makes the counting / selecting take longer. More importantly, though, this answer doesn't address the OPs question about whether running this type of query in a page load is a good idea.
RickNZ
+1  A: 

To keep track of how many users are currently on your site you could use performance counters. What you describe though sounds more like a full fledged logging of every page hit.

Lets say you realy have 50k users connected at any one time.

As long as you don't have contention between the updates (trying to lock the same record) a database can track a very high number of inserts and updates. You need to do some capacity planning to assure the load can be carried. 50k users visiting a page every minute will give you 50k inserts and 50k updates per minute, roughly 850 inserts and 850 updates per second, which have to commit (flush the log). Does your DB I/O subsytem support such a write pressure load, in addition to responding to all the requests (reads)?

Also 50k users doing 1 page hit per minute adds up to 72 mil hits per day, 72 mil. logging inserts, at such a rate you need to carefully plan the size capacity of the database and consider what kind of analysis you'll do on the collected data since querying ad-hoc 2 billion rows (one month data) will get you nowhere fast (actually... quite slow).

Doing it async can give you some relief over very short spikes, but not on the long run. If your DB system cannot handle the load then doing async calls will just create a backlog queue in the application process (in the ASP app pool) and this will grow until out of memory, at which moment the all vigilant IIS will 'recycle' the app pool, thus loosing all pending async updates.

Remus Rusanu
A: 

I would keep this in Application state instead - if possible. On ApplicationStart create some data structure saved to App state that you can update from anywhere in your application - session start, page load, wherever. Keep it out of the database. You are just using it to track "currently online" info anyway it sounds like.

If you have multiple instances of your app, or if there is a requirement to maintain historical info beyond the IIS logs, this won't work obviously. Go with chris' fire-and-forget solution in that case.

ScottTx
Thanks Scott, but can you tell me how to keep track of when the user session times out? As I said, Session end doesn't always fire and is giving me false data. The ultimate issue is when a user's session has ended.
mlb
A: 

How about adding a small object to the session?

Something like LoggedInUserFlag:IDisposable

In the constructor, increment your counter however you decide to implement it.

Then in the Dispose method, decrement the counter.

This way, regardless of how the session is ended, the counter will always be (eventually) decremented.

see:

http://weblogs.asp.net/cnagel/archive/2005/01/23/359037.aspx

for info on using IDisposable.

chris
A: 

I am not an ASP guy at all, but what about rather than logging all that other info, and insert their IP address?

If they have an IP address already in there, have a last_seen timestamp, and on each refresh just delete any row that isn't 10 minutes ago?

This is how I would take a shot at it. It is much more space efficient, but I am not sure about the checking and deleting so much on such a high profile site.

Garrett
If the OP doesn't care too much about accuracy, using IP addresses is certainly one way to go. However, unfortunately there are environments where many users share a single IP, such as when the requests come through a proxy (AOL, for example). There are also cases where very large geographical regions filter their traffic through a small number of proxies. Also, IPs can change, even within a single session.
RickNZ
A: 

As a direct answer to your question, yes, running a database query in-line with every request is a bad idea:

  1. Synchronous requests will tie up a thread, which will reduce your scalability (fewer simultaneous activities)
  2. DB inserts (or updates) are writes to the DB, which will put a load on your log volume
  3. DB accesses shouldn't be required in a single server / single AppPool scenario

I answered your question about how to count users in the other thread:

http://stackoverflow.com/questions/1822921/best-way-to-keep-track-of-current-online-users

If you are operating in a multi-server / load-balanced environment, then DB accesses may in fact be required. In that case:

  1. Queue them to a background thread so the foreground request thread doesn't have to wait
  2. Use Resource Governor in SQL 2008 to reduce contention with other DB accesses
  3. Collect several updates / inserts together into a single batch, in a single transaction, to minimize log disk I/O pressure
  4. Return the current count with each DB access, to minimize round-trips

In case it's of any interest, I cover sync/async threading issues and the techniques above in detail in my book, along with code examples: Ultra-Fast ASP.NET.

RickNZ