+1  Q: 

Caching for poll?

I'm building a poll widget using ASP.NET controls and Linq-to-Sql for a high-traffic site. The widget is actually already built, but it does not use caching yet.

This poll can work in a multi-poll mode, which means that on each page load the control will hit the database to find any polls that the current user has not taken. There are also several database hits on the postback: a check to make sure the user has not taken the poll, a hit to write the result to the database, and a final series of hits to tally the results.

Update: I've reworded these questions:

  1. Would it be appropriate for a control such as a Poll to hit the database on every page hit? How would this performance scale up to a size of, say, 20,000 users? Assume the setup has 2 servers, a load balancer, modern multi-core CPUs, and 2 GB of RAM.

  2. What type of caching would you look to employ for this scenario? Take into account that, for example, any number of people could take the poll over any interval of time, and the total number of people who have taken the poll is needed to compute the results. More problematically, on every load the code must hit the database to find the polls that the user hasn't taken.

I have some ideas but would like to get some additional expert feedback. Thanks.

Update:

So, let us go over a scenario for caching. One could cache the Polls (the questions) but would probably still need to hit the database for the PollsTaken (the users' responses). One possibility would be to create a shadow copy, writing both to in-memory storage and to database storage.

One could use a refresh scheme to dump the cache when a user successfully submits a poll (when the data changes). A cookie could be used to prevent multiple takes, although it would be susceptible to gaming.

I would like to see more details on the schemes offered. For example, how you would use output caching, caching the Linq-to-Sql results, etc. Not just generalities.

A: 

Your first question is completely ambiguous and certainly cannot be answered without knowing the complete design of your database.

As for caching, that depends on your architecture as well, and on how important it is to ensure that a user hasn't already taken a poll. Any time you implement caching you run the risk of stale data, and you have to weigh the risks that come with it.

That being said, if stale data isn't that important, I'd try to identify when you really need to talk to the database, what can be done in idle time, and what is constant.

  1. Constant info, such as rules, poll questions, etc., can and should be cached.

  2. Business logic should be verified at time of execution. You can partially cache it, but it's always a good idea to double-check before you let someone act on it. For example, you could cache a list of polls that someone hasn't taken, but when that person actually wants to take a poll, verify again that this specific poll hasn't been taken.
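
As a rough illustration of point 2, here is a minimal sketch in Python (chosen only to keep the logic short; it is not ASP.NET code). The names `taken_in_db`, `untaken_cache`, `untaken_polls` and `submit_vote` are hypothetical, and plain in-memory structures stand in for the real database and cache. The cached list is used for display, while the submit path re-checks the authoritative store.

```python
# Hypothetical in-memory stand-ins for the database and the cache.
taken_in_db = {("alice", 1)}   # (user, poll_id) pairs already recorded
all_poll_ids = [1, 2, 3]
untaken_cache = {}             # user -> list of poll ids shown as untaken


def untaken_polls(user):
    """Return the (possibly stale) cached list of polls the user has not taken."""
    if user not in untaken_cache:
        untaken_cache[user] = [
            p for p in all_poll_ids if (user, p) not in taken_in_db
        ]
    return untaken_cache[user]


def submit_vote(user, poll_id):
    """Re-check against the authoritative store before accepting the vote."""
    if (user, poll_id) in taken_in_db:
        return False               # duplicate, e.g. a browser reload re-posting
    taken_in_db.add((user, poll_id))
    untaken_cache.pop(user, None)  # drop the stale list so it is rebuilt
    return True
```

The cache only saves the read on page load; the write path still costs one lookup, which is the price of not trusting possibly stale data.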

brian brinley
+1  A: 

How many database hits can SQL Server/ASP.NET handle before slowing down on a "typical" server?

It's impossible to answer this question, especially since you don't give any information, not even the version of SQL Server you use.

What is a "typical server"? What precisely are the database hits you are talking about? How is the database designed? What is the speed of the network between the SQL Server machine and the ASP.NET server machine? What is the actual bottleneck? What does SQL Profiler say? (And there are dozens of other questions like these to ask.)

What type of caching for this scenario would you look to employ?

Since you want to reduce the number of requests to the database, take into account that:

  • Once you load the list of polls that the current user has not taken, you don't have to reload this list until the user takes a poll.
  • If you cache the list of polls to take, then on a postback you don't have to check whether the user has already taken the poll: if it is in the list, then it's fine.

Finally, I don't think you can avoid the hit to the database when the results are saved. But the hits used to fetch the results may be unnecessary: since the user just completed the poll, your application already knows the results.

MainMa
@Mainma "if you cache the list of polls to take, on a postback, you don't have to check if user has taken poll". This is not true. If you hit reload in Firefox, it will resubmit the data and the submit button's event will be triggered again. The only sure way to avoid multiple poll takes is to hit the database (or an in-memory database) and verify that the user has not taken the poll. A cookie would not be a sure thing but could be an in-between solution.
Curtis White
@Curtis White: when the poll is submitted for the first time, the cached data must of course be updated. So on second submit, the poll will be marked as already submitted, thus avoiding multiple submission.
MainMa
+2  A: 

How many database hits can SQL Server/ASP.NET handle before slowing down on a "typical" server?

Define typical server. Can I assume a dual quad-core server with 64 GB of RAM and enough disks to handle the IO load (say, space for 24 disks for a standalone system)? That is my typical standalone front-end server / performance database system. You will find that a lot of other people here have other "typical" servers that vary widely.

What type of caching for this scenario would you look to employ?

The old rule is: cache as much as you can, as early as you can. IIS output caching beats everything else, data caching beats hitting the database, and so on.

So, try to cache as much as possible through IIS output caching.

Update:

So, let us go over a scenario for caching. One could cache the Polls (the questions) but would still need to probably hit the database for the PollsTaken (the users responses).

No, you can also cache the results for, let's say, one minute. Do you really think people are interested in the LAST SECOND ACTUAL RESULT on an active poll? Delivering new results every minute (or every 15 seconds, etc.) is totally fine.

And it will SIGNIFICANTLY reduce server load.
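
To make the idea concrete, here is a minimal sketch of a time-limited results cache, written in Python purely for brevity (it is not the ASP.NET cache API). `TimedResultCache` and `load_results` are hypothetical names; `load_results` stands for whatever query tallies the votes. Within the time window every request is served from memory, so the tally query runs at most once per poll per window.

```python
import time


class TimedResultCache:
    """Serve poll results from memory and refresh them at most once per TTL."""

    def __init__(self, ttl_seconds, load_results, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.load = load_results   # hypothetical function running the tally query
        self.clock = clock         # injectable so the expiry can be tested
        self.entries = {}          # poll_id -> (expires_at, results)

    def get(self, poll_id):
        now = self.clock()
        entry = self.entries.get(poll_id)
        if entry is not None and entry[0] > now:
            return entry[1]        # still fresh enough: no database hit
        results = self.load(poll_id)
        self.entries[poll_id] = (now + self.ttl, results)
        return results
```

With a 60 second window, the number of tally queries depends on the number of active polls, not on the number of visitors.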

TomTom
@TomTom I've updated the question and will add more information.
Curtis White
I added more info to my answer.
TomTom
+1 for "Delivering new results every minute (or every 15 seconds, etc.) is totally fine." Properly identifying what, when, and for how long to cache is critical and sensitive to business context.
JustLoren
A: 

I would cache the following:

  • A list of all poll IDs (key would be something like "all_polls")
  • All polls with their results (key would be "poll_<ID>")
  • Lists of the IDs of polls each user has completed (key would be "users_polls_<USER_ID>")

On page load, get the list of all poll IDs, filter it against the list of IDs of polls this user has completed, and then request the remaining polls from the cache by ID.

On postback I would expire the key for the poll and the key for the user, as well as submitting to the database. On the next request, the keys would be missing from the cache and they would be recreated from the database with the updated results. If you want to, you can also update the results in the cache directly on the postback, but the normal solution is to just expire the keys.
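
The key scheme and the expire-on-postback step above can be sketched as follows. This is a Python illustration only, with plain dictionaries standing in for both the shared cache and the database; `cached`, `load_poll`, `polls_for` and `submit` are hypothetical helper names, and the sample polls are made up.

```python
import copy

cache = {}   # stand-in for the shared cache
db_polls = {  # stand-in for the polls table, with made-up sample data
    1: {"question": "Tabs or spaces?", "votes": {"tabs": 0, "spaces": 0}},
    2: {"question": "Vim or Emacs?", "votes": {"vim": 0, "emacs": 0}},
}
db_taken = {}  # user_id -> set of completed poll ids


def cached(key, loader):
    """Cache-aside read: load from the 'database' only when the key is missing."""
    if key not in cache:
        cache[key] = loader()
    return cache[key]


def load_poll(poll_id):
    # deepcopy so the cached value is a snapshot, as it would be in a real cache
    return cached(f"poll_{poll_id}", lambda: copy.deepcopy(db_polls[poll_id]))


def polls_for(user_id):
    """Page load: all poll ids, minus the ones this user has completed."""
    all_ids = cached("all_polls", lambda: sorted(db_polls))
    done = cached(f"users_polls_{user_id}",
                  lambda: set(db_taken.get(user_id, ())))
    return [load_poll(pid) for pid in all_ids if pid not in done]


def submit(user_id, poll_id, choice):
    """Postback: write to the 'database', then expire the two affected keys."""
    db_polls[poll_id]["votes"][choice] += 1
    db_taken.setdefault(user_id, set()).add(poll_id)
    cache.pop(f"poll_{poll_id}", None)
    cache.pop(f"users_polls_{user_id}", None)
```

Only the two keys touched by the vote are dropped; "all_polls" and every other poll and user entry stay cached.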

The problem with this is mainly that you are using two web servers, so you cannot just cache these items in each server's own memory. A poll updated on one web server would not be expired on the other unless you employ some form of communication between the servers to synchronize their caches.

I would recommend using an external cache, and in cases like this I use memcached myself. If you install a memcached server on each of the web server hosts and configure the application to use both of them, each key is stored on exactly one of the two, so both web servers always see the same cached data.
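
The reason two memcached instances behave as one cache is that the client, not the server, decides where a key lives. A minimal sketch of that idea, in Python and with simple modulo hashing, is below. The host names are hypothetical, and real clients use their own key distribution schemes (often consistent hashing), so treat this only as an illustration of the principle.

```python
import hashlib

SERVERS = ["web1:11211", "web2:11211"]  # hypothetical memcached hosts


def server_for(key, servers=SERVERS):
    """Map a cache key to one server; every client using this rule agrees."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]
```

Because both web servers compute the same mapping, expiring "poll_42" from either of them removes the one and only copy.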

For C# you can use the Enyim Memcached Client (http://memcached.enyim.com/) to connect to the server, and the Northscale Memcached Server (http://www.northscale.com/products/memcached.html) for the servers.

Both the Enyim and Northscale tools are free (open source) and both are very stable and very usable in production. And, no, I'm not employed by either company :-)

AHM
@AHM Thanks. Sorry you didn't get the credit; I didn't realize anyone else had answered. This is closer to what I expected.
Curtis White
No problem, I'm just happy to be of assistance. Hope it helps :-)
AHM