views: 783
answers: 6

I'm in the process of developing a social network site.

I've been thinking about scalability from day one of the project, and I've fine-tuned the site and its queries to the best of my ability.

However, certain pages are very data-heavy and I'm not quite sure they're loading as fast as they could, so I was thinking of implementing a distributed caching solution.

But I'm not quite sure what I should and shouldn't cache, or whether a current page load time of 1 second is good or bad.

The heaviest query is grabbing member information: it gets all of the member's info and anything related to them, such as (in this site's case) their goals, blog-type entries, encouragements, photos, status updates (like Twitter), blog info (for cross-posting their entries), and so on.

Anyhow, should I cache this info? And do you think 1-second page load times are reasonably fast? Some pages take less than a second, between 0.4 and 0.6 seconds.

+1  A: 

The page loading question was already asked:

What is considered a good response time for a dynamic, personalized web application?

In terms of caching, you have to measure the time spent loading the data fresh each time versus loading it from the cache. The bigger the cache, the less effective it becomes, so you don't want to load too much data into it. What we use here is a rolling cache: the least recently used data is dropped once we hit the cache size limit. You then tune the limit according to actual performance results.
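The rolling, least-recently-used cache described above can be sketched in a few lines. This is a minimal in-process example, not the answerer's actual implementation; the `max_size` limit and the `load_member` loader are hypothetical stand-ins for your own data layer:

```python
from collections import OrderedDict

class LRUCache:
    """Rolling cache: the least recently used entry is evicted at the size limit."""
    def __init__(self, max_size):
        self.max_size = max_size
        self._data = OrderedDict()

    def get(self, key, loader):
        if key in self._data:
            self._data.move_to_end(key)    # mark as most recently used
            return self._data[key]
        value = loader(key)                # cache miss: load from the database
        self._data[key] = value
        if len(self._data) > self.max_size:
            self._data.popitem(last=False) # drop the least recently used entry
        return value

# Usage: tune max_size against measured hit rates.
cache = LRUCache(max_size=2)
load_member = lambda member_id: {"id": member_id}  # stand-in for a DB query
cache.get(1, load_member)
cache.get(2, load_member)
cache.get(1, load_member)  # refreshes member 1
cache.get(3, load_member)  # evicts member 2, the least recently used
```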

Elie
+1  A: 

For user-profile-specific data, store as much as you can in a FormsAuth ticket / cookie. Caching (HttpContext.Current.Cache) user-specific items will consume server resources per user, the same as session state would (but with less of the headache). If you can offload as much as possible into the user's ticket or cookie (it's limited to about 4 KB), you will really help your server's performance while scaling.
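The general idea, independent of ASP.NET's FormsAuth API, is to sign the per-user payload so the client can safely carry it. A minimal sketch using only the standard library; the secret key and profile fields are illustrative, not part of any real framework:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # illustrative; keep the real key out of source control

def make_cookie(profile: dict) -> str:
    """Serialize and HMAC-sign user data so it can live in the client's cookie."""
    payload = base64.urlsafe_b64encode(json.dumps(profile).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def read_cookie(cookie: str) -> dict:
    """Verify the signature before trusting anything that came from the client."""
    payload, sig = cookie.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("tampered cookie")
    return json.loads(base64.urlsafe_b64decode(payload))

cookie = make_cookie({"user": "alice", "theme": "dark"})
```

The server then rebuilds the profile from the cookie on each request instead of holding per-user state in memory.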

You should only be retrieving the bits of information that the page needs. For example, if you are loading blog data, DON'T load photos. It's more work, yes, but if you want to scale you will have to analyze the needs of each page.

Make sure you understand that there are differences between database query caching (execution-plan re-use), object caching (HttpContext.Cache), and page caching (Response.Headers[Expires]).

StingyJack
+2  A: 

The typical answer is:

  • Cache information which is rarely updated.
  • Don't cache what changes frequently.

In your case, you could cache everything in flat files (one file per user, for example) and destroy a user's cache file whenever something is updated by the corresponding user. If the cache file does not exist, you create it before displaying the associated content.
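That flat-file scheme fits in a few lines. A minimal sketch; the cache directory and the `build_profile` callback (the heavy multi-table query) are illustrative stand-ins:

```python
import json
import os

CACHE_DIR = "cache"  # illustrative location for the per-user cache files
os.makedirs(CACHE_DIR, exist_ok=True)

def cache_path(user_id):
    return os.path.join(CACHE_DIR, f"user_{user_id}.json")

def get_profile(user_id, build_profile):
    """Serve the cached file if present; otherwise build it and store it."""
    path = cache_path(user_id)
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    profile = build_profile(user_id)  # the expensive query runs only on a miss
    with open(path, "w") as f:
        json.dump(profile, f)
    return profile

def invalidate(user_id):
    """Call whenever the user updates anything; the next view rebuilds the file."""
    try:
        os.remove(cache_path(user_id))
    except FileNotFoundError:
        pass
```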

Now, about load time (which can vary a lot depending on the user's location), here are some informative numbers from the PHP forum of a gaming site:

  • JS Load Time: 0.274
  • Query Count: 15
  • PHP Load Time: 0.0524
  • Memory Usage: 1.013 MB

And this is considered by the community as a good experience. But it is terribly subjective.

Veynom
+1  A: 

I'd implement caching at each and every layer of your application if at all possible.

You can cache pages at the highest level, objects at the code level, and ensure your database is caching both queries and key data correctly at the lowest level.

In terms of WHAT you need to cache, any objects that will be repeatedly accessed should be cached, especially those that are unlikely to change very often. You can then reset an object's cache only when it is edited. (Be cautious of caching objects that are frequently updated: a constant cycle of replacing the cache on almost every load will degrade performance instead of enhancing it.)

For measuring performance, I'd not look at how long a single page takes to load; instead, google for some performance-measuring tools, as you really need to test how fast each page performs under pressure. Your user info page might not be the biggest caching target if it is rarely accessed, for example. You should be focusing on the most heavily used pages.
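As a crude starting point before reaching for a full load-testing kit, you can hit a page from several threads at once and look at the latency distribution rather than a single timing. In this sketch the URL, concurrency, and request counts are placeholders, and the simulated round trip stands in for a real HTTP call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    """Time one request; swap the sleep for urllib.request.urlopen(url).read()."""
    start = time.perf_counter()
    time.sleep(0.01)  # placeholder for the actual HTTP round trip
    return time.perf_counter() - start

def load_test(url, concurrency=10, requests=100):
    """Fire `requests` requests across `concurrency` threads; report percentiles."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = sorted(pool.map(fetch, [url] * requests))
    return {
        "median": timings[len(timings) // 2],
        "p95": timings[int(len(timings) * 0.95)],
    }

stats = load_test("http://localhost/profile", concurrency=10, requests=50)
```

The p95 figure under concurrency, not the single-visitor load time, is what tells you whether a page needs caching.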

Matthew Rathbone
Why implement caching if it is not needed? This just introduces an extra level of complexity into the application. If it is determined caching is required, I definitely would address only one layer at a time.
Jason Jackson
I too would be shy of overdoing caching. This is only asking for data integrity issues. The moment you open up this door is also when you have to worry about (in the case of caching BOs) implementing versioning. Best to take it one layer at a time.
Jason Whitehorn
I didn't say that he HAD to implement caching; I was giving my opinion that if you are implementing caching, doing so at every layer could yield the best results.
Matthew Rathbone
A: 

Web design guru Vincent Flanders suggests that anything over 4 seconds is too long for a web page to load. I think that this is a pretty good rule of thumb.

As far as caching or other performance optimizations go, I would recommend you perform some performance testing first. There are a number of performance testing kits available on the market. Testing will show where the problem areas are. There is no sense in adding caching logic to something that already performs well. On the other hand, performance problems may occur where you might not expect them, involving data you might not have considered caching.

Something I have also found with performance testing is that it can find spots in your code where deadlocks are occurring, or where simple database optimizations would help. Perhaps adding an index to a table will speed up a page instead of adding a bunch of caching logic.

I would keep it simple, and refactor for performance only where you need to do so.

Also, test early and test often. I know a lot of people say to consider performance last, but you can really code yourself into a corner if you don't at least start considering it early in the development life cycle.

Jason Jackson
I tend to disagree with Vincent. I think four seconds is far too long to wait for a page load. Would you use a search engine that consistently took four seconds to return results? Two seconds would probably be tolerable, but I believe under one second is the golden time.
Ty
I disagree. We aren't all writing search engines. That is a bad example. Plus, a company like Google has the luxury of having the entire hardware and software stack customized and optimized.
Jason Jackson
A: 

Some research suggests that an interactive application must provide feedback within 250 ms, or the user will think it is stuck or that the operation failed. The acceptance level on the web is somewhat higher, though, and usually the web browser itself provides feedback that a new page is loading. With AJAX you have to take care to provide some kind of feedback yourself, because the browser won't show what is happening.

John Nilsson