I want to write a web app with client-side JavaScript and a back-end server in Python. The client needs data from the server frequently, via AJAX. The data lives in a DB and is expensive to load on every request.

In a desktop app I would just load the data from the DB into memory once and then access it there. In a web app, though, the server code runs anew for each request, so I can't do that (each run would have to load the data from the DB into memory again). How can this work? Can a single long-lived process run on the server, or do I have to use something different here?

An example is the tag auto-complete here on Stack Overflow: how is it implemented on the server for fast caching/loading?


I wonder whether a data store like memcached is really a good approach for auto-complete. How would you represent the keys for partial matches?

A: 

Take a look at memcached.

JAB
A: 

I would use a key-value store on the server, for instance memcache (there are others), so that you don't have to fetch the same data more than once (until it expires). It is blazingly fast and a quite common solution (Facebook, among others, uses memcache).

windyjonas
Thanks, please see my question update.
zaharpopov
A: 

You might look at other servers, like Twisted, where you customize the server and the server keeps data in memory. For instance, you might have chat data that doesn't need to be persisted indefinitely, just kept in memory as long as anyone is interested in new chats. A server like Twisted lets you keep stuff in memory just like you would in a desktop app. http://twistedmatrix.com/ - the site is down now.
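To make the "keep it in memory" idea concrete, here is a minimal sketch in plain Python (not Twisted-specific). It assumes a long-running server process that loads the tag names once at startup; `TagIndex` and `TAG_INDEX` are hypothetical names, and the hard-coded list stands in for a single DB query.

```python
import bisect


class TagIndex:
    """Sorted, in-memory list of names that answers prefix queries."""

    def __init__(self, names):
        # In a real app this list would come from one DB query at startup.
        self._names = sorted(name.lower() for name in names)

    def complete(self, prefix, limit=10):
        prefix = prefix.lower()
        # Binary search for the first name that is >= the prefix.
        start = bisect.bisect_left(self._names, prefix)
        matches = []
        for name in self._names[start:start + limit]:
            if not name.startswith(prefix):
                break  # sorted order: no later name can match either
            matches.append(name)
        return matches


# Created once when the server process starts, then shared by every
# request handled in that process (hypothetical sample data).
TAG_INDEX = TagIndex(["python", "pylons", "pyqt", "perl", "php", "javascript"])

print(TAG_INDEX.complete("py"))
```

Note that if the server runs several worker processes, each one holds its own copy of the index, which is one reason a shared cache such as memcached often gets suggested instead.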

Michael
A: 

Try Redis (http://code.google.com/p/redis/)

Copied directly from the description:

Redis is an advanced key-value store. It is similar to memcached but the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists, sets, and ordered sets. All this data types can be manipulated with atomic operations to push/pop elements, add/remove elements, perform server side union, intersection, difference between sets, and so forth. Redis supports different kind of sorting abilities.

Also, Redis is pretty fast: 110000 SETs/second and 81000 GETs/second on an entry-level Linux box.

Gaurav Verma
+6  A: 

Use memcache or a similar tool.

Each item in the cache has a key and an expiry date and time.

You need to make the key useful for your application. A typical key model pattern is Security.Domain.Query.QueryValue for sets, or Security.Domain.ID for individual objects.

e.g.

ALL.Product.Q.Red is a set from the Products domain using the query Red for all users

Admin.Product.Q.Blu is a set from the Products domain using the query Blu just for Admin users

ALL.Customer.O.12345 is a single object graph from the Customer domain ID 12345 for all users

You can also add formats to the key if required.
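A key model like this can be wrapped in one small helper so every caller builds keys the same way. The sketch below is only an illustration: `cache_key` and its parameter names are made up here, and percent-encoding is just one possible way to keep a "." (or a space, which memcached keys cannot contain) inside a value from colliding with the delimiter.

```python
from urllib.parse import quote


def cache_key(security, domain, kind, value, fmt=None):
    """Build a dotted cache key such as ALL.Product.Q.Red (hypothetical helper)."""
    parts = [security, domain, kind, str(value)]
    if fmt:
        parts.append(fmt)
    # quote() percent-encodes spaces and other unsafe characters but leaves
    # "." alone, so the delimiter is encoded explicitly afterwards.
    return ".".join(quote(p, safe="").replace(".", "%2E") for p in parts)


print(cache_key("ALL", "Product", "Q", "Red"))
print(cache_key("ALL", "Customer", "O", 12345))
print(cache_key("ALL", "Product", "Q", "a.b c", "json"))
```

Keep in mind that memcached limits keys to 250 bytes, so very long query values may need to be hashed instead.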

So when your web application requests data for an auto-complete box, the web service handling the call first asks memcache for the data, and only if it is not found or has expired does it run the expensive database query.

e.g. Auto-complete to find products

  1. Request: http://server.com/product?q=gre&format=json
  2. Server generates memcache key ALL.Product.Name.gre.json
  3. Memcache query fails
  4. Generate SQL query Select ID, Name, Price From Product Where Name Like 'gre%'
  5. Format result into JSON
  6. Save result into memcache
  7. Return result to browser

and next time

  1. Request: http://server.com/product?q=gre&format=json
  2. Server generates memcache key ALL.Product.Name.gre.json
  3. Memcache query succeeds
  4. Return result to browser
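The two request flows above can be sketched as follows. This is a rough illustration, not a production setup: `TinyCache` is an in-process stand-in for a real memcache client, `find_products` is a hypothetical handler, and an in-memory SQLite table with made-up rows replaces the real database.

```python
import json
import sqlite3
import time


class TinyCache:
    """In-process stand-in for a memcache client (get/set with expiry)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._items[key] = (value, time.monotonic() + ttl)


def find_products(db, cache, prefix, ttl=300):
    """Return (json_body, was_cache_hit) for an auto-complete prefix."""
    # Delimiter encoding of the prefix is omitted here for brevity.
    key = "ALL.Product.Name.%s.json" % prefix
    cached = cache.get(key)
    if cached is not None:
        return cached, True
    # Cache miss: run the expensive query. The prefix is bound as a
    # parameter; "%" or "_" inside it would still act as LIKE wildcards.
    rows = db.execute(
        "SELECT ID, Name, Price FROM Product WHERE Name LIKE ? ORDER BY Name",
        (prefix + "%",),
    ).fetchall()
    body = json.dumps([{"id": r[0], "name": r[1], "price": r[2]} for r in rows])
    cache.set(key, body, ttl)
    return body, False


# Made-up sample data standing in for the real Product table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Product (ID INTEGER PRIMARY KEY, Name TEXT, Price REAL)")
db.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [(1, "green tea", 3.5), (2, "grey hat", 12.0), (3, "red pen", 1.0)],
)
cache = TinyCache()
first, hit1 = find_products(db, cache, "gre")    # miss: goes to the database
second, hit2 = find_products(db, cache, "gre")   # hit: served from the cache
```

Caching the already formatted JSON means a hit skips both the query and the serialization step.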

The secret is in having a key model that works for every use of memcache and doesn't generate duplicate keys for different concerns. Remember to encode delimiters in your query parameters (in the example, the ".").

TFD
What about using the SQL or md5(sql) as the key in memcached?
PHP_Jedi
You can use whatever you like. The above example is contrived. The reason I would not use SQL is that caching is normally done at the model level, which knows nothing about SQL. Caching is often a function of your ORM or a thin layer above it. It can also be part of your services (which might call out to multiple models to build a composite object); again, no SQL here.
TFD
There is, of course, much more to this subject. Often in a set-based cache you only store the unique identifiers of the objects, not the objects themselves, so multiple cache requests are performed to fill out a set request. This makes it simple to invalidate frequently changed data.
TFD
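As an illustration of that last comment, here is a minimal sketch of a set entry that holds only IDs, with each object cached under its own key. Everything in it is assumed for the example: a plain dict stands in for the cache, and `get_product_set`, `load_ids`, and `load_product` are hypothetical names.

```python
def get_product_set(cache, query_key, load_ids, load_product):
    """Resolve a cached set of IDs into objects, caching each object separately."""
    ids = cache.get(query_key)
    if ids is None:
        ids = load_ids()          # expensive set query, run only on a miss
        cache[query_key] = ids
    result = []
    for pid in ids:
        obj_key = "ALL.Product.O.%d" % pid
        obj = cache.get(obj_key)
        if obj is None:
            obj = load_product(pid)   # per-object load, run only on a miss
            cache[obj_key] = obj
        result.append(obj)
    return result


# Made-up "database" plus a log of which loads actually ran.
PRODUCTS = {1: {"name": "green tea", "price": 3.5},
            2: {"name": "grey hat", "price": 12.0}}
loads = []


def load_ids():
    loads.append("ids")
    return [1, 2]


def load_product(pid):
    loads.append(pid)
    return dict(PRODUCTS[pid])


cache = {}
get_product_set(cache, "ALL.Product.Q.gre", load_ids, load_product)

# Product 2 changes: drop only its object entry, the set entry stays valid.
PRODUCTS[2]["price"] = 10.0
del cache["ALL.Product.O.2"]
fresh = get_product_set(cache, "ALL.Product.Q.gre", load_ids, load_product)
```

With a real memcache client, the per-object lookups would typically be batched into one multi-get call rather than issued one at a time.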