I want to cache data on the client. What is the best algorithm/data structure that can be employed?

Case 1. The data to be stored requires extremely fast string searching capability.
Case 2. The cached data set can be large. I don't want to blow up the client's memory usage, and I also don't want to make network or disk access calls, which slow down processing on the client side.

Solutions:

Case 1: I think a suffix tree or a trie is a good solution in this case.
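To make Case 1 concrete, here is a minimal trie sketch. The class name `PrefixTrie` and its methods are made up for illustration. Note that a plain trie only answers exact-match and prefix queries; arbitrary substring search would need a suffix tree or a similar index, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical minimal trie for exact and prefix lookups over cached string keys. */
class PrefixTrie {
    private static final class Node {
        // TreeMap keeps children sorted, so prefix results come back in lexicographic order.
        final Map<Character, Node> children = new TreeMap<>();
        boolean terminal; // true if a stored key ends at this node
    }

    private final Node root = new Node();

    /** Adds a key, creating one node per character as needed. */
    void insert(String key) {
        Node node = root;
        for (char c : key.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.terminal = true;
    }

    /** Exact-match lookup in time proportional to the key length. */
    boolean contains(String key) {
        Node node = find(key);
        return node != null && node.terminal;
    }

    /** Returns every stored key that starts with the given prefix. */
    List<String> withPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        Node start = find(prefix);
        if (start != null) {
            collect(start, new StringBuilder(prefix), out);
        }
        return out;
    }

    // Walks down from the root; returns null if the path does not exist.
    private Node find(String s) {
        Node node = root;
        for (char c : s.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return null;
            }
        }
        return node;
    }

    // Depth-first traversal that rebuilds each key from the path taken.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.terminal) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());
            collect(e.getValue(), path, out);
            path.deleteCharAt(path.length() - 1);
        }
    }
}
```

One node per character is memory-hungry, so for a large key set this trades memory for lookup speed, which pulls against Case 2.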

Case 2: The two problems to consider here are:

  1. Storing large data with minimum memory consumption.
  2. Avoiding network calls for data that is not available in the cache. An LRU caching model is one solution I can think of, but on its own it does not stop the cache from bloating memory.

Is there any way to write the data to a file and read it back without compromising it (security aspect)?

Let me know if any point is not clear.

EDIT: Josh, I know my requirements are unrealistic. To narrow them down: I am looking for something that stores data using an LRU algorithm. It would be good if the size of this LRU cache could be configured dynamically, up to a maximum limit. This would reduce the number of calls going to the network/database and give good performance as well.
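As a rough sketch of an LRU cache whose size can be changed at runtime but never beyond a hard limit, the standard `LinkedHashMap` in access order does most of the work. The class name `BoundedLruCache` and the `resize` method are made up for illustration; the limit here counts entries, not bytes, and the class is not thread-safe.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache with a resizable capacity capped by a hard limit. */
class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int hardLimit; // capacity can never exceed this
    private int capacity;        // current maximum number of entries

    BoundedLruCache(int initialCapacity, int hardLimit) {
        // accessOrder = true: iteration goes from least to most recently used.
        super(16, 0.75f, true);
        this.hardLimit = Math.max(1, hardLimit);
        this.capacity = clamp(initialCapacity);
    }

    /** Changes the capacity at runtime; shrinking evicts least recently used entries. */
    void resize(int newCapacity) {
        capacity = clamp(newCapacity);
        Iterator<K> it = keySet().iterator();
        while (size() > capacity && it.hasNext()) {
            it.next();
            it.remove();
        }
    }

    int capacity() {
        return capacity;
    }

    private int clamp(int requested) {
        return Math.max(1, Math.min(requested, hardLimit));
    }

    // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

For example, `new BoundedLruCache<String, byte[]>(100, 1000)` starts with room for 100 entries, and `resize(5000)` would raise that to 1000 and no further. A real byte-based limit would need to track the size of each value as well.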

If this LRU cache could work on compressed data that can be decoded with a slight overhead (but less than that of a network call), that would be even better.
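One way to get compressed entries is to store gzip-compressed bytes as the cache values and decompress on read. The sketch below uses only `java.util.zip` from the JDK; the class name `CompressedValues` is made up, and whether the CPU cost really beats a network round trip depends on the data and has to be measured.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helpers for keeping cache values gzip-compressed in memory. */
class CompressedValues {

    /** Compresses text to gzip bytes; call this before putting a value in the cache. */
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the gzip trailer has been written.
        return buffer.toByteArray();
    }

    /** Restores the original text; call this after a cache hit. */
    static String decompress(byte[] packed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Gzip adds a fixed header and trailer, so very small values can come out larger than the original; compression pays off mainly for larger, repetitive payloads.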

A: 

Unfortunately, I think your expectations are unrealistic.

Keeping memory usage small, but also not making disk access calls means that you have nowhere to store the data.

Furthermore, to answer your question about security, there is no client side data storage (assuming you are talking about a web-application) that is "secure". You could encrypt it, but this will destroy your speed requirements as well as require server-side processing. Everything stored at and sent from the client is suspect.

Perhaps if you could describe the problem in greater detail we can suggest some realistic solutions.

JoshJordan
I have updated my question above. Perhaps you are right: what I am seeking has to be a blend of the two (network calls and RAM). Still, I am looking for the best algorithms.
Devil Jin
+1  A: 

Check out the available caching frameworks/libraries. I've found Ehcache to be very useful. You can have it keep just some entries (the most recent) in memory and overflow to disk at a specified memory usage. The disk calls will still be a lot faster than network calls, and you avoid taking up all the memory.
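To illustrate the idea of keeping recent entries in memory and overflowing to disk, here is a small sketch using only the JDK. This is not Ehcache's API or implementation; `SpillingCache` and its methods are made up, keys are assumed to be short enough to serve as file names once Base64-encoded, and values are written in plain text with no encryption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical two-tier cache: an in-memory LRU that spills evicted entries to disk. */
class SpillingCache {
    private final Path dir;
    private final LinkedHashMap<String, String> memory;

    SpillingCache(final int maxInMemory, Path dir) {
        this.dir = dir;
        this.memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            private static final long serialVersionUID = 1L;

            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() <= maxInMemory) {
                    return false;
                }
                spill(eldest.getKey(), eldest.getValue()); // save before dropping from memory
                return true;
            }
        };
    }

    /** Convenience factory that stores spilled entries in a fresh temp directory. */
    static SpillingCache inTempDir(int maxInMemory) {
        try {
            return new SpillingCache(maxInMemory, Files.createTempDirectory("spill-cache"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void put(String key, String value) {
        try {
            Files.deleteIfExists(fileFor(key)); // drop any stale disk copy
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        memory.put(key, value);
    }

    /** Returns the value from memory or disk, or null on a true miss. */
    String get(String key) {
        String value = memory.get(key);
        if (value != null) {
            return value;
        }
        Path file = fileFor(key);
        if (!Files.exists(file)) {
            return null; // not cached anywhere: the caller goes to the network
        }
        try {
            value = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            Files.delete(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        memory.put(key, value); // promote back to memory; may spill another entry
        return value;
    }

    boolean inMemory(String key) {
        return memory.containsKey(key);
    }

    private void spill(String key, String value) {
        try {
            Files.write(fileFor(key), value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // URL-safe Base64 keeps arbitrary key characters out of the file name.
    private Path fileFor(String key) {
        byte[] raw = key.getBytes(StandardCharsets.UTF_8);
        return dir.resolve(Base64.getUrlEncoder().withoutPadding().encodeToString(raw));
    }
}
```

A library such as Ehcache adds the parts this sketch leaves out, such as expiry, size-based limits, and concurrency control.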

Ehcache

Gandalf
I have not used Ehcache. A few questions: 1. Does it store all of the in-memory content on disk? 2. How safe is the data when it is stored on disk?
Devil Jin
It's highly configurable. You can set object lifespans, staleness timeouts, and how many objects (or how much memory) to store before it starts writing to disk [or of course tell it to never write to disk].
Gandalf
For caching on the client? Please explain.
jro
What is there to explain? It's a cache. You build it into your client code.
Gandalf