I'm looking for a drop in solution for caching large-ish amounts of data.

related questions but for different languages:

Close question in different terms:

I don't need (or want to pay anything for) persistence, transactions, thread safety or the like and want something that is not much more complex to use than a List<> or Dictionary<>.

If I have to write code, I'll just save everything off as files in the temp directory:

string Get(int i)
{
   // root is the directory that holds one file per index
   return File.ReadAllText(Path.Combine(root, i.ToString()));
}

In my case the index will be an int (and the indexes should be consecutive or close enough) and the data will be a string, so I can get away with treating both as POD and would rather go ultra-light and do exactly that.
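For illustration, here is a minimal sketch of that file-per-entry idea with a matching write side and cleanup. The class name DiskStringStore and its Put, Contains and Dispose members are hypothetical, not from any library; it assumes int keys, string values, no thread safety, and that nothing should survive past disposal.

```csharp
using System;
using System.IO;

// Hypothetical helper: one file per entry, named after its integer index.
public sealed class DiskStringStore : IDisposable
{
    private readonly string root;

    public DiskStringStore()
    {
        // A fresh directory under the system temp path keeps entries isolated.
        root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(root);
    }

    private string PathFor(int i)
    {
        return Path.Combine(root, i.ToString());
    }

    // Writes (or overwrites) the entry for index i.
    public void Put(int i, string value)
    {
        File.WriteAllText(PathFor(i), value);
    }

    // Reads the entry back; throws FileNotFoundException if it was never stored.
    public string Get(int i)
    {
        return File.ReadAllText(PathFor(i));
    }

    public bool Contains(int i)
    {
        return File.Exists(PathFor(i));
    }

    public void Dispose()
    {
        // No persistence is wanted, so delete everything on disposal.
        if (Directory.Exists(root))
            Directory.Delete(root, true);
    }
}
```

Wrapping it in a using block (using (var store = new DiskStringStore()) { ... }) removes the scratch directory when the work is done. With ~3k entries the per-file overhead should be tolerable, but that is an assumption worth measuring on the target file system.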

The usage is that I have a sequence of 3k files (as in file #1 to #3000) totaling 650MB and need to do a diff for each step in the sequence. I expect that to total about the same or a little more and I don't want to keep all that in memory (larger cases may come along where I just can't).


A number of people have suggested different solutions for my problem. However, none seem to be targeted at my little niche. The reason I'm looking at disk-backed caching is that I expect my current use to take up 1/3 to 1/2 of my available address space, and I'm worried that larger cases will just flat run out of space. I'm not worried about threading, persistence or replication. What I'm looking for is a minimal solution: a minimum of code, a minimal usage footprint, minimal in-memory overhead and minimum complexity.

I'm starting to think I'm being overly optimistic.

A: 

You can use the MS application block with a disk-based cache solution.

ooo
A: 

Try looking at NCache here also.

I am not affiliated with this company. I've just downloaded and tested their free express version.

Saif Khan
+1  A: 

Disclaimer - I am about to point you at a product that I am involved in.

I'm still working on the web site side of things, so there is not a lot of info, but Serial Killer would be a good fit for this. I have examples that use .Net serialization (can supply examples), so writing a persistent map cache for .Net serializable objects would be trivial.

Enough shameless self promotion - if interested, use the contact link on the website.

Daniel Paull
BCS
SerialKiller is pretty damn light - I'd hate for you to dismiss it for that reason! The interface is basically a mapping from a key (system generated) to a binary stream.
Daniel Paull
The naive, probably buggy and unextendable version of what I'm looking for (skipping the eviction policy stuff) could be done in about 30 LOC. I'd be impressed if you could get even half your feature list in under that.
BCS
By "light" I refer more to runtime overheads, which are very low. I haven't counted LOC, but the DLLs are under 500 KB in total, which, given the capability, is very lean.
Daniel Paull
Leaving aside iteration and recursion (unneeded in this case), LOC ~ execution time (for some values of LOC :)
BCS
I disagree. Overcoming issues of fragmentation in the file system and caching strategies greatly affect execution time, so performance may be inversely proportional to LOC!
Daniel Paull
+3  A: 

What you really want is a B-Tree. That's the primary data structure that a database uses. It's designed to enable you to efficiently swap portions of a data structure to and from disk as needed.

I don't know of any widely used, high quality standalone B-Tree implementations for C#.

However, an easy way to get one would be to use a Sql Compact database. The Sql Compact engine will run in-process, so you don't need a separate service running. It will give you a B-Tree, but without all the headaches. You can just use SQL to access the data.

Scott Wisniewski
I'm not liking the overhead. See my edits, but I could get away with a single in-memory array lookup and a single disk read per load, so the B-Tree is overkill... in my case.
BCS
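A rough sketch of the "one in-memory lookup plus one disk read" layout from the comment above, assuming entries are only ever appended and never changed. Everything here (the AppendOnlyStringFile name, Add, Get, the UTF-8 encoding choice, zero-based indexes) is a hypothetical illustration rather than an existing library, and it is not thread safe.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: all strings live in one scratch file; in-memory lists
// remember where each one starts and how many bytes it occupies.
public sealed class AppendOnlyStringFile : IDisposable
{
    private readonly string path;
    private readonly FileStream stream;
    private readonly List<long> offsets = new List<long>();
    private readonly List<int> lengths = new List<int>();

    public AppendOnlyStringFile()
    {
        // GetTempFileName creates an empty file and returns its path.
        path = Path.GetTempFileName();
        stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite);
    }

    public int Count { get { return offsets.Count; } }

    // Appends a string to the end of the file and returns its index (0, 1, 2, ...).
    public int Add(string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);
        stream.Seek(0, SeekOrigin.End);
        offsets.Add(stream.Position);
        lengths.Add(bytes.Length);
        stream.Write(bytes, 0, bytes.Length);
        return offsets.Count - 1;
    }

    // One list lookup, one seek, then a read of exactly the stored byte count.
    public string Get(int i)
    {
        byte[] bytes = new byte[lengths[i]];
        stream.Seek(offsets[i], SeekOrigin.Begin);
        int read = 0;
        while (read < bytes.Length)
        {
            int n = stream.Read(bytes, read, bytes.Length - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
        return Encoding.UTF8.GetString(bytes);
    }

    public void Dispose()
    {
        // Nothing needs to persist, so remove the scratch file.
        stream.Dispose();
        File.Delete(path);
    }
}
```

The memory cost is one long and one int per entry, so a few thousand entries cost a few tens of kilobytes regardless of how large the strings are. Supporting updates or deletes would need extra bookkeeping that this sketch deliberately leaves out.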
One advantage to using the in-proc DB is that it gives you access path independence. When you need to change what data you store, or what keys you use to access it, you don't need to re-write a big chunk of your app.
Scott Wisniewski
However, if you really feel that the stuff you need to do with the data is that simple, then I would think you could write something from scratch that used Dictionary(of int, string), where the string was a file name, in about 2-3 hours of work....
Scott Wisniewski
A: 

I've partially ported the EhCache Java application to .NET. The distributed caching is not yet implemented, but on a single node, all the original unit tests pass. Fully open source:

http://sourceforge.net/projects/thecache/

I can create a binary drop if you need it (only source code is available now).

Timur Fanshteyn
looks like a neat project. OTOH it looks like overkill for me.
BCS
A: 

I'd take the embedded DB route (SQLite, Firebird), but here are some other options:

Mauricio Scheffer
A: 

I recommend the Caching Application block in the Enterprise Library from MS. That was recommended as well, but the link points to an article on the Data Access portion of the Enterprise Library.

Here is the link to the Caching Application Block:

http://msdn.microsoft.com/en-us/library/cc309502.aspx

And specifically, you will want to create a new backing store (if one that persists to disk is not there):

http://msdn.microsoft.com/en-us/library/cc309121.aspx

casperOne
A: 

Given your recent edits to the question, I suggest that you implement the solution noted in your question as you are very unlikely to find such a naive solution wrapped up in a library for you to reuse.

Daniel Paull
Good chance I will. If I do, I'll post the code.
BCS
+1  A: 

This is very similar to my question

Looking for a simple standalone persistant dictionary implementation in C#

I don't think a library that exactly fits what you want exists; maybe it's time for a new project on github.

Sam Saffron
Added link. How about you add a link the other way?
BCS
OTOH the motivation is different. You were looking for persistence; I'm wanting to store stuff on disk rather than in memory. Large overlap, but not quite the same.
BCS
No worries, I added a link from my post
Sam Saffron
+1  A: 

Here is a B-Tree implementation for .net: http://bplusdotnet.sourceforge.net/

Luke Quinane
An interesting project but still a lot heavier than I was looking for.
BCS
A: 

Hey BCS, what solution did you finally end up using for this issue?? Thanks!

Gulsharan