I'm creating a cache which is going to contain records in Delphi 2007.

Each record contains a String, 2 Dates and a value.

Type MyRecord = Record
    Location : String;
    Date1 : TDateTime;
    Date2 : TDateTime;
    Value : Double;
End;

There are no guarantees on what the maximum size of the cache will be.

It is highly likely that each Location will have several entries for different dates.
There are only 13 locations.

The cache needs to be searchable and will be in a performance critical place.

I was thinking of creating a 2-dimensional array for this structure, with a sorted string list as an index. When searching, I'd look up the location in the string list (stored as name/value pairs, Location = Index) to get the index I need in the array. Then I'd need to loop through the items for that location to see whether a value matching both Date1 and Date2 is in the cache. If the value isn't in the cache, I need to go and get it from a database and add it to the cache.

Something like:

Type MyRecord = Record
    Date1 : TDateTime;
    Date2 : TDateTime;
    Value : Double;
End;
...
Cache: Array[1..13] of Array of MyRecord;
Locations: TStringList;

Location is dropped from the record because it would be held in the string list.

Would this be an efficient structure to use for caching?
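For reference, here is a minimal sketch of the lookup described above, so the answers have something concrete to compare against. The names `TCacheEntry`, `TryGetCached` and `AddToCache` are made up for illustration, and instead of name/value pairs this sketch stores the array slot in the string list's `Objects[]` (a 32-bit cast, which is an assumption that holds for Delphi 2007). Dates are compared with plain `=`, which assumes both sides come from the same source values.

```pascal
program CacheLookupSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical name; same fields as the trimmed record in the question.
  TCacheEntry = record
    Date1: TDateTime;
    Date2: TDateTime;
    Value: Double;
  end;

var
  Locations: TStringList;                 // sorted index: location name -> slot
  Cache: array of array of TCacheEntry;   // one dynamic array per location
  V: Double;

// Returns True and fills Value when (Location, Date1, Date2) is cached.
function TryGetCached(const Location: string; Date1, Date2: TDateTime;
  out Value: Double): Boolean;
var
  Idx, Slot, I: Integer;
begin
  Result := False;
  Value := 0;
  if not Locations.Find(Location, Idx) then   // binary search on the sorted list
    Exit;
  Slot := Integer(Locations.Objects[Idx]);
  for I := 0 to High(Cache[Slot]) do          // linear scan within one location
    if (Cache[Slot][I].Date1 = Date1) and (Cache[Slot][I].Date2 = Date2) then
    begin
      Value := Cache[Slot][I].Value;
      Result := True;
      Exit;
    end;
end;

// Appends an entry, creating the location's slot on first use.
procedure AddToCache(const Location: string; Date1, Date2: TDateTime;
  Value: Double);
var
  Idx, Slot, N: Integer;
begin
  if Locations.Find(Location, Idx) then
    Slot := Integer(Locations.Objects[Idx])
  else
  begin
    Slot := Length(Cache);
    SetLength(Cache, Slot + 1);
    Locations.AddObject(Location, TObject(Slot));
  end;
  N := Length(Cache[Slot]);
  SetLength(Cache[Slot], N + 1);
  Cache[Slot][N].Date1 := Date1;
  Cache[Slot][N].Date2 := Date2;
  Cache[Slot][N].Value := Value;
end;

begin
  Locations := TStringList.Create;
  try
    Locations.Sorted := True;   // Find relies on the list being sorted
    AddToCache('London', EncodeDate(2010, 1, 1), EncodeDate(2010, 1, 31), 42.5);
    if TryGetCached('London', EncodeDate(2010, 1, 1), EncodeDate(2010, 1, 31), V) then
      Writeln('hit: ', V:0:1)
    else
      Writeln('miss: fetch from the database, then AddToCache');
  finally
    Locations.Free;
  end;
end.
```

The location lookup is a binary search, but the scan within one location is linear in the number of entries for that location.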

+1  A: 

Your idea seems sound and should work efficiently. In essence, you would be implementing a simple database table with an index. It separates the index information from the data so that the cost of updating the index is "small" relative to moving data around in a sorted structure.

Another possibility is to use an in-memory database. There are a number of those available for Delphi. They would do a lot of this for you and possibly provide greater flexibility.

Mark Wilkins
+3  A: 

Your structure is efficient enough for caching, but I wouldn't use this in a performance critical place. If your cache grows, and you have 5000 items on one location, you're still doing a linear search through 5000 items.

I think it's better to sort your list and use a binary search to search for items in the cache.

If I were to implement something like that, I would take a TList with pointers to the records. The list would then be sorted with TList.Sort, to which I pass a compare function that orders the list according to the data the records contain. The sort goes first on the field with the most 'selectivity', then on the field with the second most selectivity, and so on.

To find an entry, you perform a binary search on the list and get the value. If it doesn't exist, you get it from the database and add it to the cache.

Of course this would all be nicely wrapped in a class which takes care of this and memory allocation issues.

A hashmap is also possible, but you have to do tests to see which is faster.
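A rough sketch of the sorted TList idea, not a finished implementation. The names `TCacheRec`, `CompareRecs`, `AddRec` and `FindRec` are invented for this example, and the sort order (Date1, then Date2, then Location) is only an assumption that the dates are more selective than the 13 locations. The class wrapper mentioned above is left out to keep it short.

```pascal
program SortedCacheSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  PCacheRec = ^TCacheRec;
  TCacheRec = record
    Location: string;
    Date1: TDateTime;
    Date2: TDateTime;
    Value: Double;
  end;

// Compare function for TList.Sort: Date1, then Date2, then Location.
function CompareRecs(A, B: Pointer): Integer;
var
  RA, RB: PCacheRec;
begin
  RA := A;
  RB := B;
  if RA^.Date1 < RB^.Date1 then Result := -1
  else if RA^.Date1 > RB^.Date1 then Result := 1
  else if RA^.Date2 < RB^.Date2 then Result := -1
  else if RA^.Date2 > RB^.Date2 then Result := 1
  else Result := CompareStr(RA^.Location, RB^.Location);
end;

procedure AddRec(List: TList; const Location: string;
  Date1, Date2: TDateTime; Value: Double);
var
  P: PCacheRec;
begin
  New(P);
  P^.Location := Location;
  P^.Date1 := Date1;
  P^.Date2 := Date2;
  P^.Value := Value;
  List.Add(P);
end;

// Binary search on a list already sorted with CompareRecs.
// Index receives the match position, or the insertion point on a miss.
function FindRec(List: TList; const Location: string;
  Date1, Date2: TDateTime; out Index: Integer): Boolean;
var
  Key: TCacheRec;
  Lo, Hi, Mid, C: Integer;
begin
  Key.Location := Location;
  Key.Date1 := Date1;
  Key.Date2 := Date2;
  Key.Value := 0;
  Result := False;
  Lo := 0;
  Hi := List.Count - 1;
  while Lo <= Hi do
  begin
    Mid := (Lo + Hi) shr 1;
    C := CompareRecs(List[Mid], @Key);
    if C < 0 then
      Lo := Mid + 1
    else
    begin
      Hi := Mid - 1;
      if C = 0 then
        Result := True;
    end;
  end;
  Index := Lo;
end;

var
  List: TList;
  I, Idx: Integer;
begin
  List := TList.Create;
  try
    AddRec(List, 'Paris', EncodeDate(2010, 2, 1), EncodeDate(2010, 2, 28), 7.0);
    AddRec(List, 'London', EncodeDate(2010, 1, 1), EncodeDate(2010, 1, 31), 42.5);
    AddRec(List, 'London', EncodeDate(2010, 2, 1), EncodeDate(2010, 2, 28), 3.5);
    List.Sort(@CompareRecs);
    if FindRec(List, 'Paris', EncodeDate(2010, 2, 1), EncodeDate(2010, 2, 28), Idx) then
      Writeln('hit: ', PCacheRec(List[Idx])^.Value:0:1)
    else
      Writeln('miss: fetch from the database and add to the cache');
  finally
    for I := 0 to List.Count - 1 do
      Dispose(PCacheRec(List[I]));   // typed Dispose also releases the string
    List.Free;
  end;
end.
```

Each lookup is O(log n) in the total number of cached records, rather than linear in the entries for one location.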

The_Fox
This came closest to the implementation I used: I kept the StringList as an index (sorted, so that it performed its own binary search) but swapped out the second dimension of the array for a TList with a custom sort routine and a binary search for Find. I also kept an array of Boolean so that a list was only re-sorted if a new item had been added since the last sort.
JamesB
You don't have to keep a boolean to sort the list when a new item has been added. Use your binary search algorithm to find the index of the item you want to add and insert it there. Take a look at how adding strings is implemented in a sorted TStringList. Your list will always be sorted like that.
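Something along these lines, as a sketch only: one TList per location, keyed on the two dates, where the binary search that looks for an entry also yields the position to insert at. `TEntry`, `FindEntry` and `InsertSorted` are made-up names, and refreshing the value on a duplicate key is an assumption about the behaviour you want.

```pascal
program SortedInsertSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  PEntry = ^TEntry;
  TEntry = record
    Date1: TDateTime;
    Date2: TDateTime;
    Value: Double;
  end;

// Orders two date pairs: Date1 first, then Date2.
function CompareDates(A1, A2, B1, B2: TDateTime): Integer;
begin
  if A1 < B1 then Result := -1
  else if A1 > B1 then Result := 1
  else if A2 < B2 then Result := -1
  else if A2 > B2 then Result := 1
  else Result := 0;
end;

// Same shape as TStringList.Find: on a miss, Index is the insertion point.
function FindEntry(List: TList; Date1, Date2: TDateTime;
  out Index: Integer): Boolean;
var
  Lo, Hi, Mid, C: Integer;
begin
  Result := False;
  Lo := 0;
  Hi := List.Count - 1;
  while Lo <= Hi do
  begin
    Mid := (Lo + Hi) shr 1;
    C := CompareDates(PEntry(List[Mid])^.Date1, PEntry(List[Mid])^.Date2,
      Date1, Date2);
    if C < 0 then
      Lo := Mid + 1
    else
    begin
      Hi := Mid - 1;
      if C = 0 then
        Result := True;
    end;
  end;
  Index := Lo;
end;

// Inserts at the position found by the search, so no re-sort is needed.
procedure InsertSorted(List: TList; Date1, Date2: TDateTime; Value: Double);
var
  Index: Integer;
  P: PEntry;
begin
  if FindEntry(List, Date1, Date2, Index) then
    PEntry(List[Index])^.Value := Value   // key already cached: refresh it
  else
  begin
    New(P);
    P^.Date1 := Date1;
    P^.Date2 := Date2;
    P^.Value := Value;
    List.Insert(Index, P);
  end;
end;

var
  List: TList;
  I: Integer;
begin
  List := TList.Create;
  try
    // Added out of order on purpose; the list stays sorted throughout.
    InsertSorted(List, EncodeDate(2010, 3, 1), EncodeDate(2010, 3, 31), 3.0);
    InsertSorted(List, EncodeDate(2010, 1, 1), EncodeDate(2010, 1, 31), 1.0);
    InsertSorted(List, EncodeDate(2010, 2, 1), EncodeDate(2010, 2, 28), 2.0);
    for I := 0 to List.Count - 1 do
      Writeln(DateToStr(PEntry(List[I])^.Date1), ' ', PEntry(List[I])^.Value:0:1);
  finally
    for I := 0 to List.Count - 1 do
      Dispose(PEntry(List[I]));
    List.Free;
  end;
end.
```

TList.Insert still has to shift the pointers after the insertion point, but that is a single block move of pointers rather than a full sort.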
The_Fox
A: 

If performance is a concern, avoid the string comparisons as much as possible. I would instead sort the cache array into whatever search order you want, and perform a binary search against the raw data.

If the string value is most important, then use something like the soundex algorithm to split the string into a single character and a number, and encode both into a word or integer (a simple hash). Sort the array by this value, and on any collisions sort by the location string. This way you're not performing string matches against obvious non-matches.
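A small sketch of that comparison, assuming your Delphi version ships `SoundexInt` in the StrUtils unit (check before relying on it). `TKeyedRec`, `MakeKey` and `CompareKeyed` are hypothetical names, and the date and value fields are left out for brevity.

```pascal
program LocationKeySketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

type
  TKeyedRec = record
    Code: Integer;      // integer code for Location, computed once on insert
    Location: string;   // kept so that collisions can be resolved
  end;

function MakeKey(const Location: string): TKeyedRec;
begin
  Result.Code := SoundexInt(Location);   // assumed available in StrUtils
  Result.Location := Location;
end;

// Cheap integer compare first; the string compare only runs on equal codes.
function CompareKeyed(const A, B: TKeyedRec): Integer;
begin
  if A.Code < B.Code then Result := -1
  else if A.Code > B.Code then Result := 1
  else Result := CompareStr(A.Location, B.Location);
end;

var
  A, B: TKeyedRec;
begin
  A := MakeKey('London');
  B := MakeKey('Paris');
  if CompareKeyed(A, B) <> 0 then
    Writeln('different locations')
  else
    Writeln('same location');
end.
```

Note that ordering by the code is not alphabetical order; it only has to be consistent between the sort and the search.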

skamradt