Hi
We have a C++ application for which we are trying to improve performance. We identified that data retrieval takes a lot of time and want to cache data. We can't store all the data in memory as it is huge. We want to store up to 1000 items in memory. These items can be indexed by a long key. However, when the cache size goes over 1000, we want to remove the item that was not accessed for the longest time, as we assume some sort of "locality of reference", that is, we assume that items in the cache that were recently accessed will probably be accessed again.

Can you suggest a way to implement it?

My initial implementation was to have a map<long, CacheEntry> to store the cache, and add an accessStamp member to CacheEntry which is set from an increasing counter whenever an entry is created or accessed. When the cache is full and a new entry is needed, the code scans the entire cache map, finds the entry with the lowest accessStamp, and removes it. The problem with this is that once the cache is full, every insertion requires a full scan of the cache.
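For reference, a minimal sketch of that scan-on-eviction design (Data and fetch_from_source are placeholders for the real payload and the slow retrieval):

#include <map>
#include <utility>

struct CacheEntry {
    Data data;                 // placeholder for the cached payload
    unsigned long accessStamp; // refreshed from a global counter
};

std::map<long, CacheEntry> cache;
unsigned long counter = 0;

CacheEntry& get(long key) {
    std::map<long, CacheEntry>::iterator it = cache.find(key);
    if (it != cache.end()) {
        it->second.accessStamp = ++counter;   // mark as recently used
        return it->second;
    }
    if (cache.size() >= 1000) {
        // The painful part: an O(n) scan for the lowest accessStamp.
        std::map<long, CacheEntry>::iterator oldest = cache.begin();
        for (it = cache.begin(); it != cache.end(); ++it)
            if (it->second.accessStamp < oldest->second.accessStamp)
                oldest = it;
        cache.erase(oldest);
    }
    CacheEntry entry;
    entry.data = fetch_from_source(key);      // the slow retrieval
    entry.accessStamp = ++counter;
    return cache.insert(std::make_pair(key, entry)).first->second;
}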

Another idea was to hold a list of CacheEntries in addition to the cache map, and on each access move the accessed entry to the top of the list, but the problem was how to quickly find that entry in the list.

Can you suggest a better approach?

Thanks
splintor

+9  A: 

Scanning a map of 1000 elements will take very little time, and the scan will only be performed when the item is not in the cache which, if your locality of reference ideas are correct, should be a small proportion of the time. Of course, if your ideas are wrong, the cache is probably a waste of time anyway.

anon
And as it just occurred to me and I said in my answer: if you can do asynchronous I/O, then you can scan the cache for something to remove while you're waiting for your new object to arrive. So it's effectively free so long as it takes less time than the I/O, and you aren't also CPU-bound. It could therefore scale up to as many items as you can visit in the time the I/O takes. If you can't do async I/O, you probably have no business optimising the cache ;-)
Steve Jessop
Except that setting up and handling async I/O costs time at runtime and effort at design and implementation time. Nothing is "effectively free" in this world. And I disagree quite strongly with your last sentence - the cache may well be useful even in a purely synchronous setting.
anon
I agree the cache may be useful. I just mean that if your cache is under-performing, then you have no business cycle-counting the cache maintenance code before you've at least tried to parallelise the I/O. What I'm talking about in this case, though, is just something like "start the async I/O; expel something from the cache; block on the I/O;". It's a bit more design work than blocking I/O, but with a good async API, not much more. And it might not be more runtime work, since it's not unusual for systems to implement blocking I/O on top of async: the other way around is harder.
Steve Jessop
... the overall point being that you're probably right that scanning 1000 items is trivial, but even if it's not trivial, for instance if the cache size is increased in future, it will often be possible to arrange for that not to matter. In terms of performance, the elephant in the room which the questioner is ignoring, is that fetching data into the cache is almost certainly slow and not CPU-bound.
Steve Jessop
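The shape of that overlap, sketched with std::async purely for illustration (any platform async-read API would slot in the same way; evict_one and noncached_fetch are hypothetical stand-ins for the victim scan and the blocking fetch):

#include <future>

CachedItem* handle_miss(long key) {
    // Start the slow fetch first...
    std::future<CachedItem*> pending =
        std::async(std::launch::async, noncached_fetch, key);
    // ...then pick and expel a victim while the I/O is in flight.
    evict_one();
    // Block only for whatever I/O time remains.
    return pending.get();
}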
I think I'll go with your advice. Using async I/O might be a headache here, as we compile using the SAS/C compiler and copy the binaries to a z/OS mainframe, and I'm not sure we have a decent async I/O library for that.
splintor
+2  A: 

Update: I got it now...

This should be reasonably fast. A sketch, assuming CachedItem and noncached_fetch(id) are defined elsewhere:

#include <algorithm>
#include <map>
#include <vector>

// accesses contains a list of keys. The most recently used key is at
// front(), the least recently used key is at back().
std::vector<long> accesses;
std::map<long, CachedItem*> cache;

CachedItem* get(long id) {
    std::map<long, CachedItem*>::iterator it = cache.find(id);
    if (it != cache.end()) {
        // In cache. Move id to the front of accesses.
        std::vector<long>::iterator pos =
            std::find(accesses.begin(), accesses.end(), id);
        if (pos != accesses.begin()) {
            accesses.erase(pos);
            accesses.insert(accesses.begin(), id);
        }
        return it->second;
    }

    // Not in cache: fetch and add it.
    CachedItem* item = noncached_fetch(id);
    accesses.insert(accesses.begin(), id);
    cache[id] = item;
    if (accesses.size() > 1000) {
        // Evict the least recently used item.
        cache.erase(accesses.back());
        accesses.pop_back();
    }
    return item;
}

The inserts and erases may be a little expensive, but may also not be too bad given the locality (few cache misses). Anyway, if they become a big problem, one could change to std::list.

kotlinski
It's indexed by the long key, not the access time.
James Curran
The map is not keyed on the value he wants to remove.
anon
I think this code is basically correct. It's a pseudo-implementation of the other idea I mentioned in the post. I am afraid that searching for the accessed item in the accesses list on each access will take too much time, although locality should indeed reduce that time.
splintor
A: 

As a simpler alternative, you could create a map that grows indefinitely and clears itself out every 10 minutes or so (adjust time for expected traffic).

You could also log some very interesting stats this way.
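A sketch of that idea, checking the clock lazily on each access rather than from a timer thread (CachedItem and noncached_fetch as elsewhere in this thread; 600 seconds is the tunable window):

#include <ctime>
#include <map>

std::map<long, CachedItem*> cache;
std::time_t lastFlush = std::time(0);

CachedItem* get(long key) {
    if (std::time(0) - lastFlush > 600) {
        // Drop everything; delete the items too if the cache owns them.
        cache.clear();
        lastFlush = std::time(0);
    }
    std::map<long, CachedItem*>::iterator it = cache.find(key);
    if (it != cache.end())
        return it->second;
    CachedItem* item = noncached_fetch(key);   // the slow path
    cache[key] = item;
    return item;
}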

DanDan
Adding a background thread and multi-threading locks to handle this problem seems like overkill.
splintor
I assumed your cache would have to be thread safe regardless.
DanDan
No, not at all. The application is currently single-threaded, sorry.
splintor
+1  A: 

Create a std::priority_queue<map<long, CacheEntry>::iterator> with a comparator on the access stamp. For an insert, first pop the last item off the queue and erase it from the map. Then insert the new item into the map, and finally push its iterator onto the queue.

James Curran
That doesn't work when the access stamps get updated. Priority queues do not adjust themselves if the things inside them change.
anon
A: 

I agree with Neil, scanning 1000 elements takes no time at all.

But if you want to do it anyway, you could use the additional list you propose and, in order to avoid scanning the whole list each time, store in your map not just the CacheEntry but also a pointer to the element of your list that corresponds to this entry.
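A sketch of that combination using std::list, whose splice() member moves a node to the front without invalidating the stored iterators (CacheEntry as in the question; fetch_from_source is a placeholder for the slow retrieval):

#include <list>
#include <map>
#include <utility>

struct Entry {
    CacheEntry value;
    std::list<long>::iterator pos;   // this key's node in the usage list
};

std::list<long> usage;               // front = most recently used key
std::map<long, Entry> cache;

CacheEntry& get(long key) {
    std::map<long, Entry>::iterator it = cache.find(key);
    if (it != cache.end()) {
        // splice() relinks the node in O(1) and keeps all iterators valid.
        usage.splice(usage.begin(), usage, it->second.pos);
        return it->second.value;
    }
    if (cache.size() >= 1000) {
        cache.erase(usage.back());   // evict the least recently used
        usage.pop_back();
    }
    usage.push_front(key);
    Entry e;
    e.value = fetch_from_source(key);  // placeholder for the slow fetch
    e.pos = usage.begin();
    return cache.insert(std::make_pair(key, e)).first->second;
}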

Alex
But then I'll have to deal with iterator invalidation (assuming I use std::list), as the list is constantly changing while cache entries are being accessed.
splintor
Hmm, yes, you're right, I hadn't thought about that! But you can always create your own doubly-linked list, which would be easy since the only operations you need are simple. This way you won't lose any references you have. So what happens is the following: when a new entry arrives, create a new node at the head of the list and add to the map the CacheEntry along with a pointer to that node. When an entry is re-accessed, find its node in the list (using the key to get the right element in the map and then the pointer that points to the node in the list) and move that node to the head of the list.
Alex
That's basically what jk suggested, and we both agreed that maintaining my own linked list, while being fun, might be too risky, especially for the current stage and condition of our project.
splintor
+2  A: 

An alternative implementation that might make the 'aging' of the elements easier, but at the cost of lower search performance, would be to keep your CacheEntry elements in a std::list (or a std::list of std::pair<long, CacheEntry>). The newest element gets added at the front of the list, so entries 'migrate' towards the end of the list as they age. When you check whether an element is already present in the cache, you scan the list (which is admittedly an O(n) operation, as opposed to an O(log n) operation in a map). If you find it, you remove it from its current location and re-insert it at the start of the list. If the list grows beyond 1000 elements, you remove the required number of elements from the end of the list to trim it back below 1000 elements.
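A sketch of this single-list variant (fetch_from_source stands in for the slow retrieval; note the hit path is the O(n) scan discussed above):

#include <list>
#include <utility>

std::list<std::pair<long, CacheEntry> > cache;   // front = most recent

CacheEntry& get(long key) {
    std::list<std::pair<long, CacheEntry> >::iterator it;
    for (it = cache.begin(); it != cache.end(); ++it) {  // O(n) search
        if (it->first == key) {
            cache.splice(cache.begin(), cache, it);      // move to front
            return cache.front().second;
        }
    }
    cache.push_front(std::make_pair(key, fetch_from_source(key)));
    while (cache.size() > 1000)
        cache.pop_back();                                // trim the oldest
    return cache.front().second;
}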

Timo Geusch
Age is not the issue. It is the number of accesses to a cached object that is the criterion for retaining it.
anon
Hmmm... the OP seems self-contradictory. "...we assume that items in the cache that were recently accessed will probably be accessed again."
kotlinski
I think that by "will be set to an increasing counter", the OP means he has a global increasing counter, and will set accessStamp to the value of that counter. He doesn't say, "accessStamp is an increasing counter".
Steve Jessop
The OP said "we want to remove the item that was not accessed for the longest time", which to me sounded like he was after some sort of Least Recently Used algorithm. Hence the suggestion of keeping the recently used entries at the beginning of the list and removing 'too old' entries from the end of the list.
Timo Geusch
I thought of this solution, but as you said, the search action, which I expect to occur much more often than removing an item from a full cache, will take longer. That's why I dismissed this option.
splintor
It might still be worth trying out - 1000 elements is not that much and I'd expect even a linear search to be quite fast.
Timo Geusch
+11  A: 

Have your map<long,CacheEntry> but instead of having an access timestamp in CacheEntry, put in two links to other CacheEntry objects to make the entries form a doubly-linked list. Whenever an entry is accessed, move it to the head of the list (this is a constant-time operation). This way you will both find the cache entry easily, since it's accessed from the map, and are able to remove the least-recently used entry, since it's at the end of the list (my preference is to make doubly-linked lists circular, so a pointer to the head suffices to get fast access to the tail as well). Also remember to put in the key that you used in the map into the CacheEntry so that you can delete the entry from the map when it gets evicted from the cache.
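A minimal sketch of that layout (Data and fetch_from_source are placeholders for the real payload and the slow retrieval; std::map never moves its nodes, so the intrusive pointers stay valid):

#include <map>
#include <utility>

struct CacheEntry {
    long key;           // copy of the map key, needed when evicting
    Data data;          // placeholder for the cached payload
    CacheEntry* prev;   // intrusive links: the entries themselves form
    CacheEntry* next;   // a circular doubly-linked list in usage order
};

std::map<long, CacheEntry> cache;
CacheEntry* head = 0;   // most recently used; head->prev is the LRU tail

void unlink(CacheEntry* e) {
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

void link_front(CacheEntry* e) {
    if (head == 0) {
        e->prev = e->next = e;   // a one-element circle
    } else {
        e->next = head;
        e->prev = head->prev;
        head->prev->next = e;
        head->prev = e;
    }
    head = e;
}

CacheEntry& get(long key) {
    std::map<long, CacheEntry>::iterator it = cache.find(key);
    if (it != cache.end()) {
        CacheEntry* e = &it->second;
        if (e != head) {          // move to front: constant time
            unlink(e);
            link_front(e);
        }
        return *e;
    }
    if (cache.size() >= 1000) {
        CacheEntry* lru = head->prev;   // tail of the circle
        long victim = lru->key;
        unlink(lru);
        cache.erase(victim);            // destroys the evicted entry
    }
    CacheEntry fresh;
    fresh.key = key;
    fresh.data = fetch_from_source(key);  // the slow path
    CacheEntry* e = &cache.insert(std::make_pair(key, fresh)).first->second;
    link_front(e);
    return *e;
}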

jk
I like this - probably the best solution that uses two collections. I'd want to do tests to prove that it actually is faster than the naive map scan implementation, however.
anon
Odd, I missed this answer before writing my own, which is identical. +1, then.
Steve Jessop
This indeed looks like the correct answer for the problem. However, given the tight schedule we are under, and the limited debugging ability (we compile the code using the SAS/C compiler and copy it to a z/OS mainframe), it seems too risky to manage our own linked list. Therefore, I think I'll stick with my original design and use Neil Butterworth's advice that scanning a map of 1000 items takes little time.
splintor
So doesn't that make mine the correct answer?
anon
It's hard to tell - as I said, the "correct answer" to my question seems to be this answer, but given my circumstances, your answer suits me better. I wish I could mark both as answers. However, since I'm counting on your answer, I guess you deserve that to be an answer...
splintor
I was only joking! Give jk the points - he needs them more than me!
anon
splintor, fair enough: getting a linked list completely correct can be tricky. And Neil, if you're so worried about me getting points, why don't you go upvote some of my other answers? ;-)
jk
OK, Neil - it's your call...
splintor
A: 

I believe this is a good candidate for treaps. The priority would be the time (virtual or otherwise), in ascending order (older at the root), and the long key would be the search key.

There's also the second chance algorithm, which is good for caches. Although you lose search ability, it won't have a big impact if you only have 1000 items.

The naïve method would be to have a map associated with a priority queue, wrapped in a class. You use the map to search and the queue to remove (first remove from the queue, grabbing the item, and then remove by key from the map).
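For completeness, a minimal sketch of the second-chance (clock) eviction mentioned above (CachedItem as elsewhere in the thread; a separate lookup structure would still be needed alongside the slots):

struct Slot {
    long key;
    CachedItem* item;
    bool referenced;    // set on every access, cleared as the hand passes
};

Slot slots[1000];
int hand = 0;           // the clock hand sweeps the slots in a circle

int pick_victim() {
    // A slot touched since the hand last passed gets a "second chance":
    // its bit is cleared and the hand moves on to the next slot.
    while (slots[hand].referenced) {
        slots[hand].referenced = false;
        hand = (hand + 1) % 1000;
    }
    int victim = hand;
    hand = (hand + 1) % 1000;
    return victim;
}

On a cache hit you would set the slot's referenced flag; on a miss you replace slots[pick_victim()].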

Silver Phoenix
I'll look into it, but it seems too complex for the task at first glance.
splintor
A: 

Another option might be to use boost::multi_index. It is designed to separate indexes from data, thereby allowing multiple indexes on the same data.

I am not sure this would really be faster than scanning through 1000 items. It might use more memory than it's worth, or slow down search and/or insert/remove.
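For what it's worth, the usual multi_index LRU layout pairs a sequenced index (usage order, no timestamp key needed) with a hashed index for lookup. A sketch assuming Boost is available on the platform (CachedItem and noncached_fetch as elsewhere in the thread):

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/sequenced_index.hpp>

namespace mi = boost::multi_index;

struct Item {
    long key;
    CachedItem* value;
};

typedef mi::multi_index_container<
    Item,
    mi::indexed_by<
        mi::sequenced<>,                                        // usage order
        mi::hashed_unique<mi::member<Item, long, &Item::key> >  // lookup
    >
> Cache;

Cache cache;

CachedItem* get(long key) {
    Cache::nth_index<1>::type& by_key = cache.get<1>();
    Cache::nth_index<1>::type::iterator it = by_key.find(key);
    if (it != by_key.end()) {
        // Move the hit to the front of the sequenced index; no re-keying.
        cache.relocate(cache.begin(), cache.project<0>(it));
        return it->value;
    }
    Item item;
    item.key = key;
    item.value = noncached_fetch(key);   // the slow path
    cache.push_front(item);
    if (cache.size() > 1000)
        cache.pop_back();                // drop the least recently used
    return item.value;
}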

jpyllman
multi_index will require removing and re-adding an item on each access, as each index key is constant, and accessStamp, which I would index by, changes on every access. So I guess I'll get no time benefit from this solution.
splintor
A: 

My approach needs a hash table for looking up stored objects quickly, and a linked list for maintaining the order in which they were last used.

When an object is requested:
1) Try to find the object in the hash table.
2) If found (the value holds a pointer to the object's node in the linked list), move that node to the top of the linked list.
3) If not found, remove the last object from the linked list and remove its data from the hash table as well, then put the new object into the hash table and at the top of the linked list.

For example, let's say we have cache memory for only 3 objects.

The request sequence is 1 3 2 1 4.

1) Hash-table : [1] Linked-list : [1]

2) Hash-table : [1, 3] Linked-list : [3, 1]

3) Hash-table : [1,2,3] Linked-list : [2,3,1]

4) Hash-table : [1,2,3] Linked-list : [1,2,3]

5) Hash-table : [1,2,4] Linked-list : [4,1,2] => 3 out

Seungyoung