I'm working on software for a mobile platform where memory is very limited. In an I/O bottleneck function, I need to read some bytes from an img file using a seek operation (you can assume that a seek is about 10 times slower than reading directly from memory). In my test, this function is called 7480325 times and reads bytes at offsets from 6800 to 130000, so every byte is read about 100 times on average (some bytes are read 3~4 times, some more than 1000 times).
Below are my statistics:
bytes offset 6800 ~ 6900: 170884 times
bytes offset 6900 ~ 7000: 220944 times
bytes offset 7000 ~ 7100: 24216 times
bytes offset 7100 ~ 7200: 9576 times
bytes offset 7200 ~ 7300: 14813 times
bytes offset 7300 ~ 7400: 22109 times
bytes offset 7400 ~ 7500: 19748 times
bytes offset 7500 ~ 7600: 43110 times
bytes offset 7600 ~ 7700: 157976 times
...
bytes offset 121200 ~ 121300: 1514 times
bytes offset 121300 ~ 121400: 802 times
bytes offset 121400 ~ 121500: 606 times
bytes offset 121500 ~ 121600: 444 times
bytes offset 121600 ~ 121700: 398 times
max_bytes_offset 121703
min_bytes_offset 6848
Now I want to build a cache using an LRU scheme to improve the I/O performance. From others' questions I found that a hash table + doubly-linked list is a good approach. But how should I shape that structure to fit my case best? My plan is to build 1300 buckets, each owning a doubly-linked list with a max size of 10, so the total memory it takes is about 13KB. It is simple to implement and maintain, but I suspect the efficiency is not the best.
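To make the plan concrete, here is a rough sketch of that bucketed layout (C++; everything in it is just my assumption for illustration: the key is the raw byte offset, the bucket is chosen by offset % 1300, and each bucket is a tiny array kept in most-recently-used order instead of a real linked list, because with only 10 slots shifting an array amounts to the same thing):

```cpp
#include <cstddef>
#include <cstdint>

// Sizes taken from my plan above: 1300 buckets, at most 10 entries each.
static const size_t NUM_BUCKETS = 1300;
static const size_t BUCKET_CAP  = 10;

struct Entry {
    uint32_t offset;  // byte offset in the img file, used as the key
    uint8_t  value;   // the cached byte
    bool     valid;
};

// Slots are kept in most-recently-used-first order, so the last valid
// slot is always the LRU victim.
struct Bucket {
    Entry slots[BUCKET_CAP];
};

class BucketedLruCache {
public:
    BucketedLruCache() {
        for (size_t b = 0; b < NUM_BUCKETS; ++b)
            for (size_t i = 0; i < BUCKET_CAP; ++i)
                buckets_[b].slots[i].valid = false;
    }

    // Returns true and fills *out if the byte at 'offset' is cached.
    bool get(uint32_t offset, uint8_t* out) {
        Bucket& b = buckets_[offset % NUM_BUCKETS];  // assumed hash: offset mod bucket count
        for (size_t i = 0; i < BUCKET_CAP; ++i) {
            if (b.slots[i].valid && b.slots[i].offset == offset) {
                Entry hit = b.slots[i];
                for (size_t j = i; j > 0; --j)       // move the hit to the MRU position
                    b.slots[j] = b.slots[j - 1];
                b.slots[0] = hit;
                *out = hit.value;
                return true;
            }
        }
        return false;
    }

    // Insert a freshly read byte; call this only after a get() miss so the
    // same offset is never stored twice. The entry in the last slot (the
    // LRU one) falls out of the cache.
    void put(uint32_t offset, uint8_t value) {
        Bucket& b = buckets_[offset % NUM_BUCKETS];
        for (size_t j = BUCKET_CAP - 1; j > 0; --j)
            b.slots[j] = b.slots[j - 1];
        b.slots[0].offset = offset;
        b.slots[0].value  = value;
        b.slots[0].valid  = true;
    }

private:
    Bucket buckets_[NUM_BUCKETS];
};
```

One thing I noticed while writing this down: once every slot also stores its own key and a valid flag, the real footprint is several times larger than the 13KB of cached payload bytes, so my 13KB figure only counts the data itself.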
In my statistics, some byte-offset intervals have a much higher hit ratio than others. How can I build a structure that adapts to these statistics?
Also, when I search for a key, I need to traverse the whole list of size 10. Is there another method with higher search efficiency?
Some mobile platforms allow more memory for the cache while others allow less. How can I make my cache adapt to the allowed memory, other than changing the number of buckets?
It seems that caf's method is better: using one big doubly-linked list and one big hash table mapping keys to node entries makes more sense and takes better advantage of LRU. But then designing the hash function becomes the hard part.
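To check that I understand that design, here is a minimal sketch of it (again C++, and again only my assumptions: the key is the byte offset, std::unordered_map with its default integer hash plays the role of the hash table, and the capacity is a constructor argument):

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>

// One doubly-linked list kept in recency order plus one hash table mapping
// offsets to list nodes, so a lookup never scans a list. Capacity is a
// runtime parameter (assumed > 0), so the same code can use more or less
// memory depending on the platform.
class LruCache {
    typedef std::list<std::pair<uint32_t, uint8_t> > List;  // (offset, byte), MRU first
public:
    explicit LruCache(size_t capacity) : capacity_(capacity) {}

    bool get(uint32_t offset, uint8_t* out) {
        auto it = index_.find(offset);
        if (it == index_.end()) return false;
        order_.splice(order_.begin(), order_, it->second);  // move node to front, O(1)
        *out = it->second->second;
        return true;
    }

    void put(uint32_t offset, uint8_t value) {
        auto it = index_.find(offset);
        if (it != index_.end()) {                 // already cached: refresh and promote
            it->second->second = value;
            order_.splice(order_.begin(), order_, it->second);
            return;
        }
        if (index_.size() >= capacity_) {         // full: evict the least recently used
            index_.erase(order_.back().first);
            order_.pop_back();
        }
        order_.push_front(std::make_pair(offset, value));
        index_[offset] = order_.begin();
    }

private:
    size_t capacity_;
    List order_;
    std::unordered_map<uint32_t, List::iterator> index_;  // offset -> list node
};
```

If something like this is what caf meant, then maybe the hash function is not really a problem: the key is just an integer offset, so the default integer hash (or even offset modulo the table size) should spread it well enough. And since the capacity is a plain runtime number, this also seems to cover my question about adapting to the allowed memory. Is that right?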
I'm looking forward to your suggestions, thank you~