Hi All,
I'm working on an application that does processing at what I'd call fairly high throughput (current peaks in the range of 400 Mbps, design goal of eventual 10 Gbps).
I run multiple instances of a loop that cycles through reading and processing records, and each loop uses a dictionary to hold state. However, I also need to scan the entire dictionary periodically to check for timeouts, and I'd like to solicit ideas on what to do if this scan becomes a performance hotspot.
What I'm looking for is any standard technique for interleaving the timeout checks on the dictionary with the main processing code in the loop, so that on loop iteration 1 I check the first dictionary item, on iteration 2 the second, and so on. The dictionary's keys change, with entries added and deleted by the main processing code, so it's not as simple as taking a copy of all the keys and then checking them one by one in the main loop.
To reiterate, this is not a current performance problem. So please, no comments about premature optimization: I realize it's premature, and I am consciously choosing to treat this as a potential problem.
Edit for clarity: This is a curiosity I'm thinking about on my weekend, and I'm wondering what a best-practice approach to something like this might be. It isn't the only problem I have, nor the only area of performance I'm looking at. It is, however, one area where I'm not aware of a clean, concise approach.
I'm already exploiting parallelism and hardware on this (the next level of hardware is a 5x increase in cost and, more significantly, would require a redesign of the parallelism). The parallelism is also working the way I want it to, so again, there's no need to comment on it. The dictionary is instantiated per thread, so any additional threads for running the checks would require synchronization between threads, which is too costly.
Some pseudocode of the logic, if it helps:
Dictionary hashdb;
while (true) {
    record = grab_record_from_buffer();  // There is a buffer in place, so some delays are tolerable
    process(record);                     // Do the main processing
    update_hashdb();                     // Add, remove, update entries in the dictionary
    if (now() - last_scan > 15 seconds) {
        foreach (entry in hashdb)
            periodic_check(entry);       // Check for timeouts or any other periodic checks on every db entry
        last_scan = now();               // Restart the 15-second scan interval
    }
}
I do realize I may never run into an actual problem with the way I have it, so there's a good chance that whatever comes up won't end up being needed. What I'm really looking for is whether there is a standard approach or algorithm, one I'm just not aware of, for interleaving a scan of a changing dictionary with the main processing logic. Suggestions on an approach are also welcome (I already have an idea of how I would approach it, but it's not as clean as I'd like).
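To make the question concrete, here is a minimal sketch of one possible way the interleaving could look. It is illustrative Python, not my actual code, and every name in it (`InterleavedTable`, `touch`, `check_some`, `budget`) is hypothetical. It assumes a timeout is the only periodic check, and it keeps a queue of (key, entry) pairs next to the dictionary so that each loop iteration checks only a few entries. Deleted or replaced entries are detected lazily by comparing object identity when their queue slot comes up.

```python
from collections import deque


class Entry:
    """Hypothetical per-key state; only the last-seen time matters here."""

    def __init__(self, last_seen):
        self.last_seen = last_seen


class InterleavedTable:
    """Per-thread state dictionary with an incremental, round-robin timeout scan."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.entries = {}          # key -> Entry (the state dictionary)
        self.scan_queue = deque()  # (key, Entry) pairs in scan order

    def touch(self, key, now):
        """Add the key if it is new, otherwise refresh its last-seen time."""
        entry = self.entries.get(key)
        if entry is None:
            entry = Entry(now)
            self.entries[key] = entry
            self.scan_queue.append((key, entry))
        else:
            entry.last_seen = now

    def remove(self, key):
        """Delete from the dictionary only; the stale queue slot is dropped later."""
        self.entries.pop(key, None)

    def check_some(self, now, budget=1):
        """Examine at most `budget` queue slots and return the keys that timed out."""
        expired = []
        for _ in range(min(budget, len(self.scan_queue))):
            key, entry = self.scan_queue.popleft()
            if self.entries.get(key) is not entry:
                continue  # deleted or replaced since it was queued
            if now - entry.last_seen > self.timeout:
                del self.entries[key]
                expired.append(key)
            else:
                self.scan_queue.append((key, entry))  # check again next round
        return expired
```

In the main loop this would mean calling `check_some(now, budget)` once per record instead of running the full `foreach` every 15 seconds. The `budget` would have to be large enough, relative to the record rate and dictionary size, that every entry is still visited within the timeout window, and stale queue slots use up part of that budget.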
Thank You,