If one has a peer-to-peer system that can be queried, one would like to
- reduce the total number of queries across the network (by distributing "popular" items widely and "similar" items together)
- avoid excess storage at each node
- assure good availability to even moderately rare items in the face of client downtime, hardware failure, and users leaving (possibly detecting rare items for archivists/historians)
- avoid queries failing to find matches in the event of network partitions
Given these requirements:
- Are there any standard approaches? If not, is there any respected, but experimental, research? I'm familiar some with distribution schemes, but I haven't seen anything really address learning for robustness.
- Am I missing any obvious criteria?
- Is anybody interested in working on/solving this problem? (If so, I'm happy to open-source part of a very lame simulator I threw together this weekend, and generally offer unhelpful advice).
@cdv: I've now watched the video and it is very good, and although I don't feel it quite gets to a pluggable distribution strategy, it's definitely 90% of the way there. The questions, however, highlight useful differences with this approach that address some of my further concerns, and gives me some references to follow up on. Thus, I'm provisionally accepting your answer, although I consider the question open.