[BACKGROUND] We are currently trying to solve a performance problem: searching for data and presenting it in a paginated way takes about 2-3 minutes.
After further investigation (and several rounds of SQL tuning), it seems that searching is slow simply because of the sheer amount of data.
A possible solution that I'm currently investigating is to replicate the data into a searchable cache. This cache could live inside the database (e.g. a materialized view) or outside it (a NoSQL approach). Since I would like the cache to be horizontally scalable, I am leaning towards keeping it outside the database.
I've created a proof of concept, and searching in my cache is indeed faster than in the db. However, the initial full replication takes a long time to complete. The full replication will happen only once, and every subsequent replication will be incremental, covering only the data that changed since the last replication. Even so, it would be great if I could speed up the initial full replication.
During full replication, aside from the slow query execution, I also have to battle network latency. I can live with the slow query execution time, but the network latency is what really slows the replication down.
[ACTUAL QUESTION] This leads me to my question: how can I speed up my replication? Should I spawn several threads, each one running its own query? Should I use a scrollable result set? Or is there something else I should try?
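
To make the multi-threaded option concrete, here is a minimal sketch of what I have in mind. It assumes Java and assumes the source table has a numeric key that can be split into ranges; neither is a given. `RangeFetcher`, `ParallelReplicator` and `replicate` are hypothetical names made up for illustration, and the fetcher stands in for whatever query actually loads one key range from the db.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical callback: loads all rows whose key lies in [fromId, toId]. */
interface RangeFetcher<T> {
    List<T> fetch(long fromId, long toId);
}

final class ParallelReplicator {

    /**
     * Splits [minId, maxId] into chunks of chunkSize keys, fetches the chunks
     * on a fixed pool of worker threads, and hands each fetched chunk to the
     * sink (the cache writer). The sink is called from the worker threads, so
     * it must be thread-safe. Returns the total number of rows copied.
     */
    static <T> long replicate(long minId, long maxId, long chunkSize, int threads,
                              RangeFetcher<T> fetcher, Consumer<List<T>> sink) {
        if (chunkSize <= 0 || threads <= 0) {
            throw new IllegalArgumentException("chunkSize and threads must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (long from = minId; from <= maxId; from += chunkSize) {
                final long lo = from;
                final long hi = Math.min(from + chunkSize - 1, maxId);
                // Each task is one independent query, so several network
                // round trips are in flight at the same time.
                pending.add(pool.submit(() -> {
                    List<T> rows = fetcher.fetch(lo, hi);
                    sink.accept(rows);
                    return rows.size();
                }));
            }
            long total = 0;
            for (Future<Integer> chunk : pending) {
                total += chunk.get(); // waits for the chunk; surfaces its failure
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("replication interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a chunk failed to replicate", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The idea is that the threads overlap their waiting time on the network instead of paying for each round trip one after another. Whether this actually helps would depend on how many concurrent queries the db can serve, which I have not measured.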