I'm deciding how to split 3 large Sphinx indexes across 3 servers. Each of the 3 indexes is searched separately.

Which is more effective in terms of performance (speed of search)?

  1. to host each index on a separate machine

Example

machine1 - index1
machine2 - index2
machine3 - index3
  2. or to split each index into 3 parts and host each part of the same index on a separate machine.

Example

machine1 - index1_chunk1, index2_chunk1, index3_chunk1
machine2 - index1_chunk2, index2_chunk2, index3_chunk2
machine3 - index1_chunk3, index2_chunk3, index3_chunk3


A: 

Intuitively, I would say the first option would be more efficient.

In this scenario, when you search any of the indexes - say, index1 - the system just looks up the machine it is hosted on, searches it, and returns the resultset.

In your second scenario, for each index to be searched, the machine would have to distribute the search across three machines, keep track of the different machine IDs and where to find each chunk, and then finally collate the resultsets before giving you results.
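For what it's worth, the chunked layout in the second scenario maps onto Sphinx's distributed index type. A minimal sketch of how index1 might be declared on machine1 (the host names, the default searchd port 9312, and the chunk names are assumptions carried over from the example above):

index index1
{
    type  = distributed
    # chunk stored locally on machine1
    local = index1_chunk1
    # chunks hosted on the other machines, queried via their searchd
    agent = machine2:9312:index1_chunk2
    agent = machine3:9312:index1_chunk3
}

With a setup like this, searchd queries the agents and merges their results itself, so the collation described above happens inside searchd rather than in your application.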

viksit
Yes, but in the second scenario the system would be able to search the same index in parallel, which may be good for performance. There will be few chunks, so it's not a big problem to find and keep the table of chunks in memory.
Andriy Bohdan