What is index hashing? What are its advantages over regular hashing techniques?

A: 

Hello

Index Hashing

Searchable content is mapped to the search engine using Compass's different mapping definitions (OSEM/XSEM/RSEM). Compass provides the ability to partition the searchable content into different sub indexes, as shown in the next diagram:

Sub Index Hashing

http://www.opensymphony.com/compass/versions/1.1M1/html/images/subindex-hash.png

In the above diagram, A, B, C, and D represent aliases, which in turn stand for the mapping definitions of the searchable content. A1, B2, and so on are actual instances of that searchable content. The diagram shows the different options for mapping searchable content into different sub indexes.

Constant Sub Index Hashing

The simplest way to map an alias (which stands for the mapping definition of a searchable content) is to map all of its searchable content instances into the same sub index. How searchable content is mapped to the search engine (OSEM/XSEM/RSEM) is defined within the respective mapping definitions. There are two ways to define a constant mapping to a sub index. The first, and simpler, one is:

<compass-core-mapping>
  <[mapping] alias="test-alias" sub-index="test-subindex">
    <!-- ... -->
  </[mapping]>
</compass-core-mapping>

The [mapping] represented by the alias test-alias maps all its instances to test-subindex. Note that if sub-index is not defined, it defaults to the alias value.
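As a minimal sketch of the defaulting rule just described (this is not Compass's own code; the class and method names are hypothetical), the constant sub index for a mapping could be resolved like this in Java:

```java
// Hypothetical illustration only; not part of the Compass API.
final class ConstantSubIndexSketch {

    // Returns the configured sub-index, or falls back to the alias
    // when no sub-index attribute was given (passed here as null).
    static String resolve(String alias, String subIndex) {
        return subIndex != null ? subIndex : alias;
    }
}
```

Every instance of the alias ends up in the same sub index, whichever branch is taken.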

Another option, which will probably not be used to define constant sub index hashing but is shown here for completeness, is to specify the constant implementation of SubIndexHash within the mapping definition (explained in detail later in this section):

<compass-core-mapping>
  <[mapping] alias="test-alias">
    <sub-index-hash type="org.compass.core.engine.subindex.ConstantSubIndexHash">
        <setting name="subIndex" value="test-subindex" />
    </sub-index-hash>
    <!-- ... -->
  </[mapping]>
</compass-core-mapping>

Modulo Sub Index Hashing

Constant sub index hashing maps an alias (and all the searchable instances it represents) into the same sub index. Modulo sub index hashing allows an alias to be partitioned across several sub indexes. The partitioning is done by hashing the alias value together with the string values of all the searchable content ids, and then applying the modulo operation against a specified size. It also allows setting a constant prefix for the generated sub index value. This is shown in the following diagram:

Modulo Sub Index Hashing

Here, A1, A2 and A3 represent different instances of alias A (be it a mapped Java class in OSEM, a Resource in RSEM, or an XmlObject in XSEM), each with a single id mapping whose values are 1, 2, and 3 respectively. Modulo hashing is configured with a prefix of test and a size of 2. This results in the creation of 2 sub indexes, called test_0 and test_1. Based on the hashing function (the alias string hash code and the hash codes of the different id strings), instances of A are directed to their respective sub index. Here is how the A alias would be configured:

<[mapping] alias="A">
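The hashing step described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Compass's ModuloSubIndexHash: the class name is hypothetical, ids are passed as plain strings instead of Compass Property objects, and the way the hash codes are combined (multiply by 31 and add) is an assumption, since the text only says that the alias and id string hash codes are used.

```java
// Hypothetical stand-in; not the Compass ModuloSubIndexHash class.
final class ModuloSubIndexSketch {
    private final String prefix;
    private final int size;

    ModuloSubIndexSketch(String prefix, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.prefix = prefix;
        this.size = size;
    }

    // All sub indexes this configuration can produce: prefix_0 .. prefix_(size-1).
    String[] getSubIndexes() {
        String[] result = new String[size];
        for (int i = 0; i < size; i++) {
            result[i] = prefix + "_" + i;
        }
        return result;
    }

    // Combines the alias hash code with each id's string hash code,
    // then reduces the result modulo the configured size.
    String mapSubIndex(String alias, String[] idValues) {
        int hash = alias.hashCode();
        for (String id : idValues) {
            hash = 31 * hash + id.hashCode(); // assumed combining rule
        }
        // floorMod keeps the bucket non-negative even if the hash overflowed.
        return prefix + "_" + Math.floorMod(hash, size);
    }
}
```

With a prefix of test and a size of 2, this sketch can only ever return test_0 or test_1, and the same alias and id always land in the same sub index. Which instance goes where depends on the assumed combining rule, so it need not match the diagram.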

Naturally, more than one mapping definition can map to the same sub indexes using the same modulo configuration:

Complex Modulo Sub Index Hashing

Custom Sub Index Hashing

ConstantSubIndexHash and ModuloSubIndexHash are implementations of the Compass SubIndexHash interface that come built in with Compass. Naturally, a custom implementation of the SubIndexHash interface can be configured in the mapping definition.

An implementation of SubIndexHash must provide two operations. The first, getSubIndexes, must return all the possible sub indexes the sub index hash implementation can produce. The second, mapSubIndex(String alias, Property[] ids), uses the provided alias and ids to compute the sub index for a given instance. If the sub index hash implementation also implements the CompassConfigurable interface, different settings can be injected into it. Here is an example of a mapping definition with a custom sub index hash implementation:

<compass-core-mapping>
  <[mapping] alias="A">
    <sub-index-hash type="eg.MySubIndexHash">
        <setting name="param1" value="value1" />
        <setting name="param2" value="value2" />
    </sub-index-hash>
    <!-- ... -->
  </[mapping]>
</compass-core-mapping>
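To make the two-operation contract more concrete, here is a self-contained Java sketch. It does not compile against Compass: SubIndexHashSketch is a hypothetical stand-in for the SubIndexHash interface, ids are simplified to plain strings rather than Property objects, and the prefix is passed through the constructor instead of being injected via CompassConfigurable. The routing rule (by first letter of the first id) is made up purely for illustration.

```java
// Hypothetical stand-in for the two operations described above.
interface SubIndexHashSketch {
    String[] getSubIndexes();
    String mapSubIndex(String alias, String[] idValues);
}

// Example custom strategy: route by the first letter of the first id.
final class FirstLetterSubIndexHash implements SubIndexHashSketch {
    private final String prefix;

    FirstLetterSubIndexHash(String prefix) {
        this.prefix = prefix;
    }

    // Must list every sub index that mapSubIndex can possibly return.
    @Override
    public String[] getSubIndexes() {
        return new String[] { prefix + "_a_m", prefix + "_n_z", prefix + "_other" };
    }

    @Override
    public String mapSubIndex(String alias, String[] idValues) {
        if (idValues.length == 0 || idValues[0].isEmpty()) {
            return prefix + "_other";
        }
        char c = Character.toLowerCase(idValues[0].charAt(0));
        if (c >= 'a' && c <= 'm') {
            return prefix + "_a_m";
        }
        if (c >= 'n' && c <= 'z') {
            return prefix + "_n_z";
        }
        return prefix + "_other"; // digits, punctuation, non-Latin letters
    }
}
```

The important design constraint is that the two operations stay consistent: every value mapSubIndex can return has to appear in getSubIndexes.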

Source: http://www.opensymphony.com/compass/versions/1.1M1/html/core-searchengine.html

Mike Redford
Not only is your answer likely to be a violation of copyright (the exception for quoting in most countries' laws requires that the quote is not the entire text where it is placed), the source is also much more descriptive as it has illustrations. I get the feeling that you just googled for index hashing and copied and pasted what you found without really understanding the topic.
Fredrik
Let's not just upmod this comment, let's downmod/report the answer then
George Jempty