tags:

views:

52

answers:

2

Hi,
My question is pretty similar to this question
The difference, I'd need the least RAM intensive way to gather information about the distinct values. I DON'T care for the actual count in this case, I just want to know the possible values for that field.
I'm constantly running out of heap space (30 million+ documents) and there has to be some way/parameter to do this in a memory saving way

A: 

I don't know about RAM usage, but you might wanna try Field collapsing You will find the patch for Solr here.

Jem
That seems to be only relevant for the result set. I don't let solr return any rows. I'm only interested in the facet fields
Marc Seeger
A: 

If the number of distinct values is high, you will probably need to do facet paging. Use the facet.offset and facet.limit parameters.

Pascal Dimassimo
what would be "high"? The top field probably has 100 possible values
Marc Seeger
The default is 100, so it is usually not considered "high". But try with facet.limit=10 and see how it goes.
Pascal Dimassimo