I have a PostgreSQL database. The table I need to index has about 20 million rows. When I try to index them all in one go (something like "select * from table_name"), I get a Java OutOfMemoryError, even if I give the JVM more memory.
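For reference, the DIH setup for a case like this looks roughly as follows; this is a minimal sketch, where the connection URL, credentials, and table/column names are placeholders rather than my real schema, and batchSize is just the fetch-size hint handed to the JDBC driver:

```xml
<!-- data-config.xml: minimal DIH setup for one PostgreSQL table.
     URL, credentials, and table/column names are placeholders.
     batchSize is passed to the driver as the statement fetch size;
     whether rows are actually streamed depends on the driver. -->
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="org.postgresql.Driver"
              url="jdbc:postgresql://localhost:5432/mydb"
              user="solr" password="secret"
              batchSize="10000"/>
  <document>
    <entity name="item" query="select id, title, body from table_name">
      <field column="id"    name="id"/>
      <field column="title" name="title"/>
      <field column="body"  name="body"/>
    </entity>
  </document>
</dataConfig>
```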
Is there any option in Solr to index the table part by part (e.g. run the SQL for the first 1,000,000 rows and index them, then run it for the second million, and so on)?
Right now I am using an SQL query with LIMIT, but every time Solr finishes indexing a batch, I have to start the next one manually.
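Concretely, the entity query in that setup looks something like the sketch below (column names are placeholders); after each run finishes, the offset has to be bumped by hand and the import restarted:

```xml
<!-- Hardcoded batch boundaries: after this import completes, the
     offset must be edited by hand and the import started again. -->
<entity name="item"
        query="select id, title, body from table_name
               order by id limit 1000000 offset 0"/>
```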
UPDATE: OK, 1.4 is out now. No more OutOfMemory exceptions; it seems Apache has done a lot of work on the DIH. Also, we can now pass parameters through the request and use them in our SQL selects. Wow!
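To illustrate the request parameters: the entity query can reference ${dataimporter.request.offset}, where "offset" is a parameter name I chose myself, not a built-in, so each batch is selected by the URL that triggers the import. A sketch, again with placeholder schema names:

```xml
<!-- data-config.xml: paged import driven by a request parameter.
     "offset" is an arbitrary name picked here; it is read from the
     import URL via ${dataimporter.request.offset}. -->
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="org.postgresql.Driver"
              url="jdbc:postgresql://localhost:5432/mydb"
              user="solr" password="secret"/>
  <document>
    <entity name="item"
            query="select id, title, body from table_name
                   order by id
                   limit 1000000 offset ${dataimporter.request.offset}">
      <field column="id"    name="id"/>
      <field column="title" name="title"/>
      <field column="body"  name="body"/>
    </entity>
  </document>
</dataConfig>
```

Each batch can then be kicked off with something like http://localhost:8983/solr/dataimport?command=full-import&clean=false&offset=0, then ...&offset=1000000, and so on (clean=false so each run adds to the index instead of wiping it first).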