zend-search-lucene

Performance and bottlenecks of Zend_Search_Lucene?

I've been using Nutch for a while and only recently learned about this alternative. How is its performance, and what file size limit does it support? Also, how can I delete or update an index entry instead of re-indexing each time there is a modification? ...

Zend Lucene fails all searches with special characters. Halp?

Hey, if anyone knows a simple answer to this, I don't have to wade through creating an extra index with escaped strings and crying my eyes out while littering my pretty code. Basically, the Lucene search we have running cannot handle any non-letter characters. Space, percent signs, dots, dashes, slashes, you name it. This is highly infur...

Creating and updating Zend_Search_Lucene indexes

I'm using Zend_Search_Lucene to create an index of articles to allow them to be searched on my website. Whenever an administrator updates/creates/deletes an article in the admin area, the index is rebuilt: $config = Zend_Registry::get("config"); $cache = $config->lucene->cache; $path = $cache . "/articles"; try { $index = Zend_Searc...

Zend Lucene misbehaving: Queries work one by one but not together.

Ok, so here's the deal: Lucene does the weirdest things to me. Everything is indexed properly, everything works, everything's fast etc etc. So I search for a category in English. Hundreds of results pop out. So I search for a country in English. Hundreds of results pop out. So I search for a category AND a country in English. A combinat...

Zend_Search_Lucene query parsing problem

Here's the setup, I have a Lucene Index and it works well with the 2,000 documents I have indexed. I have been using Luke (Lucene Index Toolbox, v.0.9.2) to debug queries, and am using ZF 1.9. The layout for my Lucene Index is as follows: I = Indexed T = Tokenized S = Stored Fields: author - ITS category - ITS publication - ITS public...

Zend Lucene Index Merge

Just migrated my PHP web app to another server with a new db and now I'm trying to migrate Lucene's index to the new server. Is it even possible to move my index to another server? And can we access the search index (that is stored on one server, say server A) from another server (say server B)? If yes, then where can I find info about that? T...

Zend_Search_Lucene - How do I limit the results to a certain language?

Hi folks, I have indexed a website which is available in 14 languages, so far so good. Now I want to limit my lucene search to display only results in the visitor's language. Is there any (query)parameter or any option that I can set? Unfortunately I did not find anything. I am working with Zend_Search_Lucene if this should be relev...

Zend_Search_Lucene search in array

Is there a way to store an array as a document field and then query that array? I've got a collection of items, which are tagged. I'd like to be able to search all items that match for example tags 55 and 67. How would I achieve this? ...

Zend: index generation and the pros and cons of Zend_Search_Lucene

I've never come across an app/class like Zend Search Lucene before, as I've always queried my database. Zend_Search_Lucene operates with documents as atomic objects for indexing. A document is divided into named fields, and fields have content that can be searched. A document is represented by the Zend_Search_Lucene_Do...

Zend Lucene - tokenizing Swedish chars

I use Zend Lucene to index Swedish texts. The problem is that Lucene tokenizes words at the Swedish chars åäö. For example, the word "världens" becomes two words, "v" and "ldens", in the index. Is there a way to add characters that Zend Lucene should accept and not tokenize at? ...

Zend Search returns nothing for "video" but matches "videos": how to search for part of a word

Hi, I don't understand the way Zend Search Lucene works. The index returns nothing when I type words in the singular, whereas when they are in the plural, it matches. video = nothing. videos = it works. I tried different words and can't manage to search for part of a word. Regards ...

Zend Lucene segment sizes

I'm getting errors on a Zend FW site (which I'm not that familiar with), stating: [message:protected] => Largest supported segment size (for 32-bit mode) is 2Gb I found this after noticing the search function on this site stopped working. I'm not sure if "segment size" is the same as the index size, but the folder containing the 3 indexes ...

In Zend Lucene, how can I change the field which a query searches?

I am trying to create an "advanced search", where I can let the user search only specific fields of my index. For that, I'm using a boolean query: $sq1 = Zend_Search_Lucene_Search_QueryParser::parse($field1); // <- provided by user $sq2 = Zend_Search_Lucene_Search_QueryParser::parse($field2); // <- provided by user $query = new Zend_Se...

Wildcard Query in Zend Lucene

$index = Zend_Search_Lucene::open("/data/my_index1"); $doc = new Zend_Search_Lucene_Document(); $doc->addField(Zend_Search_Lucene_Field::Text('type','auto')); $index->addDocument($doc); $term = new Zend_Search_Lucene_Index_Term('auto*'); $query = new Zend_Search_Lucene_Search_Query_Wildcard($term); $hits = $index->find($query); f...

Lucene - Zend_Search_Lucene - how to build an index for "tagged" content

Hello all, I have the following problem: I need to build a Lucene index for articles which are tagged. Here is the simplified data structure and Lucene proposal: article_id -> unindexed article_title -> UnStored article_content -> UnStored article_tags -> ????? (here is the problem) So an article can have multiple tags. Let's say we have an artic...

Zend_Search_Lucene failing to return documents

Hi, I am struggling with a bug/problem when using Zend_Search_Lucene. I have 2 indexes that I search. One holds parsed HTML pages/text: I use the Zend_Search_Lucene_Document_Html::loadHTML() function to read the contents and add them to one of the Lucene indexes. The other index I manually create a lu...

Zend_Search_Lucene View entire Cache

Is it possible to view the entire cache in a clearly laid-out view of what is indexed? ...

Zend_Search_Lucene and range search

I have a bunch of int key fields in my index and am trying to do a simple range search like this: `gender:1 AND height:[120 TO 180]` This should give me males in the height range 120 to 180. But for some reason I get this exception: `At least one range query boundary term must be non-empty term` How would I debug this? Is it just Zend_...

Indexing large DBs with Lucene/PHP

Afternoon chaps, Trying to index a 1.7 million row table with the Zend port of Lucene. On small tests of a few thousand rows it's worked perfectly, but as soon as I try and up the rows to a few tens of thousands, it times out. Obviously, I could increase the time PHP allows the script to run, but seeing as 360 seconds gets me ~10,000 row...

Using Solr and Zend's Lucene port together...

Afternoon chaps, After my adventures with Zend-Lucene-Search, and discovering it isn't all it's cracked up to be when indexing large datasets, I've turned to Solr (thanks to Bill Karwin for that :) ) I've got Solr indexing the db far far quicker now, taking just over 8 minutes to index a table of just over 1.7 million rows - which I'm v...