I have a project to create a high-traffic search engine similar to altavista.com. The project will use C# and .NET on Windows. I am looking for a good search engine database that can handle a very high load. I have taken a look at Lucene and SQL Server 2008. I have read that Lucene tends to become corrupted when the load is very high. So...
I am trying to figure out how I can accomplish the following, and none of the answers I have found so far seem to fit:
I have a fairly static and large set of resources that I need to have indexed and searchable. Solr seems to be a perfect fit for that. In addition, I need my users to be able to add resources from the main data...
Hello
I would like to store data retrieved hourly from RSS feeds in a database or in Lucene so that the text can be easily indexed for word counts.
I need to get the text from the title and description elements of RSS items.
Ideally, for each hourly retrieval from a given feed, I would add a row to a table in a dataset made up of the f...
Hi,
I want to know whether it is possible to use the same index file for an entity in two applications. Let me be more specific:
We have an online application with a frontend for the users and an application for the backend tasks (= administrator interface). Both are running on the same JBoss AS. Both applications are using the same databas...
Hi,
I have two Lucene indexes and I need to search both of them. How can I execute a search across multiple Lucene indexes? How can I sort the combined results?
Thanks,
Luiz Costa
...
Hello community! ;)
I'm building a web search application (a rich application) that is intended to search over some historical documents. Those documents have their own structure. I'm using Lucene 3.x to build the search engine, etc.
So far I have built my own Analyzer and a SimpleToken class to fit my needs. So what is the problem?
The ...
I'm working on a system that performs matching on large sets of records based on strings, numeric ranges, and date ranges. The string matches are mostly exact matches as far as I can tell, as opposed to the less exact full-text-search type of results that I understand Lucene is generally designed for. Numeric precision is important as the da...
I am building a faceted search with Lucene.NET, not using Solr. I want to get a list of navigation items within the current query. I just want to make sure I'm pointed in the right direction. I've got an idea in mind that will work, but I'm not sure if it's the right way to do this.
My plan at the moment is to create a hierarchy of all ava...
I have a Lucene index that I build and update using raw Lucene indexers. I was wondering if there is a way to force Solr to re-read the index without restarting the Solr instance. I've tried update?commit=true, but it doesn't seem to matter. The only way I can be sure Solr re-reads the index is by a total restart, which of course is...
I've got an ASP.NET site backed by a SQL Server database. I've been using Lucene.NET to index and search the database. I'm adding faceted search navigation to the results page (the facets are a hierarchical category tree). I asked yesterday to make sure I was using the right technique for faceting. All I've gotten so far is a suggestion t...
I am interested in the possibilities of faceted searching using Lucene and perhaps Bobo, but I have a few questions about just how practical it is for users who are searching only free text, rather than data that has been broken up into many fields, each of which could be the target of a facet and tallying.
...
Hola guys!
I could not find any info on the web or on Stack Overflow about how to get the first matching character subsequence from a Lucene Document.
At the moment I'm using this logic to retrieve results from Lucene:
Document doc=searcher.doc(hit.doc);
String text=doc.get("text");
if (text.length() > 80){
text=te...
I am seeing extremely slow Solr updates in my database. The database only has 900 documents. We use autocommit with the following settings, and once in a while an autocommit takes a long time and blocks updates:
<autoCommit>
<maxDocs>10000</maxDocs>
<maxTime>1000</maxTime>
</autoCommit>
What in the world can be happening for 74 sec...
I would like to use the Dismax query parser because it allows me to specify multiple default search fields (using the 'qf' parameter) as well as other nice features such as field boosting.
However, I want a query parser/scoring algorithm that takes the sum of all field scores, rather than just the max.
Is there a way to configure D...
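For context on the max-versus-sum behaviour described above, here is a sketch of how a disjunction-max score is usually combined. The symbols are illustrative: $s_f(d)$ stands for the score of document $d$ in field $f$, and $\text{tie}$ is the tie-breaker multiplier (exposed in Solr's DisMax parser as the `tie` parameter).

```latex
\text{score}_{\text{dismax}}(d)
  = \max_{f} s_f(d) \;+\; \text{tie} \cdot \sum_{f \neq f^{*}} s_f(d),
\qquad f^{*} = \arg\max_{f} s_f(d)
```

With $\text{tie} = 0$ only the best field counts. With $\text{tie} = 1$ the expression reduces to the plain sum $\sum_{f} s_f(d)$ over all matching fields.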
Hi,
I am trying to work out how to improve the scoring of Solr search results. My application needs to take the score from the Solr results and display a number of “stars” depending on how well each result matches the query: 5 stars means an exact or almost exact match, down to 0 stars meaning the result does not match the search very well, e.g. only one element hits. ...
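As a minimal sketch of the star mapping described above, one option is to scale each score against the best score in the same result set. `StarRating` and `toStars` are hypothetical names, not part of Solr or Lucene, and the approach assumes the caller already has the result set's top score (Solr reports it as `maxScore` when the score is requested).

```java
// Hypothetical helper for illustration only; not part of Solr or Lucene.
final class StarRating {
    private StarRating() {}

    /**
     * Maps a raw relevance score to 0..5 stars, relative to the best
     * score in the same result set (assumption: maxScore is supplied
     * by the caller).
     */
    static int toStars(float score, float maxScore) {
        // Guard against empty or non-positive inputs.
        if (maxScore <= 0f || score <= 0f) {
            return 0;
        }
        // Clamp to 1.0 so a score above maxScore still yields 5 stars.
        float ratio = Math.min(score / maxScore, 1f);
        return Math.round(ratio * 5f);
    }
}
```

Because the scale is relative, the top hit of every query gets 5 stars even when it is a weak match. Raw Lucene scores are not comparable across queries, so an absolute "quality" scale would need a different signal.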
How do I sort my results in a random order? My code looks something like this at the moment:
Dim searcher As IndexSearcher = New IndexSearcher(dir, True)
Dim collector As TopScoreDocCollector = TopScoreDocCollector.create(100, True)
searcher.Search(query, collector)
Dim hits() As ScoreDoc = collector.TopDocs.scoreDocs
For Each sDoc As ...
Let's say I have documents with two fields, A and B.
I'd like to use Solr's MoreLikeThis, but with a twist: I'm most interested in boosting documents whose A field is like my model document's B field.
I don't see a way to use the mlt.fl fields or mlt.qf boosts to achieve this effect in a single query. Am I missing some option?
Or wi...
{Zend_Search_Lucene_Exception} Index is under processing now
This is what I get in my error log, and it crashes 50% of my website.
What should I do to fix that, please?
Thanks
...
I would like to build an internal search engine (I have a very large collection of thousands of XML files) that is able to map queries to concepts. For example, if I search for "big cats", I would want highly ranked results to return documents with "large cats" as well. But I may also be interested in having it return "huge animals", a...
We are maintaining a Lucene index which contains around 20 million documents. The nature of the search queries is such that indexing and querying can be easily split between different indexes.
To achieve that, we need to keep many (potentially thousands) of IndexWriters or IndexReaders/Searchers in memory to deal with indexing and querying of eac...