full-text-indexing

Full-text Indexing for a view with multiple databases

Can MS SQL support full-text indexing for a view that connects (joins or unions) multiple databases? ...

What is the best way to freshen a Nutch index?

I haven't looked at Nutch for a year or so and it looks like it has changed significantly. The documentation on re-crawling isn't clear. What is the best way to update an existing Nutch index? ...

List which columns have a full-text index in SQL Server 2005

How do I list all tables / columns in my database that have a full-text index? ...

How to identify if a Lucene.Net Index exists in a folder

I am using Lucene.Net for indexing and searching documents, and I am using the following code to create or open an index if one exists: IndexWriter writer = new IndexWriter(@"C:\index", new StandardAnalyzer(), !IndexExists); private bool IndexExists { get { return ?? } } now how can ...

Lucene.Net Best Practices

What are the best practices for using Lucene.Net? Or where can I find a good Lucene.Net usage sample? ...

How to rebuild full-text index?

I have a requirement to rebuild an MSSQL full-text index. The problem is that I need to know exactly when the job is done. Just calling ALTER FULLTEXT CATALOG fooCatalog REBUILD WITH ACCENT_SENSITIVITY = OFF doesn't work, or I'm doing something slightly wrong. Any ideas? ...
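One way to picture the "know exactly when the job is done" requirement is a polling loop. The sketch below is an assumption, not a confirmed solution from this thread: it assumes the caller supplies a hypothetical `get_status` callable that runs something like `SELECT FULLTEXTCATALOGPROPERTY('fooCatalog', 'PopulateStatus')` and returns the integer result, where 0 means idle. Whether polling this property is enough to detect the end of a rebuild in every case is not established here.

```python
import time
from typing import Callable

IDLE = 0  # PopulateStatus value documented as "Idle"


def wait_until_idle(
    get_status: Callable[[], int],  # hypothetical: queries PopulateStatus for the catalog
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
) -> bool:
    """Poll get_status() until it reports IDLE; return False on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if get_status() == IDLE:
            return True
        time.sleep(poll_seconds)
    return False
```

The status reader is injected so that the loop can be exercised without a database connection; in real use it would wrap whatever driver call the application already has.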

Enterprise search engine development: asking for advice

Hello everyone, I have been asked to either deploy or develop an enterprise (intranet) search engine that can index all web pages on a couple of internal servers and provide a search portal to display all related content, like what Google does but for an intranet. Any advice on how to develop or deploy this quickly? I have heard of Microsoft FAST p...

Faster search in Lucene - Is there a way to keep the whole index in RAM?

Is there a way of keeping the index in RAM instead of keeping it on the hard disk? We want to make searching faster. ...

Why does full-text search in SQL Server work better with an English index for non-English content?

I have the following query: SELECT ct.Id,rankdegree FROM (SELECT c.[KEY] as ID ,c.[RANK] as rankdegree FROM CONTAINSTABLE( dbo.SearchItems,*,N'FORMSOF(INFLECTIONAL,хорошую)',1000) as c) as ct where rankdegree>0 The table SearchItems contains some Russian text. "хорошую" is a Russian adjective which means "good". So the problem is t...

Mobile phone (iPhone, Windows CE, Symbian, Android) search engines (text indexers)?

I'm looking for search engines that will run on one or more of the mobile platforms listed in the title. Something like Lucene (which 'should' run on Android) or minion. What are my alternatives on each platform? Have you made them run? What are the limitations you stumbled upon (cannot index more than 20 megs, for example)? ...

PostgreSQL: GIN max field size

I'm currently evaluating several full-text indexing solutions, and I'm playing with native Postgres full-text search. I'm trying to index my data using GIN indices, but there's a limitation on the field size: I encounter errors saying "huge tuple" while inserting data. As far as I understand, it's directly related to the field size. But this li...

SQL Free Text And Like

If I use like '%fish%', the following is returned: AQUARIAN GOLDFISH FLAKES. But if I use Contains([Description],' "fish*" ' ), it isn't. Is there something I can do? Basically, I want to return anything that has the word fish in it anywhere. ...
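The difference described above comes from the two predicates matching different things: LIKE '%fish%' looks for a substring anywhere in the text, whereas a prefix term such as "fish*" matches only words that begin with "fish". The following is a rough Python imitation of the two behaviours, not SQL Server's actual word breaker; the function names are made up for illustration and case-insensitive matching is assumed.

```python
import re


def like_contains(text: str, needle: str) -> bool:
    """Imitate LIKE '%needle%': substring match anywhere (case-insensitive here)."""
    return needle.lower() in text.lower()


def prefix_term_match(text: str, prefix: str) -> bool:
    """Imitate a prefix term like "fish*": some word must START with the prefix."""
    words = re.findall(r"\w+", text.lower())
    return any(word.startswith(prefix.lower()) for word in words)


description = "AQUARIAN GOLDFISH FLAKES"
print(like_contains(description, "fish"))             # True: "fish" occurs inside GOLDFISH
print(prefix_term_match(description, "fish"))         # False: no word begins with "fish"
print(prefix_term_match("FISH FOOD FLAKES", "fish"))  # True: the word FISH begins with "fish"
```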

Lucene.NET on shared hosting

I'm trying to get Lucene.NET to work on a shared hosting environment. Mascix over on codeproject outlines here how he got this to work on godaddy. I'm attempting this on isqsolutions. Both examples he posted run fine on my local machine and both throw the same error on the shared hosting server: Compiler Error Message: CS0246: T...

SQL Full Text search on HTML/XML data

I have a SQL full-text catalog on a CMS database (SQL 2005). The database holds the CMS page content in an ntext column which is part of the full-text catalog. As expected, searching takes into account the XML tags within the page content, so searching for "H1" returns all the pages with H1 tags. Is it possible to apply filters wi...

Word lists for a lot of articles - document-term matrix

I have nearly 150k articles in Turkish. I will use the articles for natural language processing research. After processing the articles, I want to store the words and their frequencies per article. I'm storing them in an RDBMS now. I have 3 tables: Articles -> article_id,text Words -> word_id, type, word Words-Article -> id, word_id, article_id, frequ...
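The three-table layout described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the last column of the link table is truncated in the excerpt and is assumed here to be a frequency count, the table name `words_article` and both helper functions are invented for the example, and whitespace tokenisation stands in for real Turkish processing.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (article_id INTEGER PRIMARY KEY, text TEXT);
CREATE TABLE words (word_id INTEGER PRIMARY KEY, type TEXT, word TEXT UNIQUE);
CREATE TABLE words_article (
    id INTEGER PRIMARY KEY,
    word_id INTEGER REFERENCES words(word_id),
    article_id INTEGER REFERENCES articles(article_id),
    frequency INTEGER  -- assumed name: the excerpt is cut off at "frequ..."
);
""")


def index_article(conn, article_id, text):
    """Store an article and one (word, article, frequency) row per distinct word."""
    conn.execute("INSERT INTO articles VALUES (?, ?)", (article_id, text))
    # Naive tokenisation; str.lower() is not locale-aware for Turkish dotted/dotless I.
    for word, freq in Counter(text.lower().split()).items():
        conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
        word_id = conn.execute(
            "SELECT word_id FROM words WHERE word = ?", (word,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO words_article (word_id, article_id, frequency) VALUES (?, ?, ?)",
            (word_id, article_id, freq),
        )


def term_frequencies(conn, article_id):
    """Return {word: frequency} for one article, i.e. one row of the document-term matrix."""
    rows = conn.execute(
        "SELECT w.word, wa.frequency FROM words_article wa "
        "JOIN words w ON w.word_id = wa.word_id WHERE wa.article_id = ?",
        (article_id,),
    )
    return dict(rows.fetchall())
```

Each call to `term_frequencies` yields one sparse row of the document-term matrix, so the full matrix never has to be materialised in the database.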

What is the best Java text indexing library for Google App Engine?

So far I know that Compass can handle this, but indexing with Compass looks pretty expensive. Are there any lighter alternatives? ...

Lemur gets malformed document error when trying to index file

I've been going through a bit of the lemur indexing tutorial here: http://www.lemurproject.org/tutorials/begin_indexing-1.php I've created a "corpus" folder, containing one document with the seemingly properly formatted file: <DOC> <DOCNO>1</DOCNO> <TEXT> Here is some text </TEXT> </DOC> and created the following configuration f...

When does full-text indexing in SQL Server give better performance? Everywhere or only in some situations?

I am writing a big application using the NHibernate ORM. Does using full-text indexing at the DB level have advantages for my application's performance? Does it give me better performance in searches? ...

What should the itemcount property in SQL Server match?

If the itemcount property does not exactly match the number of indexed rows, is that a problem? Is there a numerical way I can ensure that I have a complete full-text index? Update: the property fulltextcatalogproperty('database','itemcount') does not equal the row count for the indexed tables. It is off by a few thousand. Does that indic...

Full Text Index type column is empty

I am trying to create an index on a VarBinary(max) field in my SQL Server 2008 database. The steps I am taking are as follows: Table: dbo.Records Right click on table and select "Full Text Index" Then select "Define Index..." I choose the primary key which is the PK of my table (field name Id, type UniqueIdentifier). I then get the ...