full-text-search

What are the fastest full-text search algorithms/APIs (open source or commercial)?

Are there any silver bullets out there for searching medium sized amounts of text data (hundreds of gigabytes)? Don't really care if it's commercial or open source. I should add that I need it to be C++ or C based. ...

Why can't I create this sql server full text index?

I have a database table with the primary column defined as: ID bigint identity primary key I also have a text column MiddlePart. I'm trying to create a full text index, like so: CREATE FULLTEXT INDEX ON domaining.dbo.DomainName ( MiddlePart Language 0X0 ) KEY INDEX ID ON domaincatalog WITH CHANGE_TRACKING AUTO I get this e...

Which index does each match belong to?

I have set up Sphinx to index three tables in a MySQL database, each to its own index. The problem I'm having is that it doesn't return which index each match belongs to, so unless I'm searching an individual index, the results are fairly useless. The search app included with Sphinx displays the index along with the matches, is there a...

SQL Server Full text index size growing

I have a big table with around 10 million records. The table is part of a full-text index, and the data it contains is recreated daily. After recreating the data, I rebuild the full-text index using SQL: ALTER FULLTEXT INDEX ON [table1] START FULL POPULATION; The issue is that the folder containing full text index files size is gro...

SQL Server 2008 Full text search on a table with a composite primary key

Hi everyone! I am trying to get full-text search working on SQL Server 2008; however, the table I am trying to index has a composite primary key, something like this: EXEC sp_fulltext_catalog 'My_Catalog', 'create' EXEC sp_fulltext_table 'Message', 'create', 'My_Catalog', 'PK__MESSAGES__C87C0C9C0EC32C7A' // PK__MESSAGES__C87...

Do you know of any search algorithms for searching through source code?

Hello everybody. My question regards the existence of a search algorithm for searching source code. In my project, I will have to implement an application that will search through a repository of source code (through a lot of source code files). All the files are from previous projects done within the company. We think that implementing...

Database design for Wave-like collaboration system

How do we decide what the smallest unit is? For text collaboration, should it be a word or a paragraph? Are there going to be performance issues if the unit is too small? But it might be more flexible ...

TSQL: query with optional join

Hi, I have a search UI with three search criteria, all of them optional. Two of them are simple criteria for a WHERE clause, which I should be able to solve with this: http://stackoverflow.com/questions/697671/stored-procedure-with-optional-where-parameters. The last criterion uses full-text search, where I join the result from ContainsTable. Is ...

MySQL queries and text search

I have this query: select name, body from news where body like '%MyWord%'; I use the MySQL database engine. This query returns name and body when MyWord is found in the body text. My problem is searching for two or more words in the body text, such as MyWord1 and MyWord2. How can I do that if you know that this query is calling by...

SQL Server FTS: Ranking is a bit strange

I am using SQL Server 2008's Full Text Search engine on my website. I have a search SP, which shows results sorted by ranking. I break up the search string and pass it to the FTS query engine like so (the search string is 'test search'): ("*test*" ~ "*search*") OR ("*test*" OR "*search*"). If the results row has the row 'test search...

Tips on how to improve full text search for search engine

I'm developing: http://www.buscatiendas.com.mx I've seen people entering queries with lots of typos. What kind of search could I implement so that similar words are found? Something more or less like what Google does would be neat. I'm using SQL Server Full Text search. ...

Factors governing the searching software / algorithm

Hello, which factors govern text searching technology? 1. Size of columns 2. Number of records 3. Record navigation, i.e. moving from one record to another to include results 4. Anything else? I believe it's only the number of records. ...

How to write a php search script in which words with diacritics match search terms without diacritics, and the results are underlined?

Hi all! I've got this site where there are lots of texts with diacritics in them (ancillary glyphs added to letters, according to wikipedia) and most people search these texts using words without the glyphs. Now it shouldn't be challenging to do this by having a copy of the texts without diacritics. However, I want to highlight the matc...
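
The approach described above (search a diacritic-free copy, then highlight in the original) can be sketched roughly as follows. This is a minimal illustration in Python rather than PHP, assuming Unicode text and `<u>` tags for the underline; the function names `strip_diacritics` and `highlight` are hypothetical and not taken from the question.

```python
import unicodedata


def strip_diacritics(text):
    """Return (plain, index_map): a lowercase, accent-free copy of `text`
    and, for each character of the copy, its position in `text`."""
    plain_chars = []
    index_map = []
    for pos, ch in enumerate(text):
        # NFD splits e.g. "é" into "e" plus a combining accent, which is dropped.
        for part in unicodedata.normalize("NFD", ch.lower()):
            if not unicodedata.combining(part):
                plain_chars.append(part)
                index_map.append(pos)
    return "".join(plain_chars), index_map


def highlight(text, term, open_tag="<u>", close_tag="</u>"):
    """Wrap every accent-insensitive match of `term` in the ORIGINAL text."""
    plain, index_map = strip_diacritics(text)
    needle, _ = strip_diacritics(term)
    if not needle:
        return text
    out = []
    cursor = 0  # position in the original text already copied to `out`
    start = plain.find(needle)
    while start != -1:
        first = index_map[start]
        last = index_map[start + len(needle) - 1] + 1
        # Keep combining marks that trail the last matched base character.
        while last < len(text) and unicodedata.combining(text[last]):
            last += 1
        if first >= cursor:
            out.append(text[cursor:first])
            out.append(open_tag + text[first:last] + close_tag)
            cursor = last
        start = plain.find(needle, start + len(needle))
    out.append(text[cursor:])
    return "".join(out)
```

For example, `highlight("Crème brûlée", "brulee")` wraps the accented word `brûlée` in the original string, because the position map translates match offsets in the stripped copy back to offsets in the source text.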

SQL Server compatibility: full-text stoplists and noise words

If I am running SQL Server 2008 in compatibility level 90 (SQL 2005), does it use the stoplist in the resource database or the ftdata\ENU.txt files? Also, if I make my own stoplist in 2008 (using compatibility 100), can I ignore the system stoplist and use my own with my full text queries or will it use the system and my custom s...

Improve my Search engine

Hello all! I have tried to implement a simple search engine for my application. This is what I have done: CREATE FULLTEXT INDEX item_name_other_name_desc_index ON item (name,other_name,description) public static function Search($string) { $delims = ',.; '; $word = strtok($string,$delims); while($word) ...

E-commerce Search Engine

Our current search engine for our shop sucks. It's basically a simple mysql fulltext search. It's not always relevant, ranking is not very useful, and no filtering, auto-correct, etc ... Now I know we could spend more time and money and make it good, but I feel I'd be re-inventing the wheel. I was looking at Google Commerce Search (htt...

MySQL: how to make multiple table fulltext search

Hello, I want to integrate the MySQL fulltext search function into my PHP site. I have the following problem now. SELECT * FROM testtable t1, testtable2 t2 WHERE MATCH ( t1.firstName, t1.lastName, t1.details, t2.firstName, t2.lastName, t2.details ) AGAINST ( 'founder' ); And I get the error code: #1210 - Incorrect arguments to MATCH...

Why is my query so slow? (SQL Server 2008 full text search weirdness)

I have a table with a full-text indexed column MiddlePart. The table has around 600,000 rows. The following query is very fast (30 results, <1 second): select * from DomainName where contains (MiddlePart, '"antiques*"') OR freetext(MiddlePart, 'antiques') This query is also very fast (5 results, <1 second): select * from DomainNa...

Search engine solution for Django that actually works?

The story so far: Decided to go with Xapian as search backend because it has all search-engine features I was looking for, knows about Unicode, stemming, has few dependencies and requires no bloated app-server installation on top of it. Tried Django and Haystack (plus xapian-haystack, the backend glue code to tie Haystack to Xapian) be...

What's wrong with my fulltext search query?

I'm having some trouble with the fulltext CONTAINS operator. Here's a quick script to show what I'm doing. Note that the WAITFOR line simply gives the fulltext index a moment to finish filling up. create table test1 ( id int constraint pk primary key, string nvarchar(100) not null ); insert into test1 values (1, 'dog') insert into test1 v...