full-text-search

Retrieving the most common keywords from a tsvector column

I'm considering adding a tsvector column to an existing table that will hold possible search terms for multiple columns in that same table (e.g. the tsvector column will equal to_tsvector(header || ' ' || body || ' ' || footer)). Before I decide to do so, one of my requirements is that I am able to find the most popular keywords amongst...

Search filenames in MySQL database table restricted by filetype?

Hello, I have a MySQL database that I replicate from another server. The database contains a table with these columns: ID, FileName and FileSize. The table has more than 4'000'000 records. I want to search the FileName (varchar) column quickly, and I found that I can use the Sphinx search engine for this. The problem is that I want to...

What are the fastest/most popular search technologies?

What are the fastest search technologies apart from relational DB searches? I have a collection of text files from varied sources (Banks/Ledgers/Stock Markets). Each line in these text files is a record. Each line can further be parsed into some DB columns (Stock Name/Date of purchase/Owner/...). It is not necessary that each line has a...

FTS: Searching across multiple fields 'intelligently'

Hi, I have a SP using FTS (Full Text Search). I want searches across multiple fields, 'intelligently' ranking results based on the weights I assign. Consider a search on a view fetching data from tables: Book, Author and Genre. Now, I want the searcher to be able to do: "Ludlum Fiction", "Robert Ludlum Bourne", "Bourne Ludlum", etc....

How do you boost term relevance in Sql Server Full Text Search like you can in Lucene?

I'm doing a typical full text search using containstable using 'ISABOUT(term1,term2,term3)' and although it supports term weighting that's not what I need. I need the ability to boost the relevancy of terms contained in certain portions of text. For example, it is customary for metatags or page title to be weighted differently than bod...

Fine tuning this mysql search-function in PHP...

I have a classifieds website where users can search for items. The search should cover all fields named 'description' and 'headline'. I am currently using 'like' syntax (SELECT * FROM db WHERE description LIKE '%string%' OR headline LIKE '%string%'). The problem is, if I have records with headlines like BMW it is enough to just typ...

Python SQLite FTS3 alternatives?

Are there any good alternatives to SQLite + FTS3 for Python? I'm iterating over a series of text documents, and would like to categorize them according to some text queries. For example, I might want to know if a document mentions the words "rating" or "upgraded" within three words of "buy." The FTS3 syntax for this query is the followi...

Full Text Search: Noise words are being searched for

Hi, I have a database in SQL Server 2008 with Full Text Search indexes. I have defined the stopword 'al' in the stoplist. However, when I search for any phrase with the keyword 'al', the word 'al' is still used in ranking. This might be related to the fact that I am breaking up search terms and reconstructing them. I am then searching...

How to format keywords in SQL Server Full Text Search

I have a SQL function that accepts keywords and returns a full text search table. How do I format the keyword string when it contains multiple keywords? Do I need to split the string and insert "AND"? (I am passing the keywords to the method through LINQ to SQL.) Also, how do I best protect myself from SQL injection here? Are the de...

Lucene .NET 2.3.2 Security Exception - Medium trust Issues

I'm only partially able to get Lucene .NET to work on GoDaddy. It throws a security exception on this line: Hits hits = searcher.Search(query, filter); Here are the details of this exception: Description: The application attempted to perform an operation not allowed by the security policy. To grant this application the required perm...

Full Text Searching in Apple's Core Data Framework

I would like to implement a full text search in an iPhone application. I have data stored in an sqlite database that I access via the Core Data framework. Just using predicates and ORing a bunch of "contains[cd]" phrases for every search word and column does not work well at all. What have you done that seems to work well? ...

SQL Full Text search on HTML/XML data

I have a SQL full text catalog on a CMS database (SQL 2005). The database holds the CMS page content within an ntext column which is part of the full text catalog. As expected, the search takes into account the XML tags within the page content, so searching for "H1" returns all the pages with H1 tags. Is it possible to apply filters wi...

Sql-server Full Text CONTAINS + COLLATE to ignore accents issues

Hi all, I'm struggling a little here with using COLLATE to ignore accents whilst also using CONTAINS full text. I've reduced the columns I'm searching down to just one for the example here, and I'm hard-coding the actual parameter just to simplify this until I understand it. If I have SELECT Col1, Title COLLATE SQL_Latin1_General_...

Best Practices for implementing a Lucene Search in Java

Each document in my Lucene index is similar to a post on Stack Overflow, and I am trying to search through the index (which contains millions of documents). Each user should be able to search only through posts from the user's own company. I have no control over how the data is indexed and I only need to implement a simple search (tha...

Efficient query to lookup stuff in a word index

I have two tables defined like this: Page(id), Index(page_id, word). page_id in Index is a foreign key to Page, so that each Page is connected to a group of Index entries. The Index table is an index for the Page table so that you can do fast text searching. E.g.: SELECT page_id FROM Index where word = 'hello' Would select all page_id'...

Thinking Sphinx - Foreign key with different type - Association problem

Hello, I have two tables in MySQL: users and management. The users table has a numeric id, and the management table has a varchar foreign key which is the primary key of the other table. The types are not the same, and this seems to be the main problem when I build an index from the User model, and try to include one column from the ma...

Recommendation needed: Rails, Postgres and fuzzy full text search

I have a Rails app with a Postgres backend. I need to add full text search which would allow fuzzy searches based on Levenshtein distance or other similar metrics. Add the fact that the lexer/stemmer has to work with non-English words (it would be OK to just switch language-dependent features off when lexing, to not mess with the target l...

Which solrconfig.xml file is it?

I have just unzipped the SOLR zip file downloaded from their website, and the tutorial says I have to edit the solrconfig.xml file. There are several in different locations. Which one is it? And where should I put this root folder? Inside my web space? LICENSE.txt README.txt client contrib docs lib CHANGES.txt NOTICE.txt b...

My webhosting company say it's impossible to change this MySql configuration... help

I am using a FULLTEXT index on a text field in MySQL so that I can do FULLTEXT searches later on. The default minimum word length for FULLTEXT is 4 characters. This is changed in this line inside the my.cnf or my.ini file: [mysqld] ft_min_word_len=4 According to my webhosting company (one.com) they don't KNOW the value, AND it's IMP...

FULLTEXT searching won't suffice, need workaround, any ideas?

I currently cannot use FULLTEXT indexes to search whole words because my server won't allow fewer than 4 characters in a search string, so I need a workaround from you guys. I want to be able to enter 'bmw 330' and then get results from MySQL with whole matches like 'bmw 330'. Should I consider a third party search engine for this? ...