I have a database table in SQL Server 2008 R2 that contains data stored in multiple languages, including English, Swedish, Hungarian and German.

The table uses the Latin1_General_CI_AS collation. The table is assigned to a full-text catalog, with a full-text index on the multi-language column.

I have two problems:

  • In the catalog properties, a language has to be specified for word breaks. This is currently set to English. How do I get it to use multiple languages for word breaks?
  • Hungarian is not even available in the list of languages that can be selected for word breaks. How do I configure the full text search to search Hungarian text?
+1  A: 

According to Microsoft's documentation for sys.fulltext_languages, Hungarian is not a supported language for Full-Text Search.

The full list of supported languages is at http://msdn.microsoft.com/en-us/library/ms176076.aspx
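You can also check what your own instance offers rather than relying on the published list. A minimal sketch, using only the sys.fulltext_languages catalog view mentioned above; the 'Hungarian' filter value is an assumption about how the name would appear if it were registered:

```sql
-- List every language registered for full-text search on this instance.
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;

-- Check for one specific language by name (the name value here is assumed).
SELECT lcid, name
FROM sys.fulltext_languages
WHERE name = N'Hungarian';
```

If the second query returns no rows, no word breaker for that language is registered on the instance.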

It also appears that you will have to choose one language or another, as this post explains:

http://blogs.msdn.com/b/sqlcat/archive/2008/11/06/best-practices-for-integrated-full-text-search-ifts-in-sql-2008.aspx

Handling multiple languages in a single document is a hard problem. Which word breaker do you use to shred the original document, and which language are you going to specify for the query? For example, if you have a document with Korean and English and you use the Korean word breaker to process the document, then if you search the document for English words it will only find the exact words and not any other forms of the words (like ing and s).
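If you are able to restructure the data, one possible workaround is to keep each language in its own column, since the word-breaker language is set per column in the full-text index. The following is only a sketch under assumptions: the table dbo.Articles, its columns, the key index PK_Articles and the catalog ArticleCatalog are all hypothetical names, not taken from your schema. LCID 0 is the neutral word breaker, which Microsoft's documentation suggests for text in unknown or unsupported languages; whether it gives acceptable results for Hungarian is something you would need to test.

```sql
-- Hypothetical schema: one column per language, so each column can be
-- indexed with its own word breaker.
CREATE FULLTEXT INDEX ON dbo.Articles
(
    BodyEn LANGUAGE 1033,  -- English
    BodySv LANGUAGE 1053,  -- Swedish
    BodyDe LANGUAGE 1031,  -- German
    BodyHu LANGUAGE 0      -- neutral word breaker (assumed fallback for Hungarian)
)
KEY INDEX PK_Articles      -- must be a unique, single-column, non-nullable index
ON ArticleCatalog;

-- Query with the same language the column was indexed with, so that
-- inflectional forms are matched by the matching word breaker and stemmer.
SELECT ArticleId
FROM dbo.Articles
WHERE CONTAINS(BodyEn, N'FORMSOF(INFLECTIONAL, search)', LANGUAGE 1033);
```

With the neutral word breaker you should expect only exact-word matches, with no inflectional forms, which is the same limitation described in the quoted passage.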

NinjaCat