views:

206

answers:

5

I'm developing an app in Rails on OS X using PostgreSQL 8.4. I need to setup the database for the app so that standard text queries are case-insensitive. For example:

SELECT * FROM documents WHERE title = 'incredible document'

should return the same result as:

SELECT * FROM documents WHERE title = 'Incredible Document'

Just to be clear, I don't want to use:

(1) LIKE in the where clause or any other type of special comparison operators

(2) citext for the column datatype or any other special column index

(3) any type of full-text software like Sphinx

What I do want is to set the database locale to support case-insensitive text comparison. I'm on Mac OS X (10.5 Leopard) and have already tried setting the Encoding to "LATIN1", with the Collation and Ctype both set to "en_US.ISO8859-1". No success so far.

Any help or suggestions are greatly appreciated.

Thanks!

Update

I have marked one of the answers given as the correct answer out of respect for the folks who responded. However, I've chosen to solve this issue differently than suggested. After further review of the application, there are only a few instances where I need case-insensitive comparison against a database field, so I'll be creating shadow database fields for the ones I need to compare case-insensitively. For example, name and name_lower. I believe I came across this solution on the web somewhere. Hopefully PostgreSQL will allow similar collation options to what SQL Server provides in the future (i.e. DOCI).

Special thanks to all who responded.

A: 

Your problem and your exclusives are like saying "I want to swim, but I don't want to have to move my arms.".

You will drown trying.

Erwin Smout
No, I come from a SQL Server background where case-insensitive text comparison is the norm.
Kevin Rood
SQL Server is broken, then.
alvherre
A: 

I don't think that is what local or encoding is used for. Encoding is more for picking a character set and not determining how to deal with characters. If there were a setting it would be in the config, but I haven't seen one.

If you do not want to use ilike for fear of not being able to port to another database then I would suggest you look into what ORM options might be available with ActiveRecord if you are using that.

here is something from one of the top postgres guys: http://archives.postgresql.org/pgsql-php/2003-05/msg00045.php

edit: fixed specific references to locale.

Arthur Thomas
and I was downvoted why? If I said something incorrectly I would love to know what it was.
Arthur Thomas
@Arthur while I didn't downvote your answer, I would point out that locale and character sets are related, but different.
Kevin Rood
ah, I confused myself when responding. I was thinking encoding in my head. thanks for pointing that out. Locale would definitely not be used for that. Encoding would only really 'work' if there happened to be a set characters were there were not uppercase or something hehe. If there -were- a configurable item like that it would be in postgresql's config file.
Arthur Thomas
I have been very surprised that PostgreSQL doesn't easily support this as a standard option. Evidently MySQL handles string equality comparisons in a case-insensitive way by default (based on my limited research). SQL Server allows you to pick a collation that is "dictionary ordered, case-insensitive" (DOCI), which I believe controls string equality comparisons. If so, that is why Microsoft SQL Server handles this: SELECT * FROM documents WHERE title = 'incredible document' the same as SELECT * FROM documents WHERE title = 'Incredible Document'.
Kevin Rood
well in 8.4 I believe postgresql has included Full Text Search which does full text searching. It handles searching through text in a more robust way than basic sql, but I have no used that yet. http://www.postgresql.org/docs/8.4/static/textsearch-intro.html It is built into postgres with that release so I wanted to mention it.
Arthur Thomas
+2  A: 

You will likely need to do something like use a column function to convert your text e.g. convert to uppercase - an example :

SELECT * FROM documents WHERE upper(title) = upper('incredible document')

Note that this may mess up performance that used index scanning, but if it becomes a problem you can define an index including column functions on target columns e.g.

CREATE INDEX I1 on documents (upper(title))
Steve De Caux
as the other guys have said, doing this by characterset or locale is not really a reasonable solution
Steve De Caux
Why would this not be reasonable?
Kevin Rood
@Kevin, a charset that did not distinguish between upper and lower case would logically store the characters as case-neutral glyphs. There is no charset for the Latin alphabet in common use that does this. The available locales are simply a shorthand for choosing a default character encoding ie charset.
Steve De Caux
A: 

With all the limitations you have set, possibly the only way to make it work is to define your own = operator for text. It is very likely that it will create other problems, such as creating broken indexes. Other than that, your best bet seems to be to use the citext datatype; that would still let the ORM stuff you're using generate the SQL.

(I am not mentioning the possibility of creating your own locale definition because I haven't ever heard of anyone doing it.)

alvherre
A: 

SELECT * FROM documents WHERE title ~* 'incredible document'

hope this will solves your problem.