views: 1027 · answers: 6
I have a very high-traffic table with a char(50) field which takes part in several indexes. This char(50) field allows NULLS, and in that case a NULL value is considered to be the same as a non-NULL, zero-length string for my purposes.

I also disregard leading & trailing whitespace and, while I scrub the data before I insert it, it may also be inserted by means beyond my control.

I have a sproc that is used to copy data from one table to the main table, and it needs to be high-performance. I need to delete duplicate records before inserting the new data and I am using the method discussed in this thread to perform the deletes.

My delete statement looks like this (simplified):

delete t
from masterTable t
    join incomingDataTable inc on
    (
     LTRIM(RTRIM(COALESCE(inc.TextField,''))) = 
             LTRIM(RTRIM(COALESCE(t.TextField,'')))
    )
where LTRIM(RTRIM(COALESCE(t.TextField,''))) <> ''

I have read that constructs like LTRIM(RTRIM(...)) are bad. Can my delete statement be improved, and if so, how?

EDIT: Just to clarify, TextField does take part in indexes on both tables.

EDIT2: TextField is defined as char(50) in both tables. It is not of type TEXT.

+2  A: 

I think in SQL Server a padded string compares equal to a non-padded string (trailing spaces are ignored in comparisons), which would save you the RTRIM at least. I'm not 100% sure on that, though...

However, tidying data is all part of ETL and should be done before the data gets to where it is going. On large datasets you may find it quicker to create a temporary copy of the data, clean it, index it, and then do the required matching.
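A sketch of that staging approach, under the assumption that the temp-table and column names here are illustrative (only TextField appears in the question):

    -- Copy the incoming data into a temp table, cleaning the text once,
    -- then index the cleaned column so the subsequent join can seek on it.
    SELECT LTRIM(RTRIM(COALESCE(TextField, ''))) AS TextFieldClean
           -- , other columns the insert needs would be listed here
    INTO   #incomingStage
    FROM   incomingDataTable;

    CREATE INDEX IX_incomingStage_TextFieldClean
        ON #incomingStage (TextFieldClean);

The trimming cost is then paid once per batch instead of once per join comparison.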

ck
+4  A: 

It is bad because your JOIN will have to scan the whole index: wrapping the column in functions makes the condition non-sargable, so no index seek is possible.

Also, are you sure it is a TEXT datatype? Last I checked, you could not use LTRIM or RTRIM against a TEXT column.

In response to the char-versus-varchar comment, run this:

declare @v varchar(50), @v2 char(50)
select @v = 'a', @v2 = 'a'

select datalength(@v), datalength(@v2)
SQLMenace
Any recommendations on how to improve it?
John Dibling
Regarding your question, it is specifically char(50). Sorry I was imprecise, and I will edit my OP.
John Dibling
If you have leading spaces then you probably can't improve it. You should have a constraint on the table that disallows leading spaces; that way you don't have to worry about this later.
SQLMenace
You should have used varchar(50), not char(50), unless your values are always exactly 50 chars — but since you are doing RTRIM, I doubt it.
SQLMenace
Why varchar vs char? Does it take less space, or some other reason?
John Dibling
I would say clean the data first, add the constraint and then you can do a regular join
SQLMenace
Run the additional code I posted and you will see that char uses all 50 bytes.
SQLMenace
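A sketch of the constraint SQLMenace describes (the constraint name is assumed). Note that for a char(50) column trailing padding is ignored in comparisons, so in practice this enforces "no leading spaces":

    ALTER TABLE masterTable WITH CHECK
    ADD CONSTRAINT CK_masterTable_TextField_NoLeadingSpace
    CHECK (TextField IS NULL OR TextField NOT LIKE ' %');

WITH CHECK makes SQL Server validate existing rows too, so the data would need to be cleaned before the constraint can be added.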
+2  A: 

I believe that SQLMenace is correct.

How about adding an INSERT/UPDATE trigger to the table to guarantee that there is no whitespace on that column?

If the column is VARCHAR, SQL Server will automatically ignore trailing whitespace in comparisons. Leading whitespace still counts, though.

Actually, wouldn't SQL Server automatically pad both columns to CHAR(50) before doing the JOIN? (Implicit conversions.)
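A sketch of such a trigger, assuming masterTable has a key column named ID (the question doesn't show one):

    CREATE TRIGGER trg_masterTable_TrimTextField
    ON masterTable
    AFTER INSERT, UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Re-trim only rows that actually arrived with leading whitespace;
        -- trailing spaces on a char(50) are padding and compare equal anyway.
        UPDATE t
        SET    TextField = LTRIM(TextField)
        FROM   masterTable t
        JOIN   inserted i ON i.ID = t.ID   -- ID is a hypothetical key column
        WHERE  t.TextField LIKE ' %';
    END;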

beach
+6  A: 

You need to:

  1. Create a computed column on masterTable using expression LTRIM(RTRIM(COALESCE(TextField,'')))
  2. Build an index on this column and
  3. Use this column in a join.

The way your table is designed now, it's all but impossible to make this query index-friendly.

If you cannot change your table structure but can estimate the number of LEADING spaces, you may use an approach described here.

This solution, however, is not nearly as efficient as creating an index on a computed column.
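A sketch of steps 1–3 (the column and index names are assumed). LTRIM, RTRIM, and COALESCE are deterministic, so the computed column is indexable; PERSISTED additionally stores the value in the row:

    ALTER TABLE masterTable
    ADD TextFieldClean AS LTRIM(RTRIM(COALESCE(TextField, ''))) PERSISTED;

    CREATE INDEX IX_masterTable_TextFieldClean
        ON masterTable (TextFieldClean);

    -- The delete can then seek on the indexed computed column; the
    -- functions remain only on the (presumably smaller) incoming side:
    DELETE t
    FROM   masterTable t
    JOIN   incomingDataTable inc
           ON t.TextFieldClean = LTRIM(RTRIM(COALESCE(inc.TextField, '')))
    WHERE  t.TextFieldClean <> '';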

Quassnoi
or clean the data and add a constraint which will disallow trailing and leading spaces
SQLMenace
Yes, though it may be a pain on an running database.
Quassnoi
+3  A: 

I would recommend changing that datatype to VARCHAR(50). Up to about 10 characters, CHAR(x) can make sense since it's a tad faster and has less overhead, but at 50 characters, unless all values actually use the full 50, it is a major overhead, especially since this column is also used in indexes.

Changing it to VARCHAR(50) could significantly reduce the space needed for the table (depending on your amount of data and how much of the 50 chars is really used), and all the indexes involved would also get a lot smaller — plus you wouldn't need this COALESCE/LTRIM/RTRIM stuff anymore :-)
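A sketch of that conversion (it assumes the indexes containing TextField are dropped beforehand and recreated afterwards, since ALTER COLUMN will refuse to run while they exist):

    -- 1. Drop indexes that include TextField (not shown).
    -- 2. Change the type; existing values keep their CHAR padding...
    ALTER TABLE masterTable
    ALTER COLUMN TextField varchar(50) NULL;

    -- 3. ...so strip the now-stored trailing spaces once:
    UPDATE masterTable
    SET    TextField = RTRIM(TextField)
    WHERE  TextField LIKE '% ';

    -- 4. Recreate the indexes (not shown).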

Marc

marc_s
+2  A: 

If you have to trim the data every time you use it, this should NOT be a char datatype but a varchar datatype. Any time you have to apply a function every time you query a field, something is wrong with your database design.

You might find this discussion helpful: http://stackoverflow.com/questions/758699/is-the-char-datatype-in-sql-obsolete-when-do-you-use-it/760511#760511

HLGEM
You'll get no argument from me that there is something wrong with the db's design!
John Dibling