views:

23

answers:

2

i have a table [Company] with a column [Address3] defined as varchar(50) i can not control the values entered into that table - but i need to extract the values without leading and trailing spaces. i perform the following query:

SELECT DISTINCT RTRIM(LTRIM([Address3])) Address3 FROM [Company] ORDER BY Address3

the column contain both rtl and ltr values most of the data retrieved is retrieved correctly - but SOME (not all) RTL values are returned with leading and or trailing spaces

i attempted to perform the following query:

SELECT DISTINCT ltrim(rTRIM(ltrim(rTRIM([Address3])))) c, ltrim(rTRIM([Address3])) b, [Address3] a, rtrim(LTRIM([Address3])) Address3 FROM [Company] ORDER BY Address3

but it returned the same problem on all columns - anyone has any idea what could cause it?

+1  A: 

The rows that return with extraneous spaces might have a kind of space or invisible character the trim functions don't know about. The documentation doesn't even mention what is considered "a blank" (pretty damn sloppy if you ask me). Try taking one of those rows and looking at the characters one by one to see what character they are.

Matti Virkkunen
you are right - i converted the text to hex using:SELECT DISTINCT master.dbo.fn_varbintohexstr(cast(ltrim(rtrim([Address3])) as varbinary)) e, master.dbo.fn_varbintohexstr(cast([Address3] as varbinary)) dat the start of my query - and found out that i have 0x0d0a inside - thanks
Lee Elenbaas
A: 

since you are using varchar, just do this to get the ascii code of all the bad characters

--identify the bad character
SELECT 
    COUNT(*) AS CountOf
        ,'>'+RIGHT(LTRIM(RTRIM(Address3)),1)+'<' AS LastChar_Display
        ,ASCII(RIGHT(LTRIM(RTRIM(Address3)),1))  AS LastChar_ASCII
    FROM Company 
    GROUP BY RIGHT(LTRIM(RTRIM(Address3)),1)
    ORDER BY 3 ASC

do a one time fix to data to remove the bogus character, where xxxx is the ASCII value identified in the previous select:

--only one bad character found in previous query
UPDATE Company
    SET Address3=REPLACE(Address3,CHAR(xxxx),'')

--multiple different bad characters found by previous query
UPDATE Company
    SET Address3=REPLACE(REPLACE(Address3,CHAR(xxxx1),''),char(xxxx2),'')

if you have bogus chars in your data remove them from the data and not each time you select the data. you WILL have to add this REPLACE logic to all INSERTS and UPDATES on this column, to keep any new data from having the bogus characters.

If you can't alter the data, you can just select it this way:

SELECT
    LTRIM(RTRIM(REPLACE(Address3,CHAR(xxxx),'')))
    ,LTRIM(RTRIM(REPLACE(REPLACE(Address3,CHAR(xxxx1),''),char(xxxx2),'')))
...
KM
i did used the replace in the solution - and replaced the problematic characters before apply the ltrim(rtrim comboi could not fix the data since i have no control over how data is being entered to the table - so the same characters will continue to pop in every once in a while even if i will fix the data - instead i am better off fixing it in my selects on the fly
Lee Elenbaas
you are never `better off fixing it in my selects on the fly`, I would use a one time UPDATE and a trigger to keep the data pure before I'd fix it every time a select runs. how many times do you want to fix the same data over and over again, wasting resources every time, forever?
KM
The problem is that the data isn't mine - it belongs to a separate application maintained by that separate application and i can not FIX it since it is possible that for them such characters are legitimatethe only way i can FIX it correctly is to add a calculated column or add a separate table and keep it up to date with any insert/update/delete done on the original tableboth option i do not like
Lee Elenbaas