views:

40

answers:

4

I have a varchar column in one of my tables with data like:

1234abc
1234abcde456757
1234abc Supervisor
1234abc456 Administrator

I want to "clean it" by removing any letters and numbers immediately following them so for the above examples I want to have:

1234
1234
1234 Supervisor
1234 Administrator

In another word, I want to keep the initial number and the last word. I'm using the SUBSTRING and CHARINDEX but those functions remove everything till the end of the string and I don't know the length of the part I need to remove.

Any suggestions?

Thanks

A: 

You actually want two strings, the characters at indices 0-3 and those from the position after the space till the end of the string. I (think) this will work (have not tried it):

UPDATE TableName SET ColumnName = SUBSTRING(ColumnName,1,4) + 
    SUBSTRING(ColumnName,CHARINDEX(' ',ColumnName)+1,LEN(ColumnName))
Matthew Jones
this does not work, CHARINDEX's first param is the string to find not he string to search, you have `CHARINDEX(ColumnName,' ')`, see [CHARINDEX (Transact-SQL)](http://msdn.microsoft.com/en-us/library/ms186323.aspx), not to mention you find the first space not the last
KM
@KM fixed CHARINDEX. I thought based on OP's question it was reasonable to assume that there was only one space in the data.
Matthew Jones
+2  A: 

try this:

DECLARE @YourTable table (RowValue varchar(50))
INSERT @YourTable VALUES ('1234abc')
INSERT @YourTable VALUES ('1234abcde456757')
INSERT @YourTable VALUES ('1234abc Supervisor')
INSERT @YourTable VALUES ('1234abc456 Administrator')

UPDATE @YourTable
    SET RowValue=LEFT(RowValue,4)+RIGHT(RowValue,CHARINDEX(' ',REVERSE(RowValue)))
    FROM @YourTable

SELECT * FROM @YourTable

OUTPUT:

RowValue
--------------------------------------------------
1234
1234
1234 Supervisor
1234 Administrator

(4 row(s) affected)

EDIT: set based any number of digits and handles no digits or no words

DECLARE @YourTable table (RowValue varchar(50))
set nocount on
INSERT @YourTable VALUES ('13')
INSERT @YourTable VALUES ('1234abc')
INSERT @YourTable VALUES ('1234abc')
INSERT @YourTable VALUES ('1234abcde456757')
INSERT @YourTable VALUES ('1234abc Supervisor')
INSERT @YourTable VALUES ('1234abc456 Administrator')
INSERT @YourTable VALUES ('1234567abc456 Administrator')
INSERT @YourTable VALUES ('Administrator')
INSERT @YourTable VALUES ('abcde Administrator')

set nocount off

;WITH Digits AS
(SELECT 0 AS Digit UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9

)
,Numbers AS
(SELECT 1 AS Number
 UNION ALL
 SELECT Number+1 FROM Numbers where Number<1000
)
,FindDigits AS
(
SELECT
    y.RowValue,n.Number,SUBSTRING(y.RowValue,n.Number,1) AS CharOf,CASE WHEN SUBSTRING(y.RowValue,n.Number,1) LIKE '[0-9]' THEN 'N' ELSE 'A' END AS TypeOf
    FROM @YourTable         y
        INNER JOIN Numbers  n ON 1=1
    WHERE n.Number<=LEN(y.RowValue)
)
,LenOf AS
(
SELECT 
    RowValue,MIN(Number)-1 AS Digits
    FROM FindDigits
    WHERE TypeOf='A'
    GROUP BY RowValue
    HAVING MIN(Number)-1>0
UNION
SELECT 
    f.RowValue,LEN(f.RowValue)
    FROM FindDigits f
    WHERE NOT EXISTS (SELECT 1 FROM FindDigits f2 WHERE f.RowValue=f2.RowValue AND TypeOf='A')
)
UPDATE y
    SET RowValue=CASE WHEN l.Digits IS NOT NULL THEN LEFT(y.RowValue,l.Digits)+RIGHT(y.RowValue,CHARINDEX(' ',REVERSE(y.RowValue)))
                      WHEN CHARINDEX(' ',REVERSE(y.RowValue))=0 THEN y.RowValue
                      ELSE RIGHT(y.RowValue,CHARINDEX(' ',REVERSE(y.RowValue))-1) END
    FROM @YourTable       y
        LEFT JOIN LenOf   l ON y.RowValue=l.RowValue
  OPTION (MAXRECURSION 1000)

SELECT * FROM @YourTable

OUTPUT:

RowValue
--------------------------------------------------
13
1234
1234
1234
1234 Supervisor
1234 Administrator
1234567 Administrator
Administrator
Administrator

(9 row(s) affected)
KM
This assumes the number of initial digits is four, not sure if that's the case!
Andomar
@KM - Thank you for the update I'll give it a try.
del.ave
+7  A: 

You could search for the first non-digit and the first space in a subquery. That also works if the number of digits isn't exactly four:

declare @t table (col1 varchar(50))
insert into @t select '12abc'
union all select '1234abcde456757'
union all select '1234abc Supervisor'
union all select '1234abc456 Administrator'
union all select '123456abc456 Administrator'

select  case when FirstNonDigit = 0 then col1
             when FirstSpace = 0 then substring(col1, 1, FirstNonDigit-1)
             else substring(col1, 1, FirstNonDigit-1) + 
                  substring(col1, FirstSpace, len(col1) - FirstSpace + 1)
             end
from    (
        select  patindex('%[^0-9]%', col1) FirstNonDigit
        ,       patindex('% %', col1) FirstSpace
        ,       col1
        from    @t
        ) subqueryalias

-->

12
1234
1234 Supervisor
1234 Administrator
123456 Administrator
Andomar
Very nicely done :)
K Richard
Thanks. Also works when the value is just a string like "Administrator" with no numbers or spaces. I did not know that the table had such values, but aparently it did.
del.ave
doesn't work for values like `Administrator` (you get a leading space) or `abc456 Administrator` (get a totally blank value)
KM
A: 

The code below uses a "tally table" of values to find the first non-numeric character and the last space. KM's solution using PATINDEX is probably more elegant!

DECLARE @t TABLE 
(
   c VARCHAR(MAX)
);

INSERT INTO @t VALUES('1234abc');
INSERT INTO @t VALUES('1234abcde456757');
INSERT INTO @t VALUES('1234abc Supervisor');
INSERT INTO @t VALUES('1234abc456 Administrator');

WITH Tally AS
(
   SELECT ROW_NUMBER() OVER (ORDER BY s1.[id]) AS i
   FROM sys.sysobjects s1 CROSS JOIN sys.sysobjects s2 CROSS JOIN sys.sysobjects s3
), 
NumPart AS
(
   SELECT c, MIN(i) AS firstNonNumber
   FROM @t CROSS JOIN Tally
   WHERE i <= LEN(c)
   AND SUBSTRING(c, i, 1) < '0' OR SUBSTRING(c, i, 1) > '9'
   GROUP BY c 
),
SpacePart AS
(
   SELECT c, MAX(i) AS spacePos
   FROM @t t CROSS JOIN Tally
   WHERE i <=  LEN(c)
   AND SUBSTRING(c, i, 1) = ' '
   GROUP BY c
)
UPDATE t
SET t.c = LEFT(n.c, n.firstNonNumber - 1) + 
   CASE WHEN ISNULL(s.SpacePos, 0) > 0 THEN 
      RIGHT(n.c, LEN(n.c) - s.SpacePos + 1) 
   ELSE 
      ''
   END
FROM @t t
INNER JOIN NumPart n ON t.c = n.c
LEFT JOIN SpacePart s ON n.c = s.c;

SELECT * FROM @t;
Ken Keenan