How could you remove all characters that are not alphabetic from a string?
What about non-alphanumeric?
Does this have to be a custom function or are there also more generalizable solutions?
How could you remove all characters that are not alphabetic from a string?
What about non-alphanumeric?
Does this have to be a custom function or are there also more generalizable solutions?
I don't think there is any SQL based functions which can be easily used to do this.
But here is an article on writing the function you're looking for: SQL STrim Function
I knew that SQL was bad at string manipulation, but I didn't think it would be this difficult. Here's a simple function to strip out all the numbers from a string. There would be better ways to do this, but this is a start.
CREATE FUNCTION dbo.AlphaOnly (
@String varchar(100)
)
RETURNS varchar(100)
AS BEGIN
RETURN (
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
@String,
'9', ''),
'8', ''),
'7', ''),
'6', ''),
'5', ''),
'4', ''),
'3', ''),
'2', ''),
'1', ''),
'0', '')
)
END
GO
-- ==================
DECLARE @t TABLE (
ColID int,
ColString varchar(50)
)
INSERT INTO @t VALUES (1, 'abc1234567890')
SELECT ColID, ColString, dbo.AlphaOnly(ColString)
FROM @t
Output
ColID ColString
----- ------------- ---
1 abc1234567890 abc
Round 2 - Data-Driven Blacklist
-- ============================================
-- Create a table of blacklist characters
-- ============================================
IF EXISTS (SELECT * FROM sys.tables WHERE [object_id] = OBJECT_ID('dbo.CharacterBlacklist'))
DROP TABLE dbo.CharacterBlacklist
GO
CREATE TABLE dbo.CharacterBlacklist (
CharID int IDENTITY,
DisallowedCharacter nchar(1) NOT NULL
)
GO
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'0')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'1')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'2')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'3')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'4')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'5')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'6')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'7')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'8')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'9')
GO
-- ====================================
IF EXISTS (SELECT * FROM sys.objects WHERE [object_id] = OBJECT_ID('dbo.StripBlacklistCharacters'))
DROP FUNCTION dbo.StripBlacklistCharacters
GO
CREATE FUNCTION dbo.StripBlacklistCharacters (
@String nvarchar(100)
)
RETURNS varchar(100)
AS BEGIN
DECLARE @blacklistCt int
DECLARE @ct int
DECLARE @c nchar(1)
SELECT @blacklistCt = COUNT(*) FROM dbo.CharacterBlacklist
SET @ct = 0
WHILE @ct < @blacklistCt BEGIN
SET @ct = @ct + 1
SELECT @String = REPLACE(@String, DisallowedCharacter, N'')
FROM dbo.CharacterBlacklist
WHERE CharID = @ct
END
RETURN (@String)
END
GO
-- ====================================
DECLARE @s nvarchar(24)
SET @s = N'abc1234def5678ghi90jkl'
SELECT
@s AS OriginalString,
dbo.StripBlacklistCharacters(@s) AS ResultString
Output
OriginalString ResultString
------------------------ ------------
abc1234def5678ghi90jkl abcdefghijkl
My challenge to readers: Can you make this more efficient? What about using recursion?
Try this function:
Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
While PatIndex('%[^a-z]%', @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex('%[^a-z]%', @Temp), 1, '')
Return @TEmp
End
Call it like this:
Select dbo.RemoveNonAlphaCharacters('abc1234def5678ghi90jkl')
Once you understand the code, you should see that it is relatively simple to change it to remove other characters, too. You could even make this dynamic enough to pass in your search pattern.
Hope it helps.
Parameterized version of G Mastros' awesome answer:
CREATE FUNCTION [dbo].[fn_StripCharacters]
(
@String NVARCHAR(MAX),
@MatchExpression VARCHAR(255)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
SET @MatchExpression = '%['+@MatchExpression+']%'
WHILE PatIndex(@MatchExpression, @String) > 0
SET @String = Stuff(@String, PatIndex(@MatchExpression, @String), 1, '')
RETURN @String
END
Alphabetic only:
SELECT dbo.fn_StripCharacters('a1!s2@d3#f4$', '^a-z')
Numeric only:
SELECT dbo.fn_StripCharacters('a1!s2@d3#f4$', '^0-9')
Alphanumeric only:
SELECT dbo.fn_StripCharacters('a1!s2@d3#f4$', '^a-z0-9')
Non-alphanumeric:
SELECT dbo.fn_StripCharacters('a1!s2@d3#f4$', 'a-z0-9')
This is an excellent function, but I think that if wanted to strip all non-letters and non-numbers, you would want '^a-z0-9' (versus '^a-z^0-9', which would leave ^ in the string).