I have a string %c3%ad which decoded as UTF-8 is í, but decoded one byte at a time (as a single-byte code page rather than UTF-8) starts with Ã.

I need to decode it using the UTF-8 encoding, how can I do that?

Here is a SELECT of the encoded values. Each column alias shows the character the value should decode to, plus the second byte in hex and decimal:

SELECT
('%c3%81') as 'Á (81 = 129)',
('%c3%89') as 'É (89 = 137)',
('%c3%8d') as 'Í (8d = 141)',
('%c3%93') as 'Ó (93 = 147)',
('%c3%9a') as 'Ú (9a = 154)'


SELECT
('%c3%a1') as 'á (a1 = 161)',
('%c3%a9') as 'é (a9 = 169)',
('%c3%ad') as 'í (ad = 173)',
('%c3%b3') as 'ó (b3 = 179)',
('%c3%ba') as 'ú (ba = 186)'
A: 

Simply trying

print cast(0xc3ad as nvarchar(max)) 

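Returns a single unrelated character instead of í, because nvarchar is UTF-16 and the two bytes are read as one code unit rather than as a UTF-8 sequence.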
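
A quick way to see what the plain cast does, as a sketch: `UNICODE()` reports the code point of the first character, so comparing the cast with the character we actually want shows the mismatch. The `nvarchar(10)` length and the column aliases are only illustrative.

```sql
-- nvarchar is UTF-16 little-endian, so bytes C3 AD become the single code unit 0xADC3.
SELECT UNICODE(CAST(0xc3ad AS nvarchar(10))) AS cast_result;  -- 44483 (0xADC3)

-- The character the UTF-8 bytes are meant to encode.
SELECT UNICODE(N'í') AS wanted;                               -- 237 (0x00ED)
```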

I suspect that it might be possible to use some XML method for this. Here's my (probably hopelessly naive) attempt that nonetheless returns the right answer.

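-- Assumes the whole string is percent-encoded, so stripping % leaves pure hex.
declare @stringToDecode varchar(max) = '%c3%ad'

-- Style 1 parses the '0x...' text as hex digits instead of converting characters.
declare @binaryVersion varbinary(max) = CONVERT(varbinary(max), '0x' + 
                                             REPLACE(@stringToDecode,'%',''), 1)

-- Wrap the raw bytes in an XML document that declares itself as UTF-8.
declare @x xml  = 
     cast('<?xml version="1.0" encoding="UTF-8"?><test>' as varbinary(max))
                                     + @binaryVersion
                                     + cast('</test>' as varbinary(max))

-- Reading the element back as NVARCHAR lets the XML parser do the UTF-8 decoding.
declare @result   NVARCHAR(max)  = @x.value('/test[1]/.', 'NVARCHAR(max)') 

print @result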

Returns í
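
The same idea can be written set-based to check several values from the question at once. This is only a sketch: it assumes every input is entirely percent-encoded (the REPLACE strips every %) and that the decoded text contains no XML markup characters such as < or &. The names `v`, `encoded`, and `decoded` are made up for illustration, and the path is written as `(/test)[1]`, which is the form I'm more sure of when calling `.value()` inline.

```sql
SELECT v.encoded,
       CAST(
             CAST('<?xml version="1.0" encoding="UTF-8"?><test>' AS varbinary(max))
           + CONVERT(varbinary(max), '0x' + REPLACE(v.encoded, '%', ''), 1)  -- hex text to bytes
           + CAST('</test>' AS varbinary(max))
         AS xml).value('(/test)[1]', 'nvarchar(max)') AS decoded
FROM (VALUES ('%c3%81'), ('%c3%a1'), ('%c3%ad')) AS v(encoded);
```

If the approach holds up, the three rows should come back as Á, á and í.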

Martin Smith
+1  A: 

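This function seems to do the job. Note that it only handles two-byte sequences whose lead byte is %c3 (U+00C0 to U+00FF), which covers the characters in the question; any other escape, such as %20, is also shifted by 64 and will come out wrong.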

CREATE FUNCTION [dbo].[UrlDecodeUTF8](@URL varchar(3072))
RETURNS varchar(3072)
AS
BEGIN 
    DECLARE @Position INT,
        @Base CHAR(16),
        @Code INT,
        @Pattern CHAR(21)

    -- Drop the UTF-8 lead byte; only the %c3 range (U+00C0 to U+00FF) is handled.
    SELECT @URL = REPLACE(@URL, '%c3', '')

    SELECT  @Base = '0123456789abcdef',
        @Pattern = '%[%][0-9a-f][0-9a-f]%',
        @Position = PATINDEX(@Pattern, @URL)

    -- Replace each remaining %xx with the character whose code is xx + 64.
    WHILE @Position > 0
        SELECT @Code = Cast(CONVERT(varbinary(4), '0x' + SUBSTRING(@URL, @Position + 1, 2), 1) As int),
            @URL = STUFF(@URL, @Position, 3, NCHAR(@Code + 64)),
            @Position = PATINDEX(@Pattern, @URL)

    -- Finally turn form-encoded '+' into spaces.
    RETURN REPLACE(@URL, '+', ' ')

END
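
The `+ 64` is not arbitrary. For a two-byte UTF-8 sequence the code point is `(lead & 0x1F) * 64 + (second & 0x3F)`, and with a lead byte of 0xC3 that collapses to `second + 64`. The query below is a small sketch that checks the shortcut against the full formula for three of the second bytes from the question; the alias names are illustrative, and the hex values are written in decimal because T-SQL does not allow `&` between two binary literals.

```sql
SELECT v.second_byte,
       ((195 & 31) * 64) + (v.second_byte & 63) AS full_formula,  -- 195 = 0xC3, 31 = 0x1F, 63 = 0x3F
       v.second_byte + 64                        AS shortcut,
       NCHAR(v.second_byte + 64)                 AS decoded
FROM (VALUES (129), (173), (186)) AS v(second_byte);              -- 0x81, 0xAD, 0xBA
```

Both numeric columns give 193, 237 and 250, i.e. Á, í and ú. Once the function exists, `SELECT dbo.UrlDecodeUTF8('%c3%ad')` is the call to try.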
BrunoLM