I have a column of data that contains a percentage range as a string that I'd like to convert to a number so I can do easy comparisons.

Possible values in the string:

'<5%'
'5-10%'
'10-15%'
...
'95-100%'

I'd like to convert this, in the where clause of my select, to just the first number (5, 10, 15, etc.) so that I can compare that value to a passed-in "at least this" value.

I've tried a bunch of variations on substring, charindex, convert, and replace, but I still can't seem to get something that works in all combinations.

Any ideas?

A: 

You can convert between character types (for example char(10) to varchar(10)), but a string such as '5-10%' cannot be converted to an integer directly in SQL; the non-numeric characters have to be stripped out first.

Shazburg
A: 

This is what I have so far:

select substring(test,0,charindex('-',test)) 
from (select replace(replace(interest,'<',''),'%','') test
      from table1 where interest is not null) q1 order by test

But it doesn't work for the "<5%" value since that one doesn't have the "-" character for the charindex check.
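One possible way around the missing "-" is to append a "-" to the string before calling charindex, so the search always finds one. This is only a sketch for SQL Server: it reuses table1 and interest from the query above, the lower_bound alias is made up for illustration, and it assumes every non-null value follows one of the formats listed in the question.

```sql
-- Sketch only: assumes values look like '<5%' or '5-10%'.
-- Appending '-' guarantees CHARINDEX finds a match even for '<5%',
-- so LEFT never receives a negative length.
SELECT CAST(LEFT(q1.test, CHARINDEX('-', q1.test + '-') - 1) AS int) AS lower_bound
FROM (SELECT REPLACE(REPLACE(interest, '<', ''), '%', '') AS test
      FROM table1
      WHERE interest IS NOT NULL) AS q1
ORDER BY lower_bound;
```

Because the result is cast to int, the order by sorts numerically rather than as text.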

Geoff
A: 

I don't know if this works in SQL Server, but within MySQL, you can use several tricks to convert character data into numbers. Examples from your sample data:

"<5%"     => 0
"5-10%"   => 5
"95-100%" => 95

Now obviously this fails your first test, but some clever string replacements at the start of the string would be enough to get it working.

One example of converting character data into numbers:

SELECT "5-10%" + 0 AS foo ...

Might not work in SQL Server, but future searches may help the odd MySQL user :-D

mercutio
A: 

You'd probably be much better off changing <5% and 5-10% to store 2 values in 2 fields. Instead of storing <5%, you would store 0 and 5, and instead of 5-10%, you would end up with 5 and 10. You'd end up with 2 columns, one called lowerbound and one called upperbound, and then just check value >= lowerbound AND value < upperbound.

Kibbee
+1  A: 

@Kibbee & Jason: I know the design is crap. Can't do anything about it. Legacy database created 8+ years ago. Just have to deal with the garbage.

And procedural restrictions prevent me from creating new tables.

Geoff
+5  A: 

Try this,

SELECT substring(replace(interest , '<',''), patindex('%[0-9]%',replace(interest , '<','')), patindex('%[^0-9]%',replace(interest, '<',''))-1) FROM table1

Tested at my end and it works. It's only my first try, so you might be able to optimise it.

Martin
This will not work if, say, the value in the column is ABCD2.
Thunder
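A value like ABCD2 could be handled by measuring the run of digits from the first digit rather than from the start of the string. The following is a rough sketch, not a tested fix: it assumes SQL Server 2005 or later (for CROSS APPLY), and the names d, tail, and first_number are made up for illustration.

```sql
-- Sketch only: returns the first run of digits in interest, or '' if there is none.
-- tail is the string starting at its first digit; appending '0' to the searched
-- value makes PATINDEX point past the end when no digit exists, giving an empty tail.
SELECT interest,
       -- Appending 'x' guarantees a non-digit is found, so the length is never negative.
       LEFT(d.tail, PATINDEX('%[^0-9]%', d.tail + 'x') - 1) AS first_number
FROM table1
CROSS APPLY (SELECT SUBSTRING(interest,
                              PATINDEX('%[0-9]%', interest + '0'),
                              LEN(interest))) AS d(tail)
WHERE interest IS NOT NULL;
```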
A: 

You can do this in SQL Server with a cursor. If you can create a CLR function to pull out the number groupings, that will help. It's possible in T-SQL; it will just be ugly.

Create the cursor to loop over the list. Find the first number group. If there is only one number group in there, return it. Otherwise find the second number group.

If only one number group is returned and it's the first item in the list, set it as the upper bound. If only one number group is returned and it's the last item in the list, set it as the lower bound. Otherwise set the first number group as the lower bound and the second as the upper bound.

Then just write the resulting values back to a table.

AdamSane
A: 

The issue you are having is a symptom of not keeping the data atomic. In this case it looks purely unintentional (Legacy) but here is a link about it.

To design yourself out of this create a range_lookup table:

Create table rangeLookup(
    rangeID int  -- or rangeCD or not at all
    ,rangeLabel varchar(50)
    ,LowValue int--real or whatever
    ,HighValue int 
)

To hack yourself out, here are some pseudo steps. This will be a deeply nested mess.

Normalize your input by replacing all your crazy characters.
    replace(replace(rangeLabel,'%',''),'<','')
    --This will entail many nested replace statements.

Add a CASE and CHARINDEX to look for a space. If there is none, you have your number;
    else use your substring to take everything before the first ' '.
    -- these steps are wrapped around the previous step.
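Put together against the question's table, the pseudo steps might look roughly like this. It is a sketch under assumptions: the "-" is turned into a space by one more nested replace (the steps above do not spell that out), CROSS APPLY needs SQL Server 2005 or later, and n, label, and low_value are illustrative names.

```sql
-- Sketch only: normalize the label, then take everything before the first space.
SELECT interest,
       CASE WHEN CHARINDEX(' ', n.label) = 0
            THEN n.label                                      -- no space: the label is the number
            ELSE LEFT(n.label, CHARINDEX(' ', n.label) - 1)   -- text before the first space
       END AS low_value
FROM table1
CROSS APPLY (SELECT REPLACE(REPLACE(REPLACE(interest, '%', ''), '<', ''), '-', ' ')) AS n(label)
WHERE interest IS NOT NULL;
```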
jms
+2  A: 

@Martin: Your solution works.

Here is another I came up with based on inspiration from @mercutio

select cast(replace(replace(replace(interest,'<',''),'%',''),'-','.0') as numeric) test
from table1 where interest is not null
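Used in a where clause for the "at least this" comparison from the question, the same expression might look like the sketch below. @minInterest is a hypothetical parameter name, and the cast will raise an error on any value that does not match the formats listed in the question.

```sql
-- Sketch only: @minInterest stands in for the passed-in "at least this" value.
DECLARE @minInterest int;
SET @minInterest = 10;

-- '5-10%' becomes '5.010', which numeric (scale 0) rounds down to 5;
-- '<5%' becomes '5'.
SELECT interest
FROM table1
WHERE interest IS NOT NULL
  AND CAST(REPLACE(REPLACE(REPLACE(interest, '<', ''), '%', ''), '-', '.0') AS numeric) >= @minInterest;
```

The '.0' replacement keeps the fractional part below .5, so the cast always rounds to the first number.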
Geoff
A: 

It's complicated, but for the test cases you provided, this works. Just replace @Test with the column you are looking in from your table.

DECLARE @Test varchar(10)

SET @Test = '<5%'
--SET @Test = '5-10%'
--SET @Test = '10-15%'
--SET @Test = '95-100%'

SELECT
    -- Lower bound: 0 for '<N%', otherwise the number before the '-'
    CASE WHEN SUBSTRING(@Test, 1, 1) = '<'
         THEN 0
         ELSE CONVERT(integer, SUBSTRING(@Test, 1, CHARINDEX('-', @Test) - 1))
    END AS LowerBound,
    -- Upper bound: the number between '<' (or '-') and '%'
    CASE WHEN SUBSTRING(@Test, 1, 1) = '<'
         THEN CONVERT(integer, SUBSTRING(@Test, 2, CHARINDEX('%', @Test) - 2))
         ELSE CONVERT(integer, SUBSTRING(@Test, CHARINDEX('-', @Test) + 1, CHARINDEX('%', @Test) - CHARINDEX('-', @Test) - 1))
    END AS UpperBound
Kibbee