The problem is that I need to ignore stray letters in the numbers, e.g. 19A or B417.

+1  A: 

It depends on how much data you're dealing with, but doing that in SQL is probably going to be slow. Not everyone will agree with me here, but I think all data processing should be done in application code.

I would just take the rows you want and filter them in the application you're dealing with.

Alex Fort
The problem with "all data processing should be done in application code" arises when you have a table with 1000000 records and the application communicates over the internet. Passing all the data to the client and then back is very, very expensive.
TcKs
Well, if your DB schema requires you to `select *` just to get a few pertinent rows, maybe you have bigger problems than dealing with latency. :P
Alex Fort
Hopefully you have insertion, validation, and transformation code that isn't located across the internet; and that you don't regularly need to transform that many records with it. I'm assuming good design.
le dorfier
A: 

The easiest thing to do here would be to create a CLR function which takes the address. In the CLR function, you would take the first part of the address (assuming it is the house number), which should be delimited by whitespace.

Then, replace any non-numeric characters with an empty string.

You should have a string representing an integer at that point which you can pass to the Parse method on the Int32 class to produce an integer, which you can then check to see if it is odd.

I recommend a CLR function (assuming you are using SQL Server 2005 and above, and can set the compatibility level of the database) here because it's easier to perform string manipulations in .NET than it is in T-SQL.

casperOne
Using CLR for this function is a little bit of overhead.
TcKs
Absolutely not. CLR functions are actually MUCH better for procedural operations when compared to T-SQL. T-SQL is optimized for set operations and is not a general-purpose language, which is what .NET is. It's a better tool for the job when it comes to string manipulation.
casperOne
Using the wrong tool. Procedural code should go in procedural abstraction levels.
le dorfier
Using that logic, you would have to forbid the use of pretty much any of the built-in functions as well as operators on numeric types, which makes no sense. I agree that there should be logical splits between business logic and data access, but that's different for every app.
casperOne
Yes, for query optimization it's well known that queries based on function values are non-SARGable. You're right, you would (and do) need to avoid them (not forbid them, we're talking about a choice here.)
le dorfier
And it's likely you aren't going to want to replace anything, because the user entered the address the way they intend it to be permanently stored.
le dorfier
@le dorfier: It may be well known, but unfortunately it's wrong. SARGability has nothing to do with the use of a function itself; it has to do with applying any processing expression (function or otherwise) directly to a table column. Writing `Where MyColumn = Function(@InParameter)` IS SARGable.
Charles Bretana
@Charles, I can only be persuaded by authoritative sources. Do you have any references for your assertion? :) BTW, I agree it's about "any processing expression ...". But every reference I've seen suggests avoiding them all. However, this discussion is about functions.
le dorfier
+4  A: 

Take a look here: Extracting Numbers with SQL Server

There are several hidden "gotchas" that are explained pretty well in the article.

G Mastros
I feel dumb writing this, but you should include a sample that shows how to use that function AND identify odd numbers. Then it's a complete answer to the question. Sorry for being a pedant . . . it's just that I can't help myself.
Binary Worrier
I imagine he doesn't have a final algorithm yet - just some ideas that will need to be a starting point for a development process.
le dorfier
Thanks, the function in the article got me started:

  CREATE FUNCTION dbo.GetNumbers(@DATA VARCHAR(8000))
  RETURNS VARCHAR(8000)
  AS
  BEGIN
      RETURN LEFT(
          SUBSTRING(@DATA, PATINDEX('%[0-9.-]%', @DATA), 8000),
          PATINDEX('%[^0-9.-]%', SUBSTRING(@DATA, PATINDEX('%[0-9.-]%', @DATA), 8000) + 'X') - 1)
  END

I modded it to just do positive integers and did `dbo.getnumbers(column) % 2 = 1`. One note about the original function if anyone is using it: it does not strip '-' off the end of numbers, which would be a problem for me.
dwidel
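For anyone wanting a complete sample, here is a rough sketch of the kind of modification dwidel describes: the pattern is narrowed from `[0-9.-]` to `[0-9]` so that only the first run of digits is returned, and the result is tested with `% 2 = 1`. The function name `dbo.GetPositiveInteger` and the table `dbo.Addresses` are made-up names for illustration, not dwidel's actual code, and the cast assumes the digit run fits in an `INT`.

```sql
-- Hypothetical variant of dbo.GetNumbers that keeps only the digits 0-9.
-- '19A Main St' -> '19', 'B417 Elm St' -> '417', 'Main St' -> '' (empty string)
CREATE FUNCTION dbo.GetPositiveInteger(@DATA VARCHAR(8000))
RETURNS VARCHAR(8000)
AS
BEGIN
    RETURN LEFT(
        -- Everything from the first digit onwards
        SUBSTRING(@DATA, PATINDEX('%[0-9]%', @DATA), 8000),
        -- Length of the leading digit run; the appended 'X' guarantees a non-digit is found
        PATINDEX('%[^0-9]%', SUBSTRING(@DATA, PATINDEX('%[0-9]%', @DATA), 8000) + 'X') - 1)
END
GO  -- batch separator understood by SSMS and sqlcmd

-- Rows whose first number is odd. dbo.Addresses is an assumed table name.
-- An address with no digits yields '', which casts to 0 and is filtered out.
-- Assumption: the digit run is small enough to fit in an INT.
SELECT Address
FROM dbo.Addresses
WHERE CAST(dbo.GetPositiveInteger(Address) AS INT) % 2 = 1;
```

Because the predicate wraps the column in a function call, it is evaluated for every row, so this is probably only reasonable for modest table sizes or one-off queries.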
A: 

Assuming [Address] is the column with the address in it...

  -- Tests the parity of the last digit that appears anywhere in Address
  Select Case Cast(Substring(Reverse(Address), PatIndex('%[0-9]%',
       Reverse(Address)), 1) as Integer) % 2
     When 0 Then 'Even'
     When 1 Then 'Odd' End
  From [Table]
Charles Bretana
A: 

I've been through this drill before. The best alternative is to add a column to the table or to a subsidiary joinable table that stores the inferred numerical value for the purpose. Then use iterative queries to set the column repeatedly until you get sufficient accuracy and coverage. You'll end up encountering stuff like "First, Third," "451a", "1200 South 19th Blvd East", and worse.

Then filter new and edited records as they occur.
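A minimal sketch of that approach, assuming a hypothetical table `dbo.Addresses` with an `Address` column. The `HouseNumber` column name and the two passes shown are illustrative only; real data will need more passes than this.

```sql
-- Hypothetical names throughout: dbo.Addresses, Address, HouseNumber.
ALTER TABLE dbo.Addresses ADD HouseNumber INT NULL;
GO  -- batch separator understood by SSMS and sqlcmd

-- Pass 1: addresses that start with digits, e.g. '451a' -> 451,
-- '1200 South 19th Blvd East' -> 1200.
UPDATE dbo.Addresses
SET HouseNumber = CAST(LEFT(Address, PATINDEX('%[^0-9]%', Address + 'X') - 1) AS INT)
WHERE HouseNumber IS NULL
  AND Address LIKE '[0-9]%'
  -- Skip digit runs longer than 9 characters, which might not fit in an INT
  AND PATINDEX('%[^0-9]%', Address + 'X') <= 10;

-- Pass 2: a single stray leading letter, e.g. 'B417 Elm St' -> 417.
UPDATE dbo.Addresses
SET HouseNumber = CAST(LEFT(SUBSTRING(Address, 2, 8000),
                            PATINDEX('%[^0-9]%', SUBSTRING(Address, 2, 8000) + 'X') - 1) AS INT)
WHERE HouseNumber IS NULL
  AND Address LIKE '[A-Za-z][0-9]%'
  AND PATINDEX('%[^0-9]%', SUBSTRING(Address, 2, 8000) + 'X') <= 10;

-- Rows no pass has resolved yet (e.g. 'First, Third'): review them and add passes as needed.
SELECT Address FROM dbo.Addresses WHERE HouseNumber IS NULL;

-- The odd-number filter is then a plain comparison on the stored column.
SELECT Address FROM dbo.Addresses WHERE HouseNumber % 2 = 1;
```

Each pass only touches rows where `HouseNumber` is still `NULL`, so passes can be rerun or extended without overwriting values that were already inferred.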

As usual, UDFs should be avoided as being slow and (comparatively) less debuggable.

le dorfier
Inline UDFs are not slow... they are combined with the SQL in the query and optimized into the query execution plan along with the rest of the SQL...
Charles Bretana
If they **were** for a query plan, you'd be in worse trouble. Even SQL functions should be avoided as being usually non-SARGable.
le dorfier
@le dorfier, see my comment above... and read up on inline UDFs...
Charles Bretana
I'm not sure what point you want to make. All I see are unsupported assertions. I've read about, and written, plenty of UDFs (and refactored a few out, too.) :) Do you want to suggest specific references?
le dorfier