Is it possible to make efficient queries that use the complete regular expression feature set?
If not, Microsoft really should consider that feature.
For SQL Server 2000, there is xp_pcre, which introduces Perl-compatible regular expressions as a set of extended stored procedures. I've used it, and it works.
More recent versions (SQL Server 2005 onwards) give you access to the .NET regular expression classes through CLR integration.
The answer is no, not in the general case, although it might depend on what you mean by efficient. For these purposes, I'll use the following definition: 'Makes effective use of indexes and joins in a sensible order', which is probably as good as any.
In this case, 'efficient' queries are sargable (search-argument-able), which means that they can use index lookups to narrow down the rows a predicate has to examine. Equalities (equi-joins) and simple inequalities can do this, as can predicates combined with 'AND'. After that, we get into table, index and range scanning, i.e. operations that have to do record-by-record (or index-key by index-key) comparisons.
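To make the sargable / non-sargable distinction concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than SQL Server, purely so that it runs anywhere; the table, index and column names are made up for illustration, and the plan text SQLite prints will differ from a SQL Server execution plan. The idea carries over, though: the range predicate can seek into the index, while the regexp predicate has to be evaluated against every row.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite rewrites `value REGEXP pattern` as regexp(pattern, value).
    return int(value is not None and re.search(pattern, value) is not None)


def query_plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable step.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.create_function("regexp", 2, regexp)
# Hypothetical table and index, for illustration only.
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, surname TEXT)")
conn.execute("CREATE INDEX ix_people_surname ON people (surname)")
conn.executemany(
    "INSERT INTO people (surname) VALUES (?)",
    [("Smith",), ("Smythe",), ("Jones",), ("Small",)],
)

# Sargable: a simple range the index can seek into.
range_sql = "SELECT surname FROM people WHERE surname >= 'Sm' AND surname < 'Sn'"
# Not sargable: the same rows, expressed as a regular expression.
regex_sql = "SELECT surname FROM people WHERE surname REGEXP '^Sm'"

range_plan = query_plan(conn, range_sql)
regex_plan = query_plan(conn, regex_sql)
range_rows = sorted(r[0] for r in conn.execute(range_sql))
regex_rows = sorted(r[0] for r in conn.execute(regex_sql))

print("range:", range_plan)
print("regex:", regex_plan)
```

Both queries return the same surnames, but only the first plan reports an index search; the second reports a scan, because the engine has no way to know which part of the index a pattern could match.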
Sontek's answer describes a method of in-lining regexp functionality into a query, but the operations still have to do comparisons on a record by record basis. Wrapping it up in a function would allow a function-based index where the result of a calculation is materialised in the index (Oracle supports this and you can get equivalent functionality in SQL Server by using the sort of tricks discussed in this article). However, you could not do this for an arbitrary regexp.
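As a rough illustration of materialising the result of one specific regexp so that it can be indexed, here is a sketch, again using Python's sqlite3 module as a stand-in for SQL Server. The pattern, table and column names are hypothetical, and in SQL Server the equivalent would more likely be a persisted computed column or similar rather than a value written by the application. Note that this only works because the pattern is fixed in advance, which is exactly why it does not generalise to arbitrary regexps.

```python
import re
import sqlite3

# Hypothetical fixed pattern, chosen only for illustration:
# one or two capital letters followed by a digit.
PATTERN = re.compile(r"^[A-Z]{1,2}[0-9]")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses ("
    "id INTEGER PRIMARY KEY, postcode TEXT, looks_valid INTEGER)"
)

postcodes = ["SW1A 1AA", "12345", "M1 1AE", "unknown"]
# Evaluate the regexp once per row at write time and store the result.
conn.executemany(
    "INSERT INTO addresses (postcode, looks_valid) VALUES (?, ?)",
    [(p, int(PATTERN.search(p) is not None)) for p in postcodes],
)
# Index the stored result, not the regexp itself.
conn.execute("CREATE INDEX ix_addresses_looks_valid ON addresses (looks_valid)")

sql = "SELECT postcode FROM addresses WHERE looks_valid = 1"
plan = " | ".join(r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))
matches = sorted(r[0] for r in conn.execute(sql))

print(plan)
print(matches)
```

The query is now a plain equality on an indexed column, so it is sargable; the cost of running the regexp has been moved to insert time.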
In the general case, the semantics of a regular expression do not lend themselves to pruning match sets in the sort of way that an index does, so integrating regexp support into the query optimiser is probably not possible.
I think we can see from the new types in SQL Server 2008 (hierarchyid, geo-spatial) that if Microsoft do add this, it will come in the form of a SQL CLR assembly.
If you are able to install Assemblies into your database you could roll your own by creating a new Database\SQL Server project in Visual Studio - this will allow you to make a new Trigger / UDF / Stored Proc / Aggregate or UDT. You could import System.Text.RegularExpressions into the class and go from there.
Hope this helps
I would love to have the ability to natively call regular expressions in SQL Server for ad hoc queries and for use in stored procedures. Our DBAs won't allow us to create CLR functions, so I have been using LINQPad as a kind of poor man's query editor for the ad hoc stuff. It is especially useful when working with structured data such as JSON or XML that has been saved to the database.
And I agree that it seems like an oversight that there is no regular expression support; it seems like an obvious feature for a query language. Hopefully we will see it in a future version, but people have been asking for it for a long time and it hasn't made its way into the product yet.
The most frequent reason I have seen against it is that a poorly formed expression can cause catastrophic backtracking, which in .NET will not abort and almost always requires the machine to be restarted. Maybe once they address that in the framework we will see it included in a future version of SQL Server.