ansaurus

Question

SQL set-based range

Answer 1

A:

If using SQL Server 2000 or greater, you could use the table datatype to avoid creating a normal or temporary table. Then use the normal table operations on it.

With this solution you have essentially a table structure in memory that you can use almost like a real table, but much more performant.

I found a good discussion here: Temporary tables vs the table data type

Sergio Acosta 2008-09-12 08:36:01

Answer 2

A:

This is basically one of those things that reveal SQL to be less than ideal. I'm thinking maybe the right way to do this is to build a function that creates the range. (Or a generator.)

I believe the correct answer to your question is basically, "you can't". (Sorry.)

Anders Eurenius 2008-09-12 14:56:22

I get why this is modded down. It's correct, but not particularly helpful. (Unless you, like me, think fail-fast is a good idea here.)

Anders Eurenius 2008-09-19 18:33:00

Answer 3

+4 A:

I think the very short answer to your question is to use WITH clauses to generate your own.

Unfortunately, the big names in databases don't have built-in queryable number-range pseudo-tables. Or, more generally, easy pure-SQL data generation features. Personally, I think this is a huge failing, because if they did it would be possible to move a lot of code that is currently locked up in procedural scripts (T-SQL, PL/SQL, etc.) into pure-SQL, which has a number of benefits to performance and code complexity.

So anyway, it sounds like what you need in a general sense is the ability to generate data on the fly.

Oracle and T-SQL both support a WITH clause that can be used to do this. They work a little differently in the different DBMS's, and MS calls them "common table expressions", but they are very similar in form. Using these with recursion, you can generate a sequence of numbers or text values fairly easily. Here is what it might look like...

In Oracle SQL:

WITH
  digits AS  -- Limit recursion by just using it for digits.
    (SELECT
      LEVEL - 1 AS num
    FROM
      DUAL
    WHERE
      LEVEL < 10
    CONNECT BY
      num = (PRIOR num) + 1),
  numrange AS
    (SELECT
      ones.num
        + (tens.num * 10)
        + (hundreds.num * 100)
        AS num
    FROM
      digits ones
      CROSS JOIN
        digits tens
      CROSS JOIN
        digits hundreds
    WHERE
      hundreds.num in (1, 2)) -- Use the WHERE clause to restrict each digit as needed.
SELECT
  -- Some columns and operations
FROM
  numrange
  -- Join to other data if needed

This is admittedly quite verbose. Oracle's recursion functionality is limited. The syntax is clunky, it's not performant, and it is limited to 500 (I think) nested levels. This is why I chose to use recursion only for the first 10 digits, and then cross (cartesian) joins to combine them into actual numbers.

I haven't used SQL Server's Common Table Expressions myself, but since they allow self-reference, recursion is MUCH simpler than it is in Oracle. Whether performance is comparable, and what the nesting limits are, I don't know.

At any rate, recursion and the WITH clause are very useful tools in creating queries that require on-the-fly generated data sets. Then by querying this data set, doing operations on the values, you can get all sorts of different types of generated data. Aggregations, duplications, combinations, permutations, and so on. You can even use such generated data to aid in rolling up or drilling down into other data.

UPDATE: I just want to add that, once you start working with data in this way, it opens your mind to new ways of thinking about SQL. It's not just a scripting language. It's a fairly robust data-driven declarative language. Sometimes it's a pain to use because for years it has suffered a dearth of enhancements to aid in reducing the redundancy needed for complex operations. But nonetheless it is very powerful, and a fairly intuitive way to work with data sets as both the target and the driver of your algorithms.

Chris Ammerman 2008-09-12 15:30:35

I agree that this is a huge failing. I found myself needing this type of thing many times and resorted to loops.

Hafthor 2008-09-22 17:11:23

Answer 4

+2 A:

I created a SQL CLR table valued function that works great for this purpose.

SELECT n FROM dbo.Range(1, 11, 2) -- returns odd integers 1 to 11
SELECT n FROM dbo.RangeF(3.1, 3.5, 0.1) -- returns 3.1, 3.2, 3.3 and 3.4, but not 3.5 because of float inprecision. !fault(this)

Here's the code:

using System;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Collections;

[assembly: CLSCompliant(true)]
namespace Range {
    public static partial class UserDefinedFunctions {
        [Microsoft.SqlServer.Server.SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, SystemDataAccess = SystemDataAccessKind.None, IsPrecise = true, FillRowMethodName = "FillRow", TableDefinition = "n bigint")]
        public static IEnumerable Range(SqlInt64 start, SqlInt64 end, SqlInt64 incr) {
            return new Ranger(start.Value, end.Value, incr.Value);
        }

        [Microsoft.SqlServer.Server.SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, SystemDataAccess = SystemDataAccessKind.None, IsPrecise = true, FillRowMethodName = "FillRowF", TableDefinition = "n float")]
        public static IEnumerable RangeF(SqlDouble start, SqlDouble end, SqlDouble incr) {
            return new RangerF(start.Value, end.Value, incr.Value);
        }

        public static void FillRow(object row, out SqlInt64 n) {
            n =  new SqlInt64((long)row);
        }

        public static void FillRowF(object row, out SqlDouble n) {
            n = new SqlDouble((double)row);
        }
    }

    internal class Ranger : IEnumerable {
        Int64 _start, _end, _incr;

        public Ranger(Int64 start, Int64 end, Int64 incr) {
            _start = start; _end = end; _incr = incr;
        }

        public IEnumerator GetEnumerator() {
            return new RangerEnum(_start, _end, _incr);
        }
    }

    internal class RangerF : IEnumerable {
        double _start, _end, _incr;

        public RangerF(double start, double end, double incr) {
            _start = start; _end = end; _incr = incr;
        }

        public IEnumerator GetEnumerator() {
            return new RangerFEnum(_start, _end, _incr);
        }
    }

    internal class RangerEnum : IEnumerator {
        Int64 _cur, _start, _end, _incr;
        bool hasFetched = false;

        public RangerEnum(Int64 start, Int64 end, Int64 incr) {
            _start = _cur = start; _end = end; _incr = incr;
            if ((_start < _end ^ _incr > 0) || _incr == 0)
                throw new ArgumentException("Will never reach end!");
        }

        public long Current {
            get { hasFetched = true; return _cur; }
        }

        object IEnumerator.Current {
            get { hasFetched = true; return _cur; }
        }

        public bool MoveNext() {
            if (hasFetched) _cur += _incr;
            return (_cur > _end ^ _incr > 0);
        }

        public void Reset() {
            _cur = _start; hasFetched = false;
        }
    }

    internal class RangerFEnum : IEnumerator {
        double _cur, _start, _end, _incr;
        bool hasFetched = false;

        public RangerFEnum(double start, double end, double incr) {
            _start = _cur = start; _end = end; _incr = incr;
            if ((_start < _end ^ _incr > 0) || _incr == 0)
                throw new ArgumentException("Will never reach end!");
        }

        public double Current {
            get { hasFetched = true; return _cur; }
        }

        object IEnumerator.Current {
            get { hasFetched = true; return _cur; }
        }

        public bool MoveNext() {
            if (hasFetched) _cur += _incr;
            return (_cur > _end ^ _incr > 0);
        }

        public void Reset() {
            _cur = _start; hasFetched = false;
        }
    }
}

and I deployed it like this:

create assembly Range from 'Range.dll' with permission_set=safe -- mod path to point to actual dll location on disk.
go
create function dbo.Range(@start bigint, @end bigint, @incr bigint)
  returns table(n bigint)
  as external name [Range].[Range.UserDefinedFunctions].[Range]
go
create function dbo.RangeF(@start float, @end float, @incr float)
  returns table(n float)
  as external name [Range].[Range.UserDefinedFunctions].[RangeF]
go

Hafthor 2008-09-12 18:12:04

Awesome... one thing I would do different is in your logic to test if we're ever going to complete, you should really do this: if (Math.Sign(_end - _start) != Math.Sign(_incr)) throw new ArgumentException(""Will never reach end!");

IDisposable 2009-08-15 06:50:58

Also, you could probably use a generic for all the base numeric types (byte, int, etc....)

IDisposable 2009-08-15 06:51:50

Answer 5

A:

Here's a hack you should never use:

select sum(numberGenerator.rank)
from
(
select
    rank =  ( select count(*)  
           from reallyLargeTable t1 
              where t1.uniqueValue > t2.uniqueValue ), 
    t2.uniqueValue id1, 
    t2.uniqueValue id2
from reallyLargeTable t2 
) numberGenerator
where rank between 1 and 10

You can simplify this using the Rank() or Row_Number functions in SQL 2005

ilitirit 2008-09-30 16:55:02

ansaurus

tags:

views:

answers:

SQL set-based range

related questions