views:

104

answers:

7

Hello,

I have a big list of int parameters for a SQL query:

update mytable set col='xyz'
where id in (/* thousands of ints */)

My problem is that in SQL Server 2000 there are a limit for parameters. I could run this query on an SQL Server 2008 too.

What is the better way to do this.

Edit:

The list of Ids come from a C# program. Not from another table.

+2  A: 

You can insert the integers into a temp table, and then query like this:

update mytable m set col='xyz' 
where exists (select * from #MyTempTable where id = m.id)
RedFilter
Is this a good idea? the excecution-time for 5000x Insert into #MyTempTable could be exorbitant.
Floyd
`IN` is **not** a good option if the list is more than a few. `EXISTS` is the way to go.
Brad
@Brad: correct, updated.
RedFilter
@Floyd: You can do this in one insert via `UNION ALL`.
RedFilter
@RedFilter, my down-vote is removed. <rant>This place is like telling jokes: if you're a comedian, people assume you're funny and laugh at almost anything. I could tell the same joke and no one would laugh. Here, people vote for the guy with the most rep even when there's a better solution on the page.</rant>
Brad
@RedFilter: `Insert into @t Select 869 UNION ALL Select 1074 UNION ALL ...` results in an error: `Msg 8621, Level 17, State 88, Line 5` `Internal Query Processor Error: The query processor ran out of stack space during query optimization.` Tested with 1000 Ids, maximum 24 GB Memory for the Database Server
Floyd
+1  A: 

At all costs, AVOID IN; especially if you are post-2000. my backup


Instead, use EXISTS

UPDATE myTable
SET col = 'newValue'
FROM myTable 
WHERE EXISTS (
      SELECT * 
      FROM @myTempTable temp
      WHERE myTable.ID = temp.ID)
Brad
same as RedFilters post: the excecution-time for 5000x Insert into #MyTempTable could be exorbitant.
Floyd
I should think 5000 inserts of a single `INT` wouldn't be so bad. Is there another way you can get the list of `INT`s via a query?
Brad
No i can't. Thats my problem. The List of Ids generate by an program.
Floyd
How about give it a try and benchmark it. When you create the temp table, cluster it on the `ID` and have the list pre-sorted before it gets to the database.
Brad
Also, if you use an `@` table rather than a `#` table, it will be in RAM versus on the platter. That should speed things up a bit too, but keep this in mind if the 5,000 grows to 500,000.
Brad
`Insert into @t Select 869 UNION ALL Select 1074 UNION ALL ...` results in an error: `Msg 8621, Level 17, State 88, Line 5` `Internal Query Processor Error: The query processor ran out of stack space during query optimization.` Tested with 1000 Ids, maximum 24 GB Memory for the Database Server
Floyd
I'd strongly disagree with the first sentence (but haven't downvoted yet). IN expresses the intent, exactly. the EXISTS form turns it into a correlated subquery, that may perform horribly as the size of data goes up.
Damien_The_Unbeliever
@Damien, it's actually the `IN` that performs **horribly** as the dataset increases. I agree that `IN` is verbally more intuitive, but I can tell you from experience that in 2005, they changed something in the engine regarding `IN` -- it doesn't perform the way it used to on large sets.
Brad
That blog article you linked as "your backup" is pretty horrible. It does not claim anything about performance, instead using some contrived examples to claim that IN has bad practice due to some odd but technically possible possible pitfalls.
Coxy
+2  A: 

An alternate approach that works with SQL 2000 is to use XML.

Have the program/application format the ints like so:

'<root><TMP J="111"/><TMP J="222"/><TMP J="333"/></root>'

.
Then create the following stored procedure:

CREATE PROCEDURE UpDateIntsFromXML (
    @sXML TEXT
)
AS
    DECLARE @iDoc INT
    EXEC    sp_xml_preparedocument @iDoc OUTPUT, @sXML

    UPDATE  YourTable
    SET     YourColumn = 'fixed value'
    FROM    OPENXML (@iDoc, '/root/TMP', 1) WITH (J INT) AS X
    WHERE   X.J = YourTable.IntColumn

    EXEC    sp_xml_removedocument @iDoc
RETURN

.
Then your application can call that SP, passing a potentially huge block of text/XML.

Observe that root, TMP, and J are all case-sensitive.

Brock Adams
This way could be work, we will try this. We have also find two another ways: *at first*: [Table-Valued Parameters](http://lennilobel.wordpress.com/2009/07/29/sql-server-2008-table-valued-parameters-and-c-custom-iterators-a-match-made-in-heaven/) *or* SqlBulkCopy .. we will try this too.
Floyd
+1, I like this. Floyd, let us know if this works.
Brad
This works but Table-Valued-Parameters are faster. but this feature aren't work with SQL-2000.
Floyd
You mean that Table-Valued-Parameters don't work with SQL-2000. XML works with 2000, and later.
Brock Adams
A: 

Divide the data into smaller groups, and execute multiple update queries.

There is no reason to use a temp table, since you retrieve the data from outside the db, so there is no way to avoid its transfer towards the db.

Andy
This works with `IN` or `EXISTS` but not with `NOT IN` or `NOT EXISTS`
Floyd
But the query does use 'IN', not 'NOT IN'. You execute the same query but multiple times, on smaller sets of data...
Andy
But my question whas "bit parameter list for sql query" not "bit parameter list for an "in"-query" .. the query was only an exampel.
Floyd
Well then create a temp table and do whatever u want with it.
Andy
This is not my problem .. my problem is the transport form the program to the database. With `Insert into tempTable` i get an error. please read the comments for posting too.
Floyd
A: 

If the ints are in any way sequential (more than two at once), you could make them into BETWEEN pairs.

But in this case, just make a string of these ints and pass that as a single varchar(max) parameter.

Pasi Savolainen
A: 

I think you probably want to create a memory based temporary table with an index. Assuming the table you are querying against is large you would not want to do a table scan comparing each row against each of your 5000 matches. You want to do a join with the help of two indexes.

CREATE TEMPORARY TABLE IF NOT EXISTS inputlist
(i INT PRIMARY KEY) ENGINE = MEMORY;

INSERT INTO inputlist (i) VALUES (1),(2),(3),(1000),(2000),(5000);

SELECT * FROM your_table JOIN inputlist ON your_table.intvalues = inputlist.i;

DROP TEMPORARY TABLE inputlist;

SQL based on MySQL, see:
http://dev.mysql.com/doc/refman/5.1/en/memory-storage-engine.html
http://dev.mysql.com/doc/refman/5.1/en/insert.html
http://dev.mysql.com/doc/refman/5.1/en/create-table.html

David Parks
this doenst work with sql-2000 or sql-2008. If i try too do this with singel inserts (multi-insets arnt supported in sql-2000) i get an error: `Msg 8621, Level 17, State 88, Line 5 Internal Query Processor Error: The query processor ran out of stack space during query optimization.` I would send the inserts in blocks but this takes many time for transport and statement-compiling.
Floyd
David Parks
This soulution (from the source of the quicktest) works but just a with very few ids. not with hundreds and not with thousends.
Floyd
Sad, well. I'll be interested to see what comes up as the best solution. Make sure you post it here if it's not one of the options proposed. Fascinating question.
David Parks
For what it's worth this blog says that you can do multiple value inserts with SQL 2008 (sorry, my primary experience is with MySQL): http://blogs.microsoft.co.il/blogs/bursteg/archive/2007/12/05/sql-server-2008-t-sql-insert-multiple-rows.aspxAlso, here's an option for passing in a comma delimited list to a stored procedure. You might try using this example to populate an in-memory temp table as suggested above: http://www.daniweb.com/forums/thread34491.htmlAgain, you'll have to try it and profile it to see if it's a usable solution, but it shouldn't take too much effort for a test case
David Parks
I have post the best working solution for me.
Floyd
+2  A: 

The best working soulution for me was SQL Server 2008: Table Valued Parameters

100000 Ids needs 14-20s, 1000 Ids needs ~140ms.

sql = @"
  update MyTable
    set Col1 = 1
    where ID in (select * from @ids)
  ";
sqlCmd = new SqlCommand {Connection = _sqlConn, CommandText = sql};

//Create a DataTable with one Column("id") and all ids as DataRows
DataTable listOfLeadIDs = new DataTable();
listOfIDs.Columns.Add("id", typeof(int));
Ids.ToList<string>().ForEach(x => listOfIDs.Rows.Add(new object[] { int.Parse(x) }));

//Bind this DataTable to the Command-object
// Node: "IntTable" is an User-Defined-Table-Typ (new feature with SQL-2008)
sqlCmd.Parameters.Add(
  new System.Data.SqlClient.SqlParameter("@ids", listOfIDs) { 
    TypeName = "IntTable" 
  });

//Execute the Query
sqlCmd.ExecuteNonQuery();

The User-Defined-Table-Typ:

CREATE TYPE [dbo].[IntTable] AS TABLE(
    [id] [int] NULL
)
GO
Floyd