I have a table with a large number of rows (~200 million), and I want to process these values in C# after reading them into memory. The processing requires grouping entries by column values in a way that can't be done inside SQL Server itself. The problem is that reading all the data at once throws an OutOfMemoryException, and the query takes a long time to execute even partially.
So I want to break my query into smaller pieces. One obvious method is to run an independent SELECT first and then fetch the rows with a WHERE ... IN clause. Another method that has been suggested to me is to use SQL cursors. I want to choose one of these methods (or another one, if possible), especially with regard to the following points:
- What would be the performance impact of each scheme on the server? Which would be faster?
- Can I safely parallelize the SQL cursor queries? Would I get a performance benefit if I parallelized the first scheme (the one with the WHERE ... IN clause)?
- How many values can I specify in a WHERE ... IN clause? Is the count limited only by the size of the query string?
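For concreteness, the first scheme could look roughly like the sketch below. It assumes the independent SELECT returns a list of integer grouping keys, which are then split into batches, with one parameterized WHERE ... IN query built per batch. The table `dbo.Entries`, the columns `GroupKey` and `Value`, and the helper names are placeholders for illustration, not the real schema.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchedInQuery
{
    // Splits the key list into consecutive batches of at most batchSize keys.
    public static IEnumerable<List<int>> Batches(List<int> keys, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int i = 0; i < keys.Count; i += batchSize)
        {
            int count = Math.Min(batchSize, keys.Count - i);
            yield return keys.GetRange(i, count);
        }
    }

    // Builds the query text for one batch, with one named parameter
    // (@p0, @p1, ...) per key instead of inlining the values.
    // Table and column names are hypothetical.
    public static string BuildQuery(int keyCount)
    {
        var names = Enumerable.Range(0, keyCount).Select(i => "@p" + i);
        return "SELECT GroupKey, Value FROM dbo.Entries WHERE GroupKey IN ("
               + string.Join(", ", names) + ")";
    }
}
```

Each batch would then be run as its own `SqlCommand`, with the key values bound to `@p0`, `@p1`, and so on, and the rows read through a `SqlDataReader` so that only one batch is held in memory at a time.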
Any other suggestions are also welcome.
Edit 1: I have been given different solutions, but I would still like to know the answers to my original questions (out of curiosity).