tags:

views:

139

answers:

6

You need to pull a large amount of data (thousands of entries) from a local SQL server to be used in generating a report. You have two options:

1) Use a single query to get the data from the server, load it into a .NET data table and use a series of loops to get the information you need

2) Perform an SQL query each time you need a record

I know there are several factors affecting speed but with only the information given, which would you think is faster?

+1  A: 

If there are only a few thousand of entries, scanning the table linearly will probably not be slower than the overhead to get data from the server. However, the approach scales really badly and if the number of rows increases, your solution will become very slow quickly, wherease the solution using a query is more 'predictable'. In general, let the database perform what its meant to be: process data.

inflagranti
+1  A: 

1 - Batching requests is almost always better. The network and connection overhead by opening a thousand queries will add up.

StingyJack
+2  A: 

Without any more information, I was say option 1. You can perform LOTS of CPU operations in the time it takes to make a round trip to a database and download the result set. Also, I assume it would be easier to code the C# logic than the T-SQL logic. However, if you are talking about set queries then maybe the SQL would be easier to write. In the end, I would do whatever is easier to write, then later change it if it is too slow.

BTW, how many queries are we talking here? 5, 10, 100? If it is 5 queries, then the time won't be bad, but if you are doing a query for each record, and you can have lots of records, you never want to do that.

tster
A: 

I would expect 1 to be faster, if there's little enough data that it fits in RAM, and if the 'series of loops' on whatever your data structures are is no less efficient than whatever SQL server might achieve using its indexes.

ChrisW
If you have 4 GB of memory and have a table with 100 million records it might fit in memory, but doing loops over it will take a long time. Where the server might have an index and can go straight to the data you need.
tster
@tster - That's why I mentioned indexes: for example, if the SQL server has an index on some data then to mimic that's performance in your own code/RAM/collection you might store the data in an O(1) dictionary instead of in an O(n) list or array.
ChrisW
Implementing in memory multiple indexes over the same data is difficult and prone to errors. Might as well allow the database to handle it in that case.
tster
It's easy to have multiple indexes (just have the same data in multiple dictionaries). I'd typically agree with preferring a database though (though not necessarily with using it to select one record at a time): there's less point in having a database at all if you're not going to use it.
ChrisW
A: 

I think other things factor into this decision such as:

  • Do you have caching setup on SQL?
  • Is SQL on the same server or remote machine? (Do you have to think about network overhead/speed issues)
  • Is memory an issue or do you have free reign on the server? (Does the web server have enough memory to hold lots o' data in a data table?)
  • How many people/programs are accessing SQL (More people might equal slower response times)
  • Is it a mostly read or write database (Writes tend to lock tables, may have to wait to get your data, a one time bulk read would be faster here than several smaller reads)

For example, cached data on a local SQL server will probably be just as fast as loading everything into a data table. However, data on a remote machine (cached or uncached) will be magnitudes slower given the connection overhead/network latency/having to read the records from the hard drive.

Tommy
A: 

I was dealing with some similar problem... my solution was normalizing data for each report... the bad news with that is that you need a background process updating your data, but with that I don't need to process a lot of info on memory and the database connection isn't so much recall..

a52