views: 631

answers: 12
I need to load one column of strings from a table on SQL Server into an array in memory using C#. Is there a faster way than opening a SqlDataReader and looping through it? The table is large and time is critical.

EDIT I am trying to build a .dll and use it on the server for some operations on the database. But it is too slow for now. If this is the fastest way, then I have to redesign the database. I thought there might be some solution to speed things up.

+6  A: 

I suspect that SqlDataReader is about as good as you're going to get.
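For reference, the baseline being discussed is roughly this minimal sketch (the table name, column name, and connection string are placeholders):

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;

// Minimal sketch of the SqlDataReader loop; "MyTable" and "MyColumn"
// are placeholder names.
static string[] LoadColumn(string connectionString)
{
    var results = new List<string>();
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("SELECT MyColumn FROM MyTable", connection))
    {
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
                results.Add(reader.GetString(0)); // ordinal access avoids name lookups
        }
    }
    return results.ToArray();
}
```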

LukeH
Ha! Would either of the downvoters care to elaborate on what's wrong with this answer?
LukeH
+1  A: 

I don't think so.

Arthur Rizzo
+8  A: 

No. It is actually not only the fastest way - it is the ONLY (!) way. All other mechanisms INTERNALLY use a DataReader anyway.

TomTom
+14  A: 

If SqlDataReader isn't fast enough, perhaps you should store your stuff somewhere else, such as an (in-memory) cache.

Steven
Agreed - can you load the data in advance and iterate from a collection in memory?
tomfanning
+1 - a caching layer or in memory database is the way to go.
Winston Smith
+1  A: 

What about transforming one column of rows to one row of columns, and having only one row to read? SqlDataReader has an optimization for reading a single row (System.Data.CommandBehavior.SingleRow argument of ExecuteReader), so maybe it can improve the speed a bit.

I see several advantages:

  • Single row improvement,
  • No need to access an array on each iteration (reader[0]),
  • Cloning an array (reader) to another one may be faster than looping through elements and adding each one to a new array.

On the other hand, it has the disadvantage of forcing the SQL database to do more work.
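The client side of this idea might look like the following sketch, where `pivotQuery` is a hypothetical query you have written that returns the column pivoted into a single wide row, and `connection` is an already-open SqlConnection:

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch only: "pivotQuery" is a hypothetical query that returns the
// column pivoted into one wide row; "connection" is an open SqlConnection.
using (var command = new SqlCommand(pivotQuery, connection))
using (var reader = command.ExecuteReader(CommandBehavior.SingleRow))
{
    if (reader.Read())
    {
        var values = new object[reader.FieldCount];
        reader.GetValues(values); // copy every field in a single call
    }
}
```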

MainMa
It sounds strange, but as this .dll is used on the server, it seems that I get data faster via SqlDataReader than by building one row in SQL.
watbywbarif
+18  A: 

Data Reader

About the fastest access you will get to SQL is with the SqlDataReader.

Profile it

It's worth actually profiling to find where your performance issue is. The place you assume the bottleneck to be often turns out to be wrong once you've profiled it.

For example it could be:

  1. The time... the query takes to run
  2. The time... the data takes to copy across the network/process boundary
  3. The time... .Net takes to load the data into memory
  4. The time... your code takes to do something with it

Profiling each of these in isolation will give you a better idea of where your bottleneck is. For profiling your code, there is a great article from Microsoft.

Cache it

The thing to look at to improve performance is to work out whether you need to load all that data every time. Can the list (or part of it) be cached? Take a look at the new System.Runtime.Caching namespace.
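A minimal sketch of that caching approach, assuming `LoadColumnFromDatabase` stands in for your existing reader loop and a five-minute expiry is acceptable:

```csharp
using System;
using System.Runtime.Caching;

// Cache the loaded array so repeated calls skip the database entirely.
// "LoadColumnFromDatabase" is a placeholder for the existing reader loop.
static string[] GetColumn()
{
    var cache = MemoryCache.Default;
    var cached = cache.Get("MyColumn") as string[];
    if (cached == null)
    {
        cached = LoadColumnFromDatabase();
        cache.Set("MyColumn", cached, new CacheItemPolicy
        {
            AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(5)
        });
    }
    return cached;
}
```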

Rewrite as T-SQL

If you are doing purely data operations (as your question suggests), you could rewrite the code that uses the data as T-SQL and run it natively on SQL Server. This has the potential to be much faster, as you will be working with the data directly and not shifting it about.

If your code has a lot of necessary procedural logic, you could try mixing T-SQL with CLR integration, giving you the benefits of both worlds.

This very much comes down to the complexity (or more procedural nature) of your logic.

If all else fails

If all areas are optimal (or as near as possible) and your design is without fault, I wouldn't even get into micro-optimisation; I'd just throw hardware at it.

What hardware? Use the Reliability and Performance Monitor to find out where the bottleneck is. For the problem you describe, the most likely candidates are HDD or RAM.

badbod99
I have tested some things; SqlDataReader is obviously faster than DataSet ;) Yes, loading time is hurting performance the most.
watbywbarif
And I am not sending to a client; the .dll is used on the same machine as the server for some internal usage.
watbywbarif
Edited to match your updated question.
badbod99
+1 for "rewrite as T-SQL". The ideal query is one that only retrieves absolutely necessary data. If you're retrieving 100k rows to the client app, then processing there, then perhaps you should re-consider your logic.
BradC
A well deserved +1.
Steven
A: 

"Provides a way of reading a forward-only stream of rows from a SQL Server database" - that is the description of SqlDataReader from MSDN. The data structure behind SqlDataReader only allows reading forward; it's optimized for reading data in one direction. In my opinion, I would rather use SqlDataReader than DataSet for simple data reading.

coolkid
+3  A: 

SqlDataReader is the fastest way. Make sure you use the get by ordinal methods rather than get by column name. e.g. GetString(1);

Also worthwhile is experimenting with MinPoolSize in the connection string so that there are always some connections in the pool.
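A sketch of both suggestions together; the server, database, and pool-size values are assumptions, and `command`/`results` stand in for your existing command and output list:

```csharp
// "Min Pool Size" keeps warm connections available in the pool.
// Server and database names are placeholders.
var connectionString =
    "Data Source=myServer;Initial Catalog=myDb;" +
    "Integrated Security=True;Min Pool Size=5";

// Resolve the column name to an ordinal once, then read by ordinal.
using (var reader = command.ExecuteReader())
{
    int ordinal = reader.GetOrdinal("MyColumn");
    while (reader.Read())
        results.Add(reader.GetString(ordinal));
}
```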

Pratik
Can you explain more about MinPoolSize, I don't see how this should help?
watbywbarif
In .NET, DB connections are returned to a connection pool after being closed, and the underlying SQL Server connection is eventually closed after a period of inactivity. This generates the login and logout events. In certain scenarios (infrequent web service calls) it may be beneficial to always have some ready connections in the pool to handle the first request quickly, rather than having to open a new connection with the SQL server.
Pratik
+2  A: 

The SqlDataReader will be the fastest way. Optimize your use of it by using the appropriate GetXxx method, which takes an ordinal as a parameter.

If it is not fast enough, see if you can tweak your query. Put a covering index on the column(s) that you want to retrieve. By doing so, SQL Server only has to read the index and does not have to go to the table to retrieve all the required info.

Frederik Gheysels
The query is only a one-column select; there is no room for optimization there, only redesigning the database ;(
watbywbarif
@watbywbarif an index will still help even on a single column select
msarchet
I checked; it is already indexed.
watbywbarif
Have you created an index that ONLY contains the single column you are selecting on?
Ian Ringrose
Yes, I have had this from the start.
watbywbarif
A: 

If responsiveness is an issue when loading a great deal of data, look at using the asynchronous methods - BeginExecuteReader.

I use this all the time for populating large GUI elements in the background while the app continues to be responsive.

You haven't said exactly how large this data is, or why you are loading it all into an array.

Often, for large amounts of data, you may want to leave it in the database or let the database do the heavy lifting. But we'd need to know what kind of processing you are doing that needs it all in an array at one time.
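A rough sketch of the asynchronous pattern (on older .NET versions the connection string also needs `Asynchronous Processing=true`; the query and `connection` are placeholders):

```csharp
using System.Data.SqlClient;

// Sketch: start the read without blocking, then consume the rows in
// the callback. "connection" is an open SqlConnection.
var command = new SqlCommand("SELECT MyColumn FROM MyTable", connection);
command.BeginExecuteReader(ar =>
{
    using (var reader = command.EndExecuteReader(ar))
    {
        while (reader.Read())
        {
            // Process each row while the caller stays responsive.
        }
    }
}, null);
```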

Cade Roux
Responsiveness is not a problem.
watbywbarif
+1  A: 

You have 4 sets of overheads:

  • Disk access
  • .NET code (CPU)
  • SQL Server code (CPU)
  • Time to switch between managed and unmanaged code (CPU)

Firstly, is

select * from table where column = 'junk' 

fast enough for you? If not, the only solution is to make the disk faster. (You cannot get data out of SQL Server faster than it can read it from disk.)

You may be able to define a SQL Server function in C# and then run the function over the column; sorry, I don't know the details of how to do it. This may be faster than a data reader.

If you have more than one CPU, and you know a value in the middle of the table, you could try using more than one thread.

You may be able to write some T-SQL that combines all the strings into a single string, using a separator you know is safe. Then split the string up again in C#. This will reduce the number of round trips between managed and unmanaged code.
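The client side of that last idea might look like this sketch, where `concatQuery` is assumed to be server-side T-SQL (e.g. built with FOR XML PATH('')) that joins the column with a separator known not to occur in the data:

```csharp
using System.Data.SqlClient;

// One round trip: the server returns the whole column as a single
// delimited string, and the client splits it once. The '\u0001'
// separator is an assumption.
using (var command = new SqlCommand(concatQuery, connection))
{
    var combined = (string)command.ExecuteScalar();
    string[] values = combined.Split('\u0001');
}
```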

Ian Ringrose
I don't see how more threads would speed up loading data, as SqlDataReader reads sequentially?
watbywbarif
Each thread could use its own SqlDataReader, provided you have something you can put in the WHERE clause to partition the data between the threads.
Ian Ringrose
Nice idea, maybe it can help. +1
watbywbarif
+1  A: 

Some surface-level things to consider that may affect speed (besides a data-reader):

  1. Database Query Optimization
    • OrderBy is expensive
    • Distinct is expensive
    • RowCount is expensive
    • GroupBy is expensive
    • etc. Sometimes you can't live without these operations, but if you can handle some of them in your C# code instead, it may be faster.
  2. Database Table indexing (for starters, are the fields in your WHERE clause indexed?)
  3. Database Table DataTypes (are you using the smallest possible, given the data?)
  4. Why are you converting the datareader to an array?
    • e.g., would it serve just as well to create an adapter/datatable that you then would not need to convert to an array?
  5. Have you looked into Entity Framework? (might be slower...but if you're out of options, might be worthwhile to look into just to make sure)

Just random thoughts. Not sure what might help in your situation.

おたく