Now we have a Firebird database with 1.000.000 rows that must be processed after ALL of them are loaded into RAM. To get all of them we must extract data using (select * first 1000 ...) queries, which takes 8 hours. What is the solution for this?

A: 

What sort of processing do you need to do that would require loading them into memory rather than just processing them via SQL statements?

There are two techniques I use that work depending on what I am trying to do.

  1. Assuming there is some sort of artificial key (identity), work in batches, incrementing the last identity value processed (see the sketch after this list).

  2. BCP the data out to a text file, churn through the updates, then BCP it back in, remembering to turn off constraints and indexes before the IN step.
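
A minimal sketch of technique 1, assuming the FirebirdSql.Data.FirebirdClient ADO.NET provider and an indexed integer id column; the table, column, and connection details here are all illustrative:

using System;
using System.Collections.Generic;
using FirebirdSql.Data.FirebirdClient;

class Dto
{
    public int Id;
    public int Payload;
    public Dto(int id, int payload) { Id = id; Payload = payload; }
}

class BatchLoader
{
    static void Main()
    {
        // Illustrative connection string; adjust for your server.
        const string cs = "DataSource=localhost;Database=big.fdb;User=SYSDBA;Password=masterkey";
        var dtos = new List<Dto>(1000000);
        int lastId = 0;

        using (var conn = new FbConnection(cs))
        {
            conn.Open();
            while (true)
            {
                // Seek past the last processed key so each batch is an
                // index range scan rather than a scan restarting at row one.
                using (var cmd = new FbCommand(
                    "select first 1000 id, payload from big_table " +
                    "where id > @last order by id", conn))
                {
                    cmd.Parameters.AddWithValue("@last", lastId);
                    int rows = 0;
                    using (var read = cmd.ExecuteReader())
                    {
                        while (read.Read())
                        {
                            lastId = read.GetInt32(0);
                            dtos.Add(new Dto(lastId, read.GetInt32(1)));
                            rows++;
                        }
                    }
                    if (rows == 0) break; // no more batches
                }
            }
        }

        Console.WriteLine("Loaded {0} rows", dtos.Count);
    }
}

Each batch seeks directly to where the previous one left off, so the cost per batch stays flat instead of growing the way offset-style paging would.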

Bill
We do some analysis of every row from the db, and we need to have them all loaded in memory. The bottleneck is the moment when we transfer data from the record set to our DTO objects.
Leonard P.
A: 

Take a look at this: http://www.firebirdfaq.org/faq13/

sza
There is nothing to help me at that link; maybe this problem is encountered on all databases. I need to optimize the time for "while (read.Read())".
Leonard P.
+4  A: 

Does each of your "select * first 1000" queries (as you described them) do a full table scan? Look at those queries, and make sure they are using an index.
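
One way to check is with Firebird's isql tool; a sketch, assuming an indexed id column (table and index names are illustrative):

SET PLAN ON;
select first 1000 * from big_table where id > 42000 order by id;

A plan such as PLAN (BIG_TABLE ORDER RDB$PRIMARY1) means an index is being used; PLAN (BIG_TABLE NATURAL) means a full table scan.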

aaaa bbbb
The select speed is not the problem. The transfer of data from the select recordset to memory, the while (read.Read()) loop, takes too long ...
Leonard P.
+1  A: 

to load data from a table with 1.000.000 rows in C# using a Firebird db takes at least 8 hours on a Pentium 4 3GHz

Everybody's been assuming you were running a SQL query to select the records from the database. Something like:

select * 
from your_big_table
/

Because that really would take a few seconds. Well, a little longer to display it on a screen, but executing the actual select should be lightning fast.

But that reference to C# makes me think you're doing something else. Perhaps what you really have is an RBAR (row-by-agonizing-row) loop instantiating one million objects. I can see how that might take a little longer. But even so, eight hours? Where does the time go?

edit

My guess was right: you are instantiating 1000000 objects in a loop. The correct advice would be to find some other way of doing whatever it is you do once you have got all your objects in memory. Without knowing more about the details it is hard to give specifics. But it seems unlikely this is a UI thing - what user is going to peruse a million objects?

So a general observation will have to suffice: use bulk operations to implement bulk activity. SQL databases excel at handling sets. Leverage the power of SQL to process your million rows in a single set, rather than as individual rows.
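
A single set-based statement can replace the million-iteration loop entirely; for example (the score and category columns here are illustrative):

update your_big_table
set score = score * 1.05
where category = 'A'
/

The database applies the change to every matching row in one pass, with no per-row round trip through application code.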

If you don't find this answer helpful then you need to give us more details regarding what you're trying to achieve.

APC
Our process runs once a month at most, in the worst case. And yes, we load all data from our table in one big loop. That loop is our bottleneck. For a moment I was thinking of making some clones of the DB and then running some parallel queries to load data faster.
Leonard P.
Well, you never load records in a loop when you have that many. That's database 101. I don't know Firebird, but most databases support using a select in the insert, like: insert into table2 (field1, field2) select field1, field2 from table1. That would be much faster than one row at a time.
HLGEM
+1  A: 

How long does it take to construct the DTO object that you are creating with each data read?

{
    int a = read.GetInt32(0);
    int b = read.GetInt32(1);
    mylist.Add(new DTO(a, b));
}

You are creating a million of these objects. If it takes just 29 milliseconds to create one DTO object, the full million will take over 8 hours (29 ms × 1,000,000 = 29,000 seconds).
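
A quick way to measure, as a sketch (assuming .NET's System.Diagnostics.Stopwatch, with read, mylist, and DTO as above):

var sw = System.Diagnostics.Stopwatch.StartNew();
while (read.Read())
{
    int a = read.GetInt32(0);
    int b = read.GetInt32(1);
    mylist.Add(new DTO(a, b));
}
sw.Stop();
Console.WriteLine("{0} rows in {1} ms", mylist.Count, sw.ElapsedMilliseconds);

If the per-row time comes anywhere near 29 ms, look at what the DTO constructor does, not at the database.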

300 baud