views: 220

answers: 3

Hello all

I'm stuck on a big problem where I need to handle huge amounts of data.

I get a DataSet which has 5 DataTables. One of the tables has around 3,000 columns and 50,000 records.

I need to save this DataSet completely in the SQL database as XML, and I need to retrieve it later.

I CAN'T make any design changes to avoid this. When I call dataset.GetXml() it returns a string of huge length, which throws an OutOfMemoryException. I know that the string datatype has a limit on how much data it can carry.

But in a SQL table the xml column can hold up to 2 GB. So how can I move this DataSet back and forth between the database and my application?

Thank you

NLV

+3  A: 

Your design sounds flawed. Why do you need to retrieve the entire table into memory? Page through it instead - operate on the table in smaller batches - say, 1000 records at a time.

You probably don't even need to do that. Generally, database applications operate over just the data that's changing. If your app needs to modify 10 rows, then retrieve just those ten rows. You'll save not only memory, but all the time it takes to retrieve and post that data.
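The batching idea above can be sketched as key-seek paging. This is a minimal, hypothetical example (the table name `BigTable`, the indexed key column `Id`, and the connection string are all assumptions, not from the original post):

```csharp
// Hypothetical sketch: process a large table in 1000-row batches instead of
// loading everything into one DataSet. Assumes an indexed numeric key "Id".
using System;
using System.Data;
using System.Data.SqlClient;

class BatchPager
{
    static void Main()
    {
        const int batchSize = 1000;
        long lastId = 0;
        using (var conn = new SqlConnection("connection string here"))
        {
            conn.Open();
            while (true)
            {
                var cmd = new SqlCommand(
                    "SELECT TOP (@n) * FROM BigTable WHERE Id > @lastId ORDER BY Id",
                    conn);
                cmd.Parameters.AddWithValue("@n", batchSize);
                cmd.Parameters.AddWithValue("@lastId", lastId);

                var batch = new DataTable();
                new SqlDataAdapter(cmd).Fill(batch);
                if (batch.Rows.Count == 0) break;

                // ... process this batch, then let it go out of scope
                // so only ~1000 rows are ever in memory at once ...
                lastId = (long)batch.Rows[batch.Rows.Count - 1]["Id"];
            }
        }
    }
}
```

Seeking on the key (`WHERE Id > @lastId`) rather than using an offset keeps each page cheap even deep into the table.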

3000 columns in the table? That also smells (really) bad. Look into normalizing your database.

I know it feels easier to fix the immediate problem and move on. You'll have far fewer problems in the long run, however, if you fix the (frankly quite severe) design problems up front.

Michael Petrotta
3000 columns! I got this huge twinge in my neck and back when I read that! Good grief man!
Buggabill
I understand all your comments. I have a main datagrid in my application which gets a DataSet bound to it. So the SQL table itself doesn't have 3000 columns; most of them are dynamic columns. Also, every row has a lot of columns that are user-editable. My goal is to persist the user's changes so they're there when he reopens the application. It's an unwieldy application with 3000 columns in the grid. I don't know what the client is going to do with all of them in one place, but that's the requirement!
NLV
That's even worse. How is a user supposed to handle 3000 columns in a single grid? If you have any sway at all, push to change the requirement. Use this problem as a lever, because if you need to be fully disconnected from the database while presenting all this data, you're likely to have memory problems no matter what you do.
Michael Petrotta
I've managed to split the whole DataSet into multiple sub-tables and I've stored it in 30 xml columns in the database. The thing is that I'm not able to convert that XML back into a DataTable. I used XmlDataDocument to get the XML and store it in the database, but I'm not able to re-read that XML as a DataTable. I'm getting "Root element missing", "Invalid characters", etc. while trying to convert it. Any ideas?
NLV
Post your XML and the code used to read it. This probably warrants a separate question. Don't post in a comment - code and XML formatting doesn't work well here.
Michael Petrotta
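For reference, a minimal round-trip sketch of the WriteXml/ReadXml path being discussed. "Root element missing" typically means the stored value came back empty or truncated, and "Invalid characters" often points to an encoding mismatch (SQL Server's xml type stores UTF-16 and strips the XML declaration); the names below are illustrative, not from the original code:

```csharp
// Hypothetical sketch: persist a DataTable as XML and read it back.
// Writing the schema inline makes the XML self-describing, so
// ReadXml can reconstruct the columns without a separate schema file.
using System;
using System.Data;
using System.IO;

class XmlRoundTrip
{
    static void Main()
    {
        var table = new DataTable("Sample");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(1, "first");

        // Serialize with the schema included.
        var sw = new StringWriter();
        table.WriteXml(sw, XmlWriteMode.WriteSchema);
        string xml = sw.ToString();   // the string you'd store in the xml column

        // Deserialize from the stored string.
        var restored = new DataTable();
        restored.ReadXml(new StringReader(xml));
        Console.WriteLine(restored.Rows.Count);
    }
}
```

If the value read back from the database fails at the `ReadXml` step, inspect the raw string first: an empty or half-written value produces exactly the "Root element missing" error.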
+1  A: 

You will need to switch to streaming APIs throughout: use SqlDataReader (one row at a time), not DataSet. Also, when possible, stream your data to temporary files on the file system instead of holding it in memory.

If even one row is too much to hold, then you likely need to use the streaming APIs to read the blobs in the relevant large columns.
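The blob-streaming approach can be sketched with `CommandBehavior.SequentialAccess` and chunked `GetBytes` reads, so the full 2 GB value never has to fit in memory at once. The table name `Snapshots`, column `XmlData`, and connection string below are assumptions for illustration:

```csharp
// Hypothetical sketch: stream a large xml column to a temp file in 8 KB
// chunks. SequentialAccess tells ADO.NET not to buffer the whole row.
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

class BlobStreamer
{
    static void Main()
    {
        using (var conn = new SqlConnection("connection string here"))
        {
            conn.Open();
            var cmd = new SqlCommand(
                "SELECT CAST(XmlData AS varbinary(max)) FROM Snapshots WHERE Id = @id",
                conn);
            cmd.Parameters.AddWithValue("@id", 1);

            using (var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
            using (var file = File.Create(Path.GetTempFileName()))
            {
                if (reader.Read())
                {
                    var buffer = new byte[8192];
                    long offset = 0;
                    long read;
                    // GetBytes copies the next chunk; returns 0 at end of data.
                    while ((read = reader.GetBytes(0, offset, buffer, 0, buffer.Length)) > 0)
                    {
                        file.Write(buffer, 0, (int)read);
                        offset += read;
                    }
                }
            }
        }
    }
}
```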

MatthewMartin
+1  A: 

3k columns x 50k rows = 150 million elements before you start. How many characters per element?

I have to ask... why are you using a database to store this? You may as well store it on a fileserver.

50k rows is tiny, but 3k columns is extreme, and it's all hugely bloated in XML

gbn
I can't store it in a local file as it is a multi-user application.
NLV
I think you meant to say "3k columns is extreme" - hope you don't mind my fixing that. :)
Aaronaught
@NLV: because of the time you'll need to process this data in a DB engine, it's single-user anyway...
gbn