I am developing with C# 3 and .NET 3.5 and have an object where the contents of a CSV file (of an unknown format) are held in a byte[] property. I need to parse this byte[] and create a System.Data.DataTable containing the columns and rows.

The problem is creating a System.Data.DataTable whose column data types match the data in the CSV.

I am currently using CsvReader to parse the byte[] CSV as this works well with streams and is very fast. Unfortunately this simply treats all data as strings.

I have tried using Jet/OLEDB to read from a CSV file, and this successfully creates a DataTable with columns of different data types. However, as this requires a connection string, I assume it can't be used to parse a byte[] in memory.

Is there a way this can be done, and if not what would be a reasonable way to deduce the data types and convert the untyped DataTable into a DataTable with specific column data types?

A: 

You could always look at the FileHelpers library: CsvToDataTable or ReadStreamAsDT, for which you'd first need to get your byte array into a TextReader via a chain of streams (a StreamReader over a MemoryStream, perhaps).
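As a minimal sketch of that chain of streams: the helper below wraps the byte[] in a MemoryStream and then a StreamReader, which is a TextReader. The class and method names are made up for illustration, and UTF-8 is an assumption; use whatever encoding the CSV was actually written in.

```csharp
using System.IO;
using System.Text;

public static class CsvBytes
{
    // Hypothetical helper: exposes an in-memory CSV as a TextReader without touching disk.
    // UTF-8 is assumed here; a byte order mark, if present, is detected by StreamReader.
    public static TextReader ToTextReader(byte[] csvBytes)
    {
        return new StreamReader(new MemoryStream(csvBytes), Encoding.UTF8);
    }
}
```

The resulting reader can be handed to anything that accepts a TextReader, and disposing it also disposes the underlying MemoryStream.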

Or, the DIY method:

Have you tried using a DataReader?

see: DataReader Constructor (Stream, Encoding, Boolean)

Presumably you could wrap a MemoryStream around your byte array.

Then I'd guess you can probably feed it to a DataAdapter which could spit out a DataSet from which you'll get a single DataTable...

Edit

I realise now that you probably already considered using the various components of ADO.NET since you mentioned Jet/OLEDB.

I also realise that my URL for DataReader is from BizTalk.

I was thinking of the IDataReader interface but, of course, you need a particular concrete implementation, and there is probably none that fits your requirements other than perhaps this one within BizTalk (which you probably don't have access to).

Maybe implement your own? (not a serious suggestion BTW.)

rohancragg
Last time I looked at FileHelpers, the format of the CSV file had to be known beforehand in order to create the objects it maps to. In this case the CSV file could have any format. If I use a DataReader, when will the column data types of the CSV be inferred? Thanks for the suggestions.
TonE
I'm afraid I've not tried so I don't know when they'll be inferred, sorry.
rohancragg
A:

I guess if Jet/OLEDB can do it for a file, you could create a memory-mapped file and feed it to Jet. However, I would much prefer to do it myself in three simple steps:

1) Read the first N lines with CsvReader and deduce the data types (if all N values in a column look like integers, the column is an integer column, and so on).

2) Create a DataTable with the appropriate structure.

3) Restart reading and fill it.
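The three steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the rows have already been read into string arrays (for example from CsvReader), every row has as many fields as there are headers, values use invariant-culture formatting, and only int, double, DateTime and string are considered. TypedTableBuilder, InferType and Build are made-up names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

public static class TypedTableBuilder
{
    // Step 1: look at up to sampleSize rows and pick the most specific type that fits them all.
    static Type InferType(IList<string[]> rows, int column, int sampleSize)
    {
        bool allInt = true, allDouble = true, allDate = true, anyValue = false;
        int limit = Math.Min(sampleSize, rows.Count);
        for (int i = 0; i < limit; i++)
        {
            string s = rows[i][column];
            if (string.IsNullOrEmpty(s)) continue; // blanks say nothing about the type
            anyValue = true;
            int n; double d; DateTime t;
            allInt &= int.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out n);
            allDouble &= double.TryParse(s, NumberStyles.Float, CultureInfo.InvariantCulture, out d);
            allDate &= DateTime.TryParse(s, CultureInfo.InvariantCulture, DateTimeStyles.None, out t);
        }
        if (!anyValue) return typeof(string);
        if (allInt) return typeof(int);
        if (allDouble) return typeof(double);
        if (allDate) return typeof(DateTime);
        return typeof(string);
    }

    public static DataTable Build(string[] headers, IList<string[]> rows, int sampleSize)
    {
        // Step 2: create the table with the inferred column types.
        DataTable table = new DataTable();
        table.Locale = CultureInfo.InvariantCulture;
        for (int c = 0; c < headers.Length; c++)
            table.Columns.Add(headers[c], InferType(rows, c, sampleSize));

        // Step 3: go over the data again and fill the table; blank fields become DBNull.
        foreach (string[] row in rows)
        {
            object[] values = new object[headers.Length];
            for (int c = 0; c < headers.Length; c++)
            {
                values[c] = string.IsNullOrEmpty(row[c])
                    ? (object)DBNull.Value
                    : Convert.ChangeType(row[c], table.Columns[c].DataType, CultureInfo.InvariantCulture);
            }
            table.Rows.Add(values);
        }
        return table;
    }
}
```

Note that if a value beyond the sampled rows does not fit the inferred type, Convert.ChangeType throws, so a real implementation would either sample every row or fall back to string for that column.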

yu_sha
Thanks, I thought I might have to take the DIY route. I was thinking of using TryParse on various data types to deduce the type, although this feels a bit like reinventing the wheel. Does that sound reasonable?
TonE
TryParse sounds good. For dates you might want to do TryParseExact too.
yu_sha
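For the date suggestion above, a small sketch of DateTime.TryParseExact with a fixed list of accepted formats. The format list and the DateSniffer name are hypothetical; the point is that exact formats avoid guessing between day-first and month-first dates.

```csharp
using System;
using System.Globalization;

public static class DateSniffer
{
    // Hypothetical format list; replace with the formats your CSV sources actually use.
    static readonly string[] Formats = { "yyyy-MM-dd", "dd/MM/yyyy", "yyyy-MM-dd HH:mm:ss" };

    // Returns true only if the text matches one of the formats above exactly.
    public static bool TryParseDate(string text, out DateTime value)
    {
        return DateTime.TryParseExact(text, Formats, CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, out value);
    }
}
```

With this list, "31/12/2009" is accepted as 31 December, while "12/31/2009" is rejected rather than silently reinterpreted.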