How can I load a csv file into a System.Data.DataTable, creating the datatable based on the CSV file?
Is there a class library for this or can I use ADO.net to connect to the file?
How can I load a csv file into a System.Data.DataTable, creating the datatable based on the CSV file?
Is there a class library for this or can I use ADO.net to connect to the file?
You need to parse the file line by line and pull out every column value.
CSV is tricky because you cannot just split the line on comma's. Some columns have comma's in the data like: "Overby, Ronnie".
Here's an excellent class that will copy CSV data into a datatable using the structure of the data to create the DataTable:
A portable and efficient generic parser for flat files
It's easy to configure and easy to use. I urge you to take a look.
Here's a solution that uses ADO.Net's ODBC text driver:
Dim csvFileFolder As String = "C:\YourFileFolder"
Dim csvFileName As String = "YourFile.csv"
'Note that the folder is specified in the connection string,
'not the file. That's specified in the SELECT query, later.
Dim connString As String = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" _
& csvFileFolder & ";Extended Properties=""Text;HDR=No;FMT=Delimited"""
Dim conn As New Odbc.OdbcConnection(connString)
'Open a data adapter, specifying the file name to load
Dim da As New Odbc.OdbcDataAdapter("SELECT * FROM [" & csvFileName & "]", conn)
'Then fill a data table, which can be bound to a grid
Dim dt As New DataTableda.Fill(dt)
grdCSVData.DataSource = dt
Once filled, you can value properties of the datatable, like ColumnName, to make utilize all the powers of the ADO.Net data objects.
In VS2008 you can use Linq to achieve the same effect.
NOTE: This may be a duplicate of this SO question.
I have been using OleDb provider. However it has problems if you are reading in rows that have look like numeric values but you want them treated as text. However you can get around that issue by creating a schema.ini file. Here is my methods I used.
public static DataTable GetCSVRows(string path, bool IsFirstRowHeader)
{
string header = "No";
string sql = string.Empty;
DataTable dataTable = null;
string pathOnly = string.Empty;
string fileName = string.Empty;
try
{
pathOnly = Path.GetDirectoryName(path);
fileName = Path.GetFileName(path);
sql = @"SELECT * FROM [" + fileName + "]";
if(IsFirstRowHeader)
{
header = "Yes";
}
using(OleDbConnection connection = new OleDbConnection(
@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + pathOnly +
";Extended Properties=\"Text;HDR=" + header + "\""))
{
using(OleDbCommand command = new OleDbCommand(sql, connection))
{
using(OleDbDataAdapter adapter = new OleDbDataAdapter(command))
{
dataTable = new DataTable();
dataTable.Locale = CultureInfo.CurrentCulture;
adapter.Fill(dataTable);
}
}
}
}
finally
{
}
return dataTable;
}
I have decided to use Sebastien Lorion's Csv Reader.
Jay Riggs suggestion is a great solution also, but I just didn't need all of the features that that Andrew Rissing's Generic Parser provides.
After using Sebastien Lorion's Csv Reader in my project for nearly a year and a half, I have found that it throws exceptions when parsing some csv files that I believe to be well formed.
So, I did switch to Andrew Rissing's Generic Parser and it seems to be doing much better.
We always used to use the Jet.OLEDB driver, until we started going to 64 bit applications. Microsoft has not and will not release a 64 bit Jet driver. Here's a simple solution we came up with that uses File.ReadAllLines and String.Split to read and parse the CSV file and manually load a DataTable. As noted above, it DOES NOT handle the situation where one of the column values contains a comma. We use this mostly for reading custom configuration files - the nice part about using CSV files is that we can edit them in Excel.
string CSVFilePathName = @"C:\test.csv";
string[] Lines = File.ReadAllLines(CSVFilePathName);
string[] Fields;
Fields = Lines[0].Split(new char[] { ',' });
int Cols = Fields.GetLength(0);
DataTable dt = new DataTable();
//1st row must be column names; force lower case to ensure matching later on.
for (int i = 0; i < Cols; i++)
dt.Columns.Add(Fields[i].ToLower(), typeof(string));
DataRow Row;
for (int i = 1; i < Lines.GetLength(0); i++)
{
Fields = Lines[i].Split(new char[] { ',' });
Row = dt.NewRow();
for (int f = 0; f < Cols; f++)
Row[f] = Fields[f];
dt.Rows.Add(Row);
}