How do I parse a text file in C#?
If your language is more than trivial, use a parser generator. It drove me nuts, but I've heard good things about ANTLR. (Note: get the manual and read it before you start. If you have used a different parser generator before, you probably won't approach ANTLR correctly right off the bat; at least I didn't.)
Other tools also exist.
Without really knowing what sort of text file you're talking about, it's hard to answer. However, the FileHelpers library has a broad set of tools for fixed-length, multirecord, delimited, and similar file formats.
What do you mean by parse? Parsing usually means splitting the input into tokens and working out their structure, which you might do if you're implementing a programming language. If you just want to read the contents of a text file, look at System.IO.FileInfo.
The algorithm might look like this:
- Open Text File
- For every line in the file:
  - Parse Line
There are several approaches to parsing a line.
The easiest from a beginner's standpoint is to use the String methods.
If you are up for more of a challenge, you can use the classes in the System.Text.RegularExpressions namespace to parse your text.
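As a rough sketch of both approaches, assume each line has the form `key = value`. That format, and the names `LineParsing`, `TryParseWithString`, and `TryParseWithRegex`, are made up for illustration and are not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineParsing
{
    // Approach 1: plain String methods. Splits on the first '=' only.
    public static bool TryParseWithString(string line, out string key, out string value)
    {
        key = string.Empty;
        value = string.Empty;
        int idx = line.IndexOf('=');
        if (idx <= 0) return false;              // no '=' at all, or nothing before it
        key = line.Substring(0, idx).Trim();
        value = line.Substring(idx + 1).Trim();
        return key.Length > 0;                   // reject keys that were only whitespace
    }

    // Approach 2: a regular expression with named groups.
    private static readonly Regex KeyValue =
        new Regex(@"^\s*(?<key>[^=\s][^=]*?)\s*=\s*(?<value>.*?)\s*$");

    public static bool TryParseWithRegex(string line, out string key, out string value)
    {
        key = string.Empty;
        value = string.Empty;
        Match m = KeyValue.Match(line);
        if (!m.Success) return false;
        key = m.Groups["key"].Value;
        value = m.Groups["value"].Value;
        return true;
    }

    public static void Main()
    {
        string key, value;
        if (TryParseWithString("name = Alice ", out key, out value))
            Console.WriteLine(key + " -> " + value);
        if (TryParseWithRegex("name = Alice ", out key, out value))
            Console.WriteLine(key + " -> " + value);
    }
}
```

The String version is easier to step through in a debugger. The regex version keeps the whole line format in one pattern, which tends to pay off only once the format gets more complicated.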
Check out this interesting approach, LINQ to Text Files. It's very nice: you only need an IEnumerable<string> method that yields every file.ReadLine(), and then you run the query over it.
Here is another article that better explains the same technique.
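A minimal sketch of that technique might look like the following. The names `TextFileLinq` and `ReadLines` are hypothetical, and a `StringReader` stands in for a real file so the example runs anywhere; for a file on disk you would pass `new StreamReader(path)` instead.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class TextFileLinq
{
    // Lazily yields one line at a time; the reader is disposed
    // once the enumeration finishes or is abandoned.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        using (reader)
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                yield return line;
        }
    }

    public static void Main()
    {
        TextReader reader = new StringReader("# comment\nalpha\n\nbeta\n");

        // Skip blank lines and comment lines, upper-case the rest.
        var query = from line in ReadLines(reader)
                    where line.Length > 0 && !line.StartsWith("#")
                    select line.ToUpperInvariant();

        foreach (string result in query)
            Console.WriteLine(result);   // prints ALPHA, then BETA
    }
}
```

On .NET 4 and later, the built-in `File.ReadLines(path)` returns a lazy `IEnumerable<string>` as well, so the hand-written iterator is mostly useful on older frameworks or when the source is not a file.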
using (TextReader rdr = new StreamReader(fullFilePath))
{
    string line;
    while ((line = rdr.ReadLine()) != null)
    {
        // use line here
    }
}
Set the variable "fullFilePath" to the full path, e.g. C:\temp\myTextFile.txt
A small improvement on Pero's answer:
FileInfo txtFile = new FileInfo(@"c:\myfile.txt");
if (!txtFile.Exists)
{
    // error handling
}
using (TextReader rdr = txtFile.OpenText())
{
    // use the text file as Pero suggested
}
The FileInfo class gives you the opportunity to "do stuff" with the file before you actually start reading from it. You can also pass it around between functions as a better abstraction of the file's location than the full path string. FileInfo exposes the normalized full path through FullName (e.g. turning / into \ where appropriate) and lets you extract extra data about the file: parent directory, extension, name only, permissions, etc.
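Here is a small sketch of what that looks like in practice. The class `FileInfoDemo`, the helper `CountLines`, and the temp-directory file are assumptions made up for the example; the file is created on the spot so the code runs anywhere.

```csharp
using System;
using System.IO;

public static class FileInfoDemo
{
    // Takes a FileInfo rather than a raw path string.
    public static int CountLines(FileInfo file)
    {
        int count = 0;
        using (TextReader rdr = file.OpenText())
        {
            while (rdr.ReadLine() != null)
                count++;
        }
        return count;
    }

    public static void Main()
    {
        // Hypothetical file in the temp directory, written here for the demo.
        var file = new FileInfo(Path.Combine(Path.GetTempPath(), "myTextFile.txt"));
        File.WriteAllText(file.FullName, "first\nsecond\n");
        file.Refresh();   // re-read cached state such as Exists and Length

        Console.WriteLine(file.FullName);        // full path
        Console.WriteLine(file.Name);            // myTextFile.txt
        Console.WriteLine(file.Extension);       // .txt
        Console.WriteLine(file.DirectoryName);   // parent directory
        Console.WriteLine(CountLines(file));     // 2
    }
}
```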
To begin with, make sure that you import the following namespaces (System.Collections is needed for ArrayList):
using System.Collections;
using System.Data;
using System.IO;
using System.Text.RegularExpressions;
Next, we build a function that parses any CSV input string into a DataTable:
public DataTable ParseCSV(string inputString) {
    DataTable dt = new DataTable();

    // declare the Regular Expression that will match versus the input string;
    // the named groups "field" and "rowbreak" are read back below
    Regex re = new Regex("((?<field>[^\",\r\n]+)|\"(?<field>([^\"]|\"\")+)\")(,|(?<rowbreak>\r\n|\n|$))");

    ArrayList colArray = new ArrayList();
    ArrayList rowArray = new ArrayList();

    int colCount = 0;
    int maxColCount = 0;
    string rowbreak = "";
    string field = "";

    MatchCollection mc = re.Matches(inputString);

    foreach (Match m in mc) {
        // retrieve the field and replace two double-quotes with a single double-quote
        field = m.Result("${field}").Replace("\"\"", "\"");
        rowbreak = m.Result("${rowbreak}");

        if (field.Length > 0) {
            colArray.Add(field);
            colCount++;
        }

        if (rowbreak.Length > 0) {
            // add the column array to the row Array List
            rowArray.Add(colArray.ToArray());
            // create a new Array List to hold the field values
            colArray = new ArrayList();
            if (colCount > maxColCount)
                maxColCount = colCount;
            colCount = 0;
        }
    }

    if (rowbreak.Length == 0) {
        // this is executed when the last line doesn't end with a line break
        rowArray.Add(colArray.ToArray());
        if (colCount > maxColCount)
            maxColCount = colCount;
    }

    // create the columns for the table
    for (int i = 0; i < maxColCount; i++)
        dt.Columns.Add(String.Format("col{0:000}", i));

    // convert the row Array List into an Array object for easier access
    Array ra = rowArray.ToArray();
    for (int i = 0; i < ra.Length; i++) {
        // create a new DataRow
        DataRow dr = dt.NewRow();
        // convert the column Array List into an Array object for easier access
        Array ca = (Array)(ra.GetValue(i));
        // add each field into the new DataRow
        for (int j = 0; j < ca.Length; j++)
            dr[j] = ca.GetValue(j);
        // add the new DataRow to the DataTable
        dt.Rows.Add(dr);
    }

    // in case no data was parsed, create a single column
    if (dt.Columns.Count == 0)
        dt.Columns.Add("NoData");

    return dt;
}
Now that we have a parser for converting a string into a DataTable, all we need is a function that reads the content of a CSV file and passes it to our ParseCSV function:
public DataTable ParseCSVFile(string path) {
    string inputString = "";
    // check that the file exists before opening it
    if (File.Exists(path)) {
        // the using block closes the reader even if reading throws
        using (StreamReader sr = new StreamReader(path)) {
            inputString = sr.ReadToEnd();
        }
    }
    return ParseCSV(inputString);
}
And now you can easily fill a DataGrid with data from the CSV file:
protected System.Web.UI.WebControls.DataGrid DataGrid1;
private void Page_Load(object sender, System.EventArgs e) {
    // call the parser
    DataTable dt = ParseCSVFile(Server.MapPath("./demo.csv"));
    // bind the resulting DataTable to a DataGrid Web Control
    DataGrid1.DataSource = dt;
    DataGrid1.DataBind();
}
Congratulations! You are now able to parse CSV into a DataTable. Good luck with your programming.
Sanjay Manju Suman www.wix.com/sanjaysumantera/sanju