I have a text file that looks like this:

1,Smith, 249.24, 6/10/2010
2,Johnson, 1332.23, 6/11/2010
3,Woods, 2214.22, 6/11/2010
1,Smith, 219.24, 6/11/2010

I need to be able to find the balance for a client on a given date.

I'm wondering if I should:

A. Start from the end and read each line into an Array, one at a time. Check the last name index to see if it is the client we're looking for. Then, display the balance index of the first match.

or

B. Use RegEx to find a match and display it.

I don't have much experience with RegEx, but I'll learn it if it's a no-brainer in a situation like this.
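
For comparison, here is a minimal sketch of both options side by side. The class and method names (BalanceLookup, BySplit, ByRegex) are made up for illustration, the date is compared as plain text exactly as it appears in the file, and the regular expression assumes balances are plain decimal numbers.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class BalanceLookup
{
    // Option A: walk the lines from the end, split on commas, compare fields.
    public static decimal? BySplit(string[] lines, string name, string date)
    {
        for (int i = lines.Length - 1; i >= 0; i--)
        {
            string[] f = lines[i].Split(',');
            if (f.Length == 4 && f[1].Trim() == name && f[3].Trim() == date)
                return decimal.Parse(f[2].Trim(), CultureInfo.InvariantCulture);
        }
        return null; // no matching row
    }

    // Option B: the same backwards scan, matching each line against one pattern.
    public static decimal? ByRegex(string[] lines, string name, string date)
    {
        var pattern = new Regex(
            @"^\s*\d+\s*,\s*" + Regex.Escape(name) +
            @"\s*,\s*(-?\d+(\.\d+)?)\s*,\s*" + Regex.Escape(date) + @"\s*$");

        for (int i = lines.Length - 1; i >= 0; i--)
        {
            Match m = pattern.Match(lines[i]);
            if (m.Success)
                return decimal.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
        }
        return null; // no matching row
    }
}
```

Both methods return the balance from the last matching row, or null when nothing matches. For a fixed four-column layout like this one, the split version is arguably the easier of the two to read and change.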

+1  A: 

If you're just reading it, I'd consider reading the whole file into memory with StreamReader.ReadToEnd and treating it as one long string to search through. When you find a record you want to look at, look for the previous and next line breaks, and you have the transaction row you want.

If it's on a server or the file can be refreshed all the time this might not be a good solution though.
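
A rough sketch of that idea, assuming the client name always appears between two commas as in the sample rows. WholeFileSearch and FindLastRow are hypothetical names, not part of any library.

```csharp
using System;

public static class WholeFileSearch
{
    // Returns the last row in 'text' that contains ",name,", or null if none does.
    public static string FindLastRow(string text, string name)
    {
        string needle = "," + name + ",";
        int hit = text.LastIndexOf(needle, StringComparison.Ordinal);
        if (hit < 0)
            return null;

        // Previous line break (or start of text) and next line break (or end of text).
        int start = text.LastIndexOf('\n', hit) + 1;
        int end = text.IndexOf('\n', hit);
        if (end < 0)
            end = text.Length;

        // Drop the '\r' left over from Windows-style line endings.
        return text.Substring(start, end - start).TrimEnd('\r');
    }
}
```

The text argument would be whatever StreamReader.ReadToEnd (or File.ReadAllText) returned. The row that comes back still has to be split to get at the balance.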

ho1
+2  A: 

I think the cleanest way is to load the entire file into an array of custom objects and work with that. For 3 MB of data, this won't be a problem. If you wanted to do a completely different search later, you could reuse most of the code. I would do it this way:

class Record
{
  public int Id { get; protected set; }
  public string Name { get; protected set; }
  public decimal Balance { get; protected set; }
  public DateTime Date { get; protected set; }

  public Record (int id, string name, decimal balance, DateTime date)
  {
    Id = id;
    Name = name;
    Balance = balance;
    Date = date;
  }
}

…

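// Parenthesize the whole query so ToArray() applies to it, not to the new Record.
Record[] records = (from line in File.ReadAllLines(filename)
                    let fields = line.Split(',')
                    select new Record(
                      int.Parse(fields[0]),
                      fields[1].Trim(),
                      decimal.Parse(fields[2]),
                      DateTime.Parse(fields[3])
                    )).ToArray();

// Single throws unless exactly one record matches.
Record wantedRecord = records.Single(r => r.Name == clientName && r.Date == givenDate);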
svick
+2  A: 

This looks like a pretty standard CSV type layout, which is easy enough to process. You can actually do it with ADO.Net and the Jet provider, but I think it is probably easier in the long run to process it yourself.

So first off, you want to process the actual text data. Assuming each record is separated by a newline, you can use the ReadLine method to easily get each record:

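// The @ prefix keeps the backslashes in the path from being read as escapes.
using (var reader = new StreamReader(@"C:\Path\To\file.txt"))
{
    string line;
    // ReadLine returns null at the end of the file.
    while ((line = reader.ReadLine()) != null)
    {
        if (line.Length == 0)
            continue; // skip blank lines instead of stopping at them
        // Process Line
    }
}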

And then to process each line, you can split the string on comma, and store the values into a data structure. So if you use a data structure like this:

public class MyData
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Balance { get; set; }
    public DateTime Date { get; set; }
}

And you can process the line data with a method like this:

public MyData GetRecord(string line)
{
    var fields = line.Split(',');
    return new MyData()
    {
        Id = int.Parse(fields[0]),
        Name = fields[1],
        Balance = decimal.Parse(fields[2]),
        Date = DateTime.Parse(fields[3])
    };
}

Now, this is the simplest example, and doesn't account for cases where the fields may be empty, in which case you would either need to support NULL for those fields (using nullable types int?, decimal? and DateTime?), or define some default value that would be assigned to those values.
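
As a sketch of the nullable-types variant, something like the following would leave a property null when its field is missing or unparseable. NullableRecord and LenientParser are hypothetical names, and the "M/d/yyyy" date format (month first) is an assumption about the file.

```csharp
using System;
using System.Globalization;

public class NullableRecord
{
    public int? Id { get; set; }
    public string Name { get; set; }
    public decimal? Balance { get; set; }
    public DateTime? Date { get; set; }
}

public static class LenientParser
{
    public static NullableRecord Parse(string line)
    {
        var fields = line.Split(',');
        var record = new NullableRecord();
        var inv = CultureInfo.InvariantCulture;

        int id;
        if (fields.Length > 0 &&
            int.TryParse(fields[0].Trim(), NumberStyles.Integer, inv, out id))
            record.Id = id;

        // An empty or whitespace-only name stays null.
        if (fields.Length > 1 && fields[1].Trim().Length > 0)
            record.Name = fields[1].Trim();

        decimal balance;
        if (fields.Length > 2 &&
            decimal.TryParse(fields[2].Trim(), NumberStyles.Number, inv, out balance))
            record.Balance = balance;

        // Assumed format: month/day/year, as in 6/10/2010.
        DateTime date;
        if (fields.Length > 3 &&
            DateTime.TryParseExact(fields[3].Trim(), "M/d/yyyy", inv,
                                   DateTimeStyles.None, out date))
            record.Date = date;

        return record;
    }
}
```

Whether a missing balance should be null or some default such as 0 depends on what the later calculations expect, so treat this as one possible choice.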

So once you have that you can store the collection of MyData objects in a list, and easily perform calculations based on that. So given your example of finding the balance on a given date you could do something like:

var data = customerDataList.First(d => d.Name == customerNameImLookingFor 
                                    && d.Date == dateImLookingFor);

Where customerDataList is the collection of MyData objects read from the file, customerNameImLookingFor is a variable containing the customer's name, and dateImLookingFor is a variable containing the date.

I've used this technique to process data in text files in the past for files ranging from a couple records, to tens of thousands of records, and it works pretty well.

ckramer
CSV has some tricky details (particularly when handling characters that are also metacharacters) so you're better off not writing your own parser. “Easier”? Hardly. “Easier to do badly and then come a cropper in production” is more likely.
Donal Fellows
It all depends on what your input is, and what level of control you have. The trickiest format I've ever run into in the real world is a case when there was a comma embedded in the field, in which case there are double quotes around the field, which is again easy to handle. If you are in a situation where you are getting data in a variety of formats, then you may be better off finding a CSV parser that does what you need. I've found few situations where the ADO.Net Jet provider didn't end up being more brittle and more error prone than doing a simple parse myself.
ckramer
I disagree with this answer, this isn't the right approach.
Pierreten
+1  A: 

If it's all well-formatted CSV like this, then I'd use something like the Microsoft.VisualBasic.FileIO.TextFieldParser class or the Fast CSV class over on Code Project to read it all in.
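
A minimal sketch of the TextFieldParser route. CsvRows and ReadAll are made-up names, and the project needs a reference to the Microsoft.VisualBasic assembly.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvRows
{
    // Reads every record from 'source' and returns its fields as strings.
    public static List<string[]> ReadAll(TextReader source)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(source))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // allows "Woods, Jr." as one field
            parser.TrimWhiteSpace = true;            // drops the space after each comma

            while (!parser.EndOfData)
                rows.Add(parser.ReadFields());
        }
        return rows;
    }
}
```

Passing a StreamReader opened on the file (or a StringReader for testing) gives back one string array per row, which can then be converted with int.Parse, decimal.Parse and DateTime.Parse as in the other answers.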

The data type is a little tricky because I imagine not every client has a record for every day. That means you can't just have a nested dictionary for your lookups. Instead, you want to "index" by name first and then date, but the form of the date record is a little different. I think I'd go for something like this as I read in each record:

Dictionary<string, SortedList<DateTime, double>>
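
One way that structure might be used, as a sketch only: BalanceIndex, Add and BalanceOn are hypothetical names, and "balance on a given date" is taken here to mean the most recent entry on or before that date.

```csharp
using System;
using System.Collections.Generic;

public class BalanceIndex
{
    private readonly Dictionary<string, SortedList<DateTime, double>> byName =
        new Dictionary<string, SortedList<DateTime, double>>();

    public void Add(string name, DateTime date, double balance)
    {
        SortedList<DateTime, double> history;
        if (!byName.TryGetValue(name, out history))
        {
            history = new SortedList<DateTime, double>();
            byName[name] = history;
        }
        history[date] = balance; // a later row for the same day overwrites the earlier one
    }

    // Most recent balance on or before 'date', or null if there is none.
    public double? BalanceOn(string name, DateTime date)
    {
        SortedList<DateTime, double> history;
        if (!byName.TryGetValue(name, out history))
            return null;

        // Binary search over the sorted keys for the last key <= date.
        IList<DateTime> keys = history.Keys;
        int lo = 0, hi = keys.Count - 1, found = -1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (keys[mid] <= date) { found = mid; lo = mid + 1; }
            else hi = mid - 1;
        }
        return found < 0 ? (double?)null : history.Values[found];
    }
}
```

Because SortedList keeps its keys in order, a client with no row for the requested day still gets an answer from the closest earlier day.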
Joel Coehoorn
+1  A: 

Hey, hey, hey! Why not do it with this great project on Code Project, LINQ to CSV? Way cool, rock solid.

almog.ori
+1  A: 

I would recommend using the FileHelpers open source project: http://filehelpers.sourceforge.net/

Piece of cake:

Define your class:

[DelimitedRecord(",")]
public class Customer
{
    public int CustId;

    public string Name;

    public decimal Balance;

    // The sample file writes dates like 6/10/2010; month-first order is assumed here.
    [FieldConverter(ConverterKind.Date, "M/d/yyyy")]
    public DateTime AddedDate;

}   

Use it:

FileHelperAsyncEngine engine = new FileHelperAsyncEngine(typeof(Customer));

// Read
engine.BeginReadFile("TestIn.txt");

// The engine is IEnumerable 
foreach(Customer cust in engine)
{
   // your code here
   Console.WriteLine(cust.Name);

   // your condition >> add balance
}

engine.Close();
bertelmonster2k