What is the BEST way to convert this :

FirstName,LastName,Title,BirthDate,HireDate,City,Region
Nancy,Davolio,Sales Representative,1948-12-08,1992-05-01,Seattle,WA
Andrew,Fuller,"Vice President, Sales",1952-02-19,1992-08-14,Tacoma,WA
Janet,Leverling,Sales Representative,1963-08-30,1992-04-01,Kirkland,WA
Margaret,Peacock,Sales Representative,1937-09-19,1993-05-03,Redmond,WA
Steven,Buchanan,Sales Manager,1955-03-04,1993-10-17,London,NULL
Michael,Suyama,Sales Representative,1963-07-02,1993-10-17,London,NULL
Robert,King,Sales Representative,1960-05-29,1994-01-02,London,NULL
Laura,Callahan,Inside Sales Coordinator,1958-01-09,1994-03-05,Seattle,WA
Anne,Dodsworth,Sales Representative,1966-01-27,1994-11-15,London,NULL

to this :

FirstName  LastName             Title                          BirthDate   HireDate   City            Region
---------- -------------------- ------------------------------ ----------- ---------- --------------- ---------------
Nancy      Davolio              Sales Representative           1948-12-08  1992-05-01 Seattle         WA
Andrew     Fuller               Vice President, Sales          1952-02-19  1992-08-14 Tacoma          WA
Janet      Leverling            Sales Representative           1963-08-30  1992-04-01 Kirkland        WA
Margaret   Peacock              Sales Representative           1937-09-19  1993-05-03 Redmond         WA
Steven     Buchanan             Sales Manager                  1955-03-04  1993-10-17 London          NULL
Michael    Suyama               Sales Representative           1963-07-02  1993-10-17 London          NULL
Robert     King                 Sales Representative           1960-05-29  1994-01-02 London          NULL
Laura      Callahan             Inside Sales Coordinator       1958-01-09  1994-03-05 Seattle         WA
Anne       Dodsworth            Sales Representative           1966-01-27  1994-11-15 London          NULL
A: 

May I suggest you take a look at String.Split(Char[]). The rest should be trivial.

500 - Internal Server Error
I'm unconvinced the rest is trivial.
Sam Pearson
Here, I'm just trying to provide the missing piece, not to do the whole puzzle :)
500 - Internal Server Error
+3  A: 

I'd create a custom class to hold the information, then loop over each line in the CSV file, split it on the comma and fill your custom object. Then put all of them into a List or IEnumerable and bind it to a repeater / datagrid.

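using System;
using System.Collections.Generic;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Title { get; set; }
    public DateTime BirthDate { get; set; }
    public DateTime HireDate { get; set; }
    public string City { get; set; }
    public string Region { get; set; }
}

public List<Person> Parse(string csv)
{
    // Split on either newline character; RemoveEmptyEntries drops the blank
    // entries that a "\r\n" pair would otherwise produce.
    string[] lines = csv.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
    List<Person> persons = new List<Person>();

    // Start at 1 to skip the header row of column names.
    for (int i = 1; i < lines.Length; i++)
    {
        // Note: a plain Split does not handle quoted fields that contain commas.
        string[] values = lines[i].Split(',');

        Person p = new Person();

        p.FirstName = values[0];
        p.LastName = values[1];
        //.... etc etc

        persons.Add(p);
    }

    return persons;
}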
Matt
+1, This is a good approach to start with, especially if you need to validate the source file and catch possible data errors. (But I would put a `Parse` method in the `Person` class and make it responsible for the string splitting and parsing logic.)
Regent
i am not asking how to parse delimited values to an object :)
ehosca
@ehosca - Once the data is in the object, you can write a ToString() method to print out the entire line. Then a simple foreach loop handles the printing.
Matt
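A minimal sketch of the ToString() idea from the comment above, assuming the Person class from the answer. The column widths are assumptions read off the dashed header row in the question, and PersonDemo is a hypothetical name used only for illustration. The negative alignment values in the format string left-justify each field; values longer than the column are not truncated.

```csharp
using System;
using System.Globalization;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Title { get; set; }
    public DateTime BirthDate { get; set; }
    public DateTime HireDate { get; set; }
    public string City { get; set; }
    public string Region { get; set; }

    // Assumed widths (10, 20, 30, 11, 10, 15), taken from the dashed row
    // in the question. A negative alignment pads on the right.
    public override string ToString()
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "{0,-10} {1,-20} {2,-30} {3,-11:yyyy-MM-dd} {4,-10:yyyy-MM-dd} {5,-15} {6}",
            FirstName, LastName, Title, BirthDate, HireDate, City, Region);
    }
}

public static class PersonDemo
{
    public static void Main()
    {
        var nancy = new Person
        {
            FirstName = "Nancy",
            LastName = "Davolio",
            Title = "Sales Representative",
            BirthDate = new DateTime(1948, 12, 8),
            HireDate = new DateTime(1992, 5, 1),
            City = "Seattle",
            Region = "WA"
        };

        // Console.WriteLine calls ToString() on the object.
        Console.WriteLine(nancy);
    }
}
```

Hard-coded widths keep the sketch short, but they tie the output to this particular data set; computing widths from the data avoids that.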
how does your code handle parsing the Title for Mr Fuller ? :)
ehosca
@ehosca - Really it all depends on how much work you'd like to do. Are you building an import process by chance? I've done import functions for Outlook / Outlook Express before, and they both export their contact list differently (annoying!). So Outlook would supply the title of the person separately, but for Outlook Express, the title would be supplied as part of the first name. My solution was to do a small regex check to test the first name and, if positive, extract it out. I even had imports where two people were combined in one name (John and Kerry Scott). It's no easy task but not impossible. Good learning experience / challenge!
Matt
You've got a self-describing data format (columns are named on the first line) but you're hard-coding the fields in your data class. Ouch.
Jay Bazuzi
+3  A: 

This meets your requirements as stated, and uses LINQ (since your question was tagged LINQ), but is not necessarily best:

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        List<string> inputs = new List<string>
        {
            "FirstName,LastName,Title,BirthDate,HireDate,City,Region",
            "Nancy,Davolio,Sales Representative,1948-12-08,1992-05-01,Seattle,WA",
            "Andrew,Fuller,Vice President Sales,1952-02-19,1992-08-14,Tacoma,WA",
            "Janet,Leverling,Sales Representative,1963-08-30,1992-04-01,Kirkland,WA",
            "Margaret,Peacock,Sales Representative,1937-09-19,1993-05-03,Redmond,WA",
            "Steven,Buchanan,Sales Manager,1955-03-04,1993-10-17,London,NULL",
            "Michael,Suyama,Sales Representative,1963-07-02,1993-10-17,London,NULL",
            "Robert,King,Sales Representative,1960-05-29,1994-01-02,London,NULL",
            "Laura,Callahan,Inside Sales Coordinator,1958-01-09,1994-03-05,Seattle,WA",
            "Anne,Dodsworth,Sales Representative,1966-01-27,1994-11-15,London,NULL"
        };

        // TODO: These widths would presumably be configurable
        List<int> widths = new List<int> { 12, 22, 32, 13, 12, 17, 8 };

        List<string> outputs = inputs.Select(s => ToFixedWidths(s, ',', widths)).ToList();

        outputs.ForEach(s => System.Diagnostics.Debug.WriteLine(s));

        Console.ReadLine();
    }

    private static string ToFixedWidths(string s, char separator, List<int> widths)
    {
        List<string> split = s.Split(separator).ToList();

        // TODO: Error handling - what if there are more/less separators in
        // string s than we have width values?

        return string.Join(String.Empty, split.Select((ss, i) => ss.PadRight(widths[i], ' ')).ToArray());
    }
}

In a production scenario though I'd expect to see this data read into an appropriate Person class, as Matt recommended in his answer.

Richard Ev
Why don't you use `ss.PadRight(widths[i], ' ')` in your `ToFixedWidths` function?
Regent
A: 
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace StringParsingWithLinq
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            var inputs = new List<string>
                             {
                                 "FirstName,LastName,Title,BirthDate,HireDate,City,Region",
                                 "Nancy,Davolio,Sales Representative,1948-12-08,1992-05-01,Seattle,WA",
                                 "Andrew,Fuller,\"Vice President, Sales\",1952-02-19,1992-08-14,Tacoma,WA",
                                 "Janet,Leverling,Sales Representative,1963-08-30,1992-04-01,Kirkland,WA",
                                 "Margaret,Peacock,Sales Representative,1937-09-19,1993-05-03,Redmond,WA",
                                 "Steven,Buchanan,Sales Manager,1955-03-04,1993-10-17,London,NULL",
                                 "Michael,Suyama,Sales Representative,1963-07-02,1993-10-17,London,NULL",
                                 "Robert,King,Sales Representative,1960-05-29,1994-01-02,London,NULL",
                                 "Laura,Callahan,Inside Sales Coordinator,1958-01-09,1994-03-05,Seattle,WA",
                                 "Anne,Dodsworth,Sales Representative,1966-01-27,1994-11-15,London,NULL"
                             };

            Console.Write(FixedWidthHelper.ReadLines(inputs)
                              .ToFixedLengthString());
            Console.ReadLine();
        }

        #region Nested type: FixedWidthHelper

        public class FixedWidthHelper
        {
            private readonly Regex _csvRegex = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
            private readonly List<string[]> _data = new List<string[]>();
            private List<int> _fieldLen;

            public static FixedWidthHelper ReadLines(List<string> lines)
            {
                var fw = new FixedWidthHelper();
                lines.ForEach(fw.AddDelimitedLine);
                return fw;
            }

            private void AddDelimitedLine(string line)
            {
                string[] fields = _csvRegex.Split(line);

                if (_fieldLen == null)
                    _fieldLen = new List<int>(fields.Select(f => f.Length));

                for (int i = 0; i < fields.Length; i++)
                {
                    if (fields[i].Length > _fieldLen[i])
                        _fieldLen[i] = fields[i].Length;
                }

                _data.Add(fields);
            }

            public string ToFixedLengthString()
            {
                var sb = new StringBuilder();
                foreach (var list in _data)
                {
                    for (int i = 0; i < list.Length; i++)
                    {
                        sb.Append(list[i].PadRight(_fieldLen[i] + 1, ' '));
                    }
                    sb.AppendLine();
                }

                return sb.ToString();
            }
        }

        #endregion
    }
}


ehosca
A: 
Jay Bazuzi
i would like to see the implementation of the -AutoSize option :)
ehosca
+1  A: 

You have two problems here. Consider them separately and you will find a good solution more easily.

  1. Parse your CSV-format input data into a useful format.

  2. Present your data in a certain way.

Don't write your own CSV parser. The rules are a little tricky, but the format is well-known. Getting it wrong would be bad in the long run. There are existing CSV libraries in the .NET framework you could call on, but I don't know much about them. However, this problem is perfect for the new dynamic feature in C#. Here's one that looks promising: http://tonikielo.blogspot.com/2010/01/c-40-dynamic-linq-to-csvhe.html

I'm assuming that printing the data is a trivial problem and you don't need our help. If not, you'll need to give us some more information, like how you want to decide the widths of the columns.
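
As one illustration of the "don't write your own parser" point, here is a minimal sketch using `Microsoft.VisualBasic.FileIO.TextFieldParser`, a parser that ships with the .NET Framework. The answer above does not name it, so treat it as one possible choice rather than the recommended one; `CsvReading`, `ReadRows` and `CsvDemo` are hypothetical names. On the .NET Framework the project needs a reference to Microsoft.VisualBasic.dll.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvReading
{
    // Reads every record of a CSV document into an array of field values.
    public static List<string[]> ReadRows(string csvText)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Keeps a quoted value such as "Vice President, Sales" as one field.
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}

public static class CsvDemo
{
    public static void Main()
    {
        string csv =
            "FirstName,LastName,Title\n" +
            "Andrew,Fuller,\"Vice President, Sales\"\n";

        foreach (string[] row in CsvReading.ReadRows(csv))
        {
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```

The first row returned is the header, so the column names stay available to the presentation step instead of being hard-coded.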

Jay Bazuzi
i agree with all your points. dynamic looks very promising indeed. i'll definitely take that into account when we get the green light for 4.0. i came across this problem as i was writing a SQL Query Analyzer type tool that, instead of a database, allows you to write OQL queries against a GemFire distributed cache and display the results. Query Analyzer allows the output to be displayed in multiple views: in a grid, as text and so on. I guess I didn't do a good enough job formulating the question. the rule for determining the column width is: widest width of any single item in the column + 1
ehosca
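
The width rule stated in the last comment (widest item in the column, plus one) can be sketched with LINQ as below. The dashed separator under the header is modeled on the target output in the question and is an addition of this sketch; `FixedWidthFormatter` and its members are hypothetical names, and the code assumes at least one row whose first entry is the header.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class FixedWidthFormatter
{
    // Column width = length of the longest value in that column + 1.
    // Assumes rows contains at least one row.
    public static int[] ColumnWidths(IList<string[]> rows)
    {
        int columnCount = rows.Max(r => r.Length);
        return Enumerable.Range(0, columnCount)
            .Select(c => rows.Where(r => c < r.Length).Max(r => r[c].Length) + 1)
            .ToArray();
    }

    // Pads every field to its column width and writes a dashed
    // separator line directly under the first (header) row.
    public static string Format(IList<string[]> rows)
    {
        int[] widths = ColumnWidths(rows);
        var sb = new StringBuilder();

        for (int r = 0; r < rows.Count; r++)
        {
            AppendRow(sb, rows[r], widths);
            if (r == 0)
            {
                string[] dashes = widths.Select(w => new string('-', w - 1)).ToArray();
                AppendRow(sb, dashes, widths);
            }
        }
        return sb.ToString();
    }

    private static void AppendRow(StringBuilder sb, string[] fields, int[] widths)
    {
        for (int c = 0; c < fields.Length; c++)
        {
            sb.Append(fields[c].PadRight(widths[c]));
        }
        sb.AppendLine();
    }
}

public static class FormatDemo
{
    public static void Main()
    {
        var rows = new List<string[]>
        {
            new[] { "FirstName", "City" },
            new[] { "Nancy", "Seattle" },
            new[] { "Margaret", "Redmond" }
        };
        Console.Write(FixedWidthFormatter.Format(rows));
    }
}
```

Because the widths come from the data, the rows can be produced by any CSV parser that yields string arrays.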