I have an unusual situation where I have to read data horizontally. I am receiving a CSV file whose data is laid out in a horizontal format, like this:

CompanyName,RunDate,10/27/2010,11/12/2010,11/27/2010,12/13/2010,12/27/2010....

All the dates shown after RunDate are values for the run date field, and I have to update that field for that company in my system. The number of date values is not fixed; there can be anywhere from one to n of them. So I need to read all those values and update them in the system. I am writing this in Java.

+5  A: 

Split the line on "," and parse the values, using a List to collect them.

As others have suggested, you can use opencsv for the splitting and parsing.
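
A minimal sketch of the split-and-collect idea, assuming the first two fields are always the company name and the literal RunDate label, and that no field contains a quoted comma. The class name RunDateSplit and the company "Acme" are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RunDateSplit {
    // Returns every field after the first two (company name and the "RunDate" label).
    // Assumption: fields never contain commas, so a plain split is enough here.
    static List<String> runDates(String line) {
        String[] fields = line.split(",");
        if (fields.length <= 2) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(fields).subList(2, fields.length));
    }

    public static void main(String[] args) {
        String line = "Acme,RunDate,10/27/2010,11/12/2010,11/27/2010";
        System.out.println(runDates(line)); // prints [10/27/2010, 11/12/2010, 11/27/2010]
    }
}
```

Because the result is a List, it does not matter whether a row carries one date or many.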

org.life.java
-1 this will not handle fields with commas in the field, which is perfectly valid CSV. Splitting on "," works in a simplistic case, but only occasionally in a realistic one.
Dave DeLong
@Dave DeLong can you elaborate your comment
org.life.java
@org.life.java Consider this csv line: `"Hello,",my,name,is,Dave`. It has 5 fields: `Hello,` and `my` and `name` and `is` and `Dave`. Your suggestion would yield 6: `"Hello`, `"`, `my`, `name`, `is`, and `Dave`
Dave DeLong
@Dave DeLong, yeah, that's true, but I am not suggesting blindly writing code for each of the three statements. I have just given him/her the basic idea.
org.life.java
+1  A: 

You start by reading the entire line into a String. Then you use the String.split(...) function to get all the tokens on the line where the delimiter you use is ",". (or is it "\," when you use a regex?)

camickr
You can just call `String.split(",")`.
Christian Mann
Thanks, I'll try to remember that, I rarely use a regex.
camickr
+1  A: 

In order to get each value one at a time, use a StringTokenizer. Construct it with StringTokenizer(str, ","). (Not recommended)

Use the split() method of the string class, which loads all of the tokens into an array.

Use the DateFormat class to parse each date -- specifically DateFormat.parse(String).
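
A rough sketch of the DateFormat step, assuming the dates are always month/day/year as in the question's sample (10/27/2010). The pattern "MM/dd/yyyy" and the class name RunDateParse are assumptions, not something the question confirms.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class RunDateParse {
    // Parses one date token. Assumption: US-style month/day/year order.
    static Date parseRunDate(String text) throws ParseException {
        DateFormat format = new SimpleDateFormat("MM/dd/yyyy");
        format.setLenient(false); // reject impossible dates such as 13/45/2010
        return format.parse(text.trim());
    }

    public static void main(String[] args) throws ParseException {
        String[] tokens = "10/27/2010,11/12/2010,1/1/2011".split(",");
        for (String token : tokens) {
            System.out.println(parseRunDate(token));
        }
    }
}
```

A new SimpleDateFormat is created per call because the class is not safe to share between threads; in single-threaded code one instance could be reused.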

Christian Mann
From the `StringTokenizer` api: StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
Qwerky
:embarrassed: I probably should look up the documentation before I recommend an answer. <Edits answer to reflect new knowledge>
Christian Mann
@Qwerky - I hate that they threw away a perfectly good class - but you're correct.
KevinDTimm
My guess is that it's not thread-safe, but I didn't research that at all.
Christian Mann
`StringTokenizer` is not "deprecated", it is just "not recommended". There is a big difference IMHO.
Grodriguez
@G - good point - and upon reading some comments online I can see why split is better. But, I still have to train my 10 years of java fingers to type String.split() instead of StringTokenizer(). Not so easy to do.
KevinDTimm
All the kids nowadays tend to use `Scanner`. I guess Sun lamented the loss of `StringTokenizer` so fed it some steroids and brought it back with a new name!
Qwerky
+2  A: 

use java.util.Scanner - you can call useDelimiter() to make the comma your delimiter, and read new tokens with next(). The Scanner can be created directly from your file or a string read from the file.
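
Roughly, the Scanner approach could look like the sketch below. It reads from a String for brevity; the class name RunDateScanner and the sample line are invented for illustration, and it has the same limitation as a plain split when a field contains a quoted comma.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class RunDateScanner {
    // Collects every comma-delimited token from one line.
    static List<String> tokens(String line) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(line)) {
            scanner.useDelimiter(",");
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("Acme,RunDate,10/27/2010,11/12/2010"));
        // prints [Acme, RunDate, 10/27/2010, 11/12/2010]
    }
}
```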

RD
+9  A: 

Libraries like OpenCSV handle all the weird cases for CSV files (new lines, delimiting, etc).

Joshua
@Joshua: Even though there are no "weird" cases presented, using a library will (1) reduce the chance of errors in parsing; (2) provide more features; (3) yield an extensible solution; and (4) readily integrate parsing of future CSV files (if required).
Dave Jarvis
+1 use a library that someone else has written. Why repeat work that someone has already done for you?
Dave DeLong
+2  A: 

A CSV file is a \n-terminated file in which each column can be separated by either:

  • Comma or
  • Tabs \t

I suggest that you have a BufferedReader that reads the CSV file and use the readLine() method to read the row.

From each row, use String.split(arg) where arg will be your comma or tab \t to have an array of columns....from there, you know what to do.
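
A sketch of the readLine-then-split loop, assuming no field contains the delimiter. It reads from a StringReader so it runs as-is; for a real file you would pass a FileReader instead. The names RowReader and readRows and the sample companies are made up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class RowReader {
    // Reads every non-empty line and splits it on the given delimiter.
    // Note: String.split treats the delimiter as a regex; "," and "\t" both work unchanged.
    static List<String[]> readRows(Reader source, String delimiter) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                rows.add(line.split(delimiter));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        String data = "Acme,RunDate,10/27/2010,11/12/2010\nGlobex,RunDate,12/13/2010\n";
        for (String[] row : readRows(new StringReader(data), ",")) {
            System.out.println(row[0] + " has " + (row.length - 2) + " run date(s)");
        }
    }
}
```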

The Elite Gentleman
The `C` in `CSV` stands for comma - google for `TSV` for "Tab Separated Values"
Stephen P
@Stephen P, indeed, but what stops anyone from putting tabs in a CSV file?
The Elite Gentleman
+1  A: 

By far the most useful page on the subject of CSV parsing I've ever found is the following:

http://secretgeek.net/csv_trouble.asp

Basically, get an established library to do it for you, because csv parsing is deceptively tricky.

John
Not at all tricky....it's a simple comma or tab delimited file.
The Elite Gentleman
@The Elite - didn't read the posted article, did you?
KevinDTimm
I did now....if Marcos could do it, so could anyone...*sarcastic laugh*
The Elite Gentleman
+3  A: 

String.split(",") isn't likely to work.
It will split fields that have embedded commas ("Foo, Inc.") even though they are a single field in the CSV line.

What if the company name is:
        Company, Inc.
or worse:
        Joe's "Good, Fast, and Cheap" Food


According to Wikipedia:    (http://en.wikipedia.org/wiki/Comma-separated_values)

Fields with embedded commas must be enclosed within double-quote characters.

   1997,Ford,E350,"Super, luxurious truck"

Fields with embedded double-quote characters must be enclosed within double-quote characters, and each of the embedded double-quote characters must be represented by a pair of double-quote characters.

   1997,Ford,E350,"Super ""luxurious"" truck"


Even worse, quoted fields may have embedded line breaks (newlines; "\n"):

Fields with embedded line breaks must be enclosed within double-quote characters.

   1997,Ford,E350,"Go get one now  
   they are going fast"



This demonstrates the problem with String.split(",") parsing commas:

The CSV line is:

a,b,c,"Company, Inc.", d, e,"Joe's ""Good, Fast, and Cheap"" Food", f, 10/11/2010,1/1/2011, g, h, i


// Test String.split(",") against CSV with
// embedded commas and embedded double-quotes in
// quoted text strings:
//
// Company names are:
//        Company, Inc.
//        Joe's "Good, Fast, and Cheap" Food
//
// Which should be formatted in a CSV file as:
//        "Company, Inc."
//        "Joe's ""Good, Fast, and Cheap"" Food"
//
//
public class TestSplit {
    // Splits s on splitchar (a regex) and prints one segment per line.
    public static void testSplit(String s, String splitchar) {
        String[] split_s    = s.split(splitchar);

        for (String seg : split_s) {
            System.out.println(seg);
        }
    }


    public static void main(String[] args) {
        String csvLine = "a,b,c,\"Company, Inc.\", d,"
                            + " e,\"Joe's \"\"Good, Fast,"
                            + " and Cheap\"\" Food\", f,"
                            + " 10/11/2010,1/1/2011, g, h, i";

        System.out.println("CSV line is:\n" + csvLine + "\n\n");
        testSplit(csvLine, ",");
    }
}


Produces the following:


D:\projects\TestSplit>javac TestSplit.java

D:\projects\TestSplit>java  TestSplit
CSV line is:
a,b,c,"Company, Inc.", d, e,"Joe's ""Good, Fast, and Cheap"" Food", f, 10/11/2010,1/1/2011, g, h, i


a
b
c
"Company
 Inc."
 d
 e
"Joe's ""Good
 Fast
 and Cheap"" Food"
 f
 10/11/2010
1/1/2011
 g
 h
 i

D:\projects\TestSplit>



Where that CSV line should be parsed as:


a
b
c
"Company, Inc."
 d
 e
"Joe's ""Good, Fast, and Cheap"" Food"
 f
 10/11/2010
1/1/2011
 g
 h
 i
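
For comparison, here is a minimal sketch of a splitter that only breaks on commas outside double quotes, which produces the listing above (quotes left in place). It is an illustration under simplifying assumptions, not a full CSV parser: it works on a single line, so it does not handle embedded line breaks, and it does not strip or unescape the quotes. The class name QuoteAwareSplit is made up; an established library remains the safer choice.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareSplit {
    // Splits on commas that are not inside a double-quoted section.
    // The raw field text, including its quotes, is kept unchanged.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                // An escaped quote ("") toggles twice, leaving the state unchanged.
                inQuotes = !inQuotes;
                current.append(c);
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String csvLine = "a,b,c,\"Company, Inc.\", d, e,"
                + "\"Joe's \"\"Good, Fast, and Cheap\"\" Food\", f,"
                + " 10/11/2010,1/1/2011, g, h, i";
        for (String field : split(csvLine)) {
            System.out.println(field);
        }
    }
}
```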
Alan Jay Weiner
Nice to provide demonstration code.
Stephen P
thanks! glad to do so!
Alan Jay Weiner