I'm trying to find a way to count the number of lines in a .csv file using JavaScript. Are there any useful tips or resources someone can point me to?

A: 

To count the number of lines in a document (once you have it as a string in JavaScript), simply do:

var lines = csvString.split("\n").length;
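
One caveat with the one-liner: a file that ends with a newline produces a trailing empty string, so the count comes out one too high, and files with Windows line endings use \r\n. A minimal sketch that handles both cases follows; the helper name countLines is made up for illustration.

```javascript
// Hypothetical helper: counts physical lines in a string.
function countLines(text) {
    if (text === "") return 0;
    // Treat \r\n, \n and a lone \r as line terminators.
    var lines = text.split(/\r\n|\n|\r/);
    // A trailing terminator leaves one empty element at the end; drop it.
    if (lines[lines.length - 1] === "") lines.pop();
    return lines.length;
}
```

Empty lines in the middle of the file are still counted, since they are real lines.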
Robusto
Ah, cool. I didn't really think of that solution; I'll try it out. Thanks!
alvincrespo
+2  A: 

It depends what you mean by a line. For a simple count of newlines, Robusto's answer is fine.

If you want to know how many rows of CSV data that represents, things may be a little more difficult, as a CSV field may itself contain a newline:

field1,"field
two",field3

...is one row, at least in CSV as defined by RFC4180. (It's one of the aggravating features of CSV that there are so many non-standard variants; the RFC itself was very late to the game.)

So if you need to cope with that case you'll have to essentially parse each field.

A field can be raw, or (necessarily, if it contains \n or ,) quoted, with a literal " inside a quoted field written as a doubled quote (""). So a regex for one field would be:

"([^"]|"")*"|[^,\n]*

and so for a whole row (assuming it is not empty):

("([^"]|"")*"|[^,\n]*)(,("([^"]|"")*"|[^,\n]*))*\n

and to get the number of those:

// match() returns null when nothing matches, so fall back to an empty array
var rowsn= (csv.match(/(?:"(?:[^"]|"")*"|[^,\n]*)(?:,(?:"(?:[^"]|"")*"|[^,\n]*))*\n/g) || []).length;
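
Note that the row pattern above only matches rows that end in \n, so a final row without a terminating newline is not counted. A sketch of a wrapper that works around this, assuming the same pattern (the name countRowsRegex is hypothetical):

```javascript
// Hypothetical helper wrapping the row pattern from the answer.
function countRowsRegex(csv) {
    // The pattern requires a terminating \n, so add one if the last row lacks it.
    if (csv !== "" && csv.charAt(csv.length - 1) !== "\n") csv += "\n";
    var rowPattern = /(?:"(?:[^"]|"")*"|[^,\n]*)(?:,(?:"(?:[^"]|"")*"|[^,\n]*))*\n/g;
    // match() returns null rather than [] when there is no match.
    return (csv.match(rowPattern) || []).length;
}

// The two-line example from above is a single row:
var exampleRows = countRowsRegex('field1,"field\ntwo",field3\n');
```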

If you are lucky enough to be dealing with a variant of CSV that complies with RFC4180's recommendation that there are no " characters in unquoted fields, you can make this a bit more readable. Split on newlines as before and count the number of " characters in each line. If it's an even number, you have a complete line; if it's an odd number you've got a split.

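var lines= csv.split('\n');
// Walk backwards; stop before index 0, which has no previous line to join.
for (var i= lines.length; i-->1;) {
    // match() returns null for a line with no quotes, so fall back to [].
    var quotes= (lines[i].match(/"/g) || []).length;
    // An odd count means this line is part of a quoted field that began on
    // an earlier line: rejoin it, restoring the newline that split() removed.
    if (quotes%2===1)
        lines.splice(i-1, 2, lines[i-1]+'\n'+lines[i]);
}
var rowsn= lines.length;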
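
The same quote-parity idea can also be done in a single pass without splitting: toggle an in-quotes flag on every " and count only the newlines seen outside quotes. An escaped "" toggles the flag twice, which cancels out. This is a sketch under the same assumption (no stray " in unquoted fields), and the name countRowsScan is invented for the example.

```javascript
// Hypothetical helper: counts CSV rows by scanning character by character.
function countRowsScan(csv) {
    var rows = 0;
    var inQuotes = false;
    for (var i = 0; i < csv.length; i++) {
        var ch = csv.charAt(i);
        if (ch === '"') {
            // Every quote flips the state; a doubled "" flips it back again.
            inQuotes = !inQuotes;
        } else if (ch === "\n" && !inQuotes) {
            // Only a newline outside a quoted field ends a row.
            rows++;
        }
    }
    // Count a final row that has no terminating newline.
    if (csv.length > 0 && csv.charAt(csv.length - 1) !== "\n") rows++;
    return rows;
}
```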
bobince
A: 

You can use '.' to match everything on a line except the newline at the end. Note that this counts physical lines, so it does not account for newlines inside quoted fields. Use the 'm' (multiline) flag as well as 'g' (global).

function getLines(s){
    return s.match(/^(.*)$/mg);
}

alert(getLines(string).length)

If you don't mind skipping empty lines, it is simpler, but sometimes you need to keep them for spacing.

function getLines(s){ return s.match(/(.+)/g); }
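
One thing to watch with the shorter version: match() returns null instead of an empty array when the string has no non-empty lines, so reading .length on the result would throw. A small sketch with a fallback (the name getNonEmptyLines is hypothetical):

```javascript
// Hypothetical variant: returns the non-empty lines, or [] if there are none.
function getNonEmptyLines(s) {
    // '.' does not match newline characters, so each match is one non-empty line.
    return s.match(/.+/g) || [];
}
```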

kennebec