I'm trying to think of a way to count the number of lines in a .csv file using JavaScript. Are there any useful tips or resources someone can direct me to?
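To count the number of lines in a document (once you have it as a string in JavaScript), simply do the following. Note that if the string ends with a newline, the result includes one extra, empty final element: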
var lines = csvString.split("\n").length;
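If the trailing newline or Windows-style line endings matter, a small wrapper can account for them. This is only a sketch: the name countLines is made up for illustration, and it assumes an empty string should count as zero lines.

```javascript
// Hypothetical helper: counts physical lines in a string.
function countLines(text) {
  if (text.length === 0) return 0;
  // Treat \r\n, a lone \r, and \n all as line terminators.
  var parts = text.split(/\r\n|\r|\n/);
  // A trailing terminator leaves an empty last element; do not count it.
  if (parts[parts.length - 1] === "") parts.pop();
  return parts.length;
}

console.log(countLines("a,b\nc,d\n")); // 2
```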
Depends what you mean by a line. For simple number of newlines, Robusto's answer is fine.
If you want to know how many rows of CSV data that represents, things may be a little more difficult, as a CSV field may itself contain a newline:
field1,"field
two",field3
...is one row, at least in CSV as defined by RFC4180. (It's one of the aggravating features of CSV that there are so many non-standard variants; the RFC itself was very late to the game.)
So if you need to cope with that case you'll have to essentially parse each field.
A field can be raw, or (necessarily if it contains \n or ,) quoted, with a literal " inside a quoted field written as two double quotes (""). So a regex for one field would be:
"([^"]|"")*"|[^,\n]*
and so for a whole row (assuming it is not empty):
("([^"]|"")*"|[^,\n]*)(,("([^"]|"")*"|[^,\n]*))*\n
and to get the number of those:
var rowsn = (csv.match(/(?:"(?:[^"]|"")*"|[^,\n]*)(?:,(?:"(?:[^"]|"")*"|[^,\n]*))*\n/g) || []).length;
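Because the row pattern requires every row to end in \n, a final row without a trailing newline would be missed. One way to handle that is sketched below. The names CSV_ROW and countCsvRows are hypothetical, and the sketch assumes that blank lines should count as rows (the pattern matches them as a single empty field).

```javascript
// Hypothetical helper wrapping the row regex; the name and guards are illustrative.
var CSV_ROW = /(?:"(?:[^"]|"")*"|[^,\n]*)(?:,(?:"(?:[^"]|"")*"|[^,\n]*))*\n/g;

function countCsvRows(csv) {
  if (csv.length === 0) return 0;
  // The pattern requires each row to end in \n, so terminate the last row if needed.
  var text = csv.charAt(csv.length - 1) === "\n" ? csv : csv + "\n";
  // match() returns null when nothing matches; fall back to an empty array.
  var rows = text.match(CSV_ROW) || [];
  return rows.length;
}

console.log(countCsvRows('field1,"field\ntwo",field3\n')); // 1
```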
If you are lucky enough to be dealing with a variant of CSV that complies with RFC4180's recommendation that there are no " characters in unquoted fields, you can make this a bit more readable. Split on newlines as before and count the number of " characters in each line. If it's an even number, you have a complete line; if it's an odd number you've got a split.
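var lines = csv.split('\n');
for (var i = lines.length; i-- > 1;) {
    // match() returns null when a line contains no quotes at all
    var quotes = (lines[i].match(/"/g) || []).length;
    if (quotes % 2 === 1) {
        // odd count: this line continues a quoted field from the line above
        lines.splice(i - 1, 2, lines[i - 1] + '\n' + lines[i]);
    }
}
var rowsn = lines.length;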
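The same quote-parity idea can also be applied in a single pass over the string, without splitting or splicing. The following is a minimal sketch, not a full CSV parser: countRowsByQuoteParity is a made-up name, and it relies on the same assumption that " only appears around quoted fields or doubled inside them.

```javascript
// Hypothetical helper: counts CSV rows by tracking quote parity in one pass.
// Assumes quotes appear only around quoted fields (or doubled inside them).
function countRowsByQuoteParity(csv) {
  var rows = 0;
  var insideQuotes = false;  // true while an odd number of quotes has been seen
  var rowHasContent = false; // true once the current row has at least one character
  for (var i = 0; i < csv.length; i++) {
    var ch = csv.charAt(i);
    if (ch === '"') {
      insideQuotes = !insideQuotes;
      rowHasContent = true;
    } else if (ch === "\n" && !insideQuotes) {
      // A newline outside quotes ends the current row.
      rows++;
      rowHasContent = false;
    } else {
      rowHasContent = true;
    }
  }
  // Count a final row that is not terminated by a newline.
  if (rowHasContent) rows++;
  return rows;
}

console.log(countRowsByQuoteParity('field1,"field\ntwo",field3\n')); // 1
```

A doubled quote toggles the flag twice, so it leaves the parity unchanged, which is why no special case is needed for "" inside a quoted field.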
You can use '.' to match everything on a line except the newline at the end. Use the 'm' (multiline) flag as well as 'g' (global). Note that this counts physical lines: a newline inside a quoted field is still treated as a line break.
function getLines(s){
return s.match(/^(.*)$/mg);
}
alert(getLines(string).length)
If you don't mind skipping empty lines, it is simpler, but sometimes you need to keep them for spacing.
function getLines(s){ return s.match(/(.+)/g) || []; }