I have a large CSV file and I want to delete its first line. How is this done? I don't want to copy every line into an array, shift each one to the previous index, and then delete the first. There must be a better way.
Thank you.
Well, there are some shortcuts that you can take, but there are several things that you can't circumvent:
Depending on the encoding, a character might not map to a single byte in the file, so you have to read it as text.
You have to parse at least the first record of the file. The CSV format is not line based, even though it uses line breaks to separate records. A quoted value can also contain a line break, so you can't just read up to the first line break and take for granted that this is the end of the first record.
There is generally no way to cut a part out of the beginning of a file (you can only truncate at the end), so whatever you do you still have to rewrite the rest of the file.
So, you can parse the header (if there is one) and the first record, then you can read the rest of the file as plain text. Then you can write the rest back at the position where the first record started (or write from the start of the file and include the header).
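The steps above can be sketched roughly as follows. This is only a minimal sketch under some assumptions: the file uses an ASCII-compatible encoding such as UTF-8, fields are quoted with `"` and embedded quotes are escaped by doubling them, and the result goes to a second file rather than being written back in place. The method names `skip_first_record` and `drop_first_record` are made up for illustration. Instead of a full CSV parser it relies on the fact that a line break ends a record only when the number of quote characters seen so far is even.

```ruby
# Reads and discards the first CSV record from `io`.
# A record ends at the first line break that is outside a quoted value,
# i.e. when an even number of double quotes has been seen so far.
# Assumption: ASCII-compatible encoding and "" as the escaped quote.
def skip_first_record(io)
  quotes = 0
  while (line = io.gets)
    quotes += line.count('"')
    break if quotes.even?
  end
end

# Copies everything after the first record of src_path into dst_path.
def drop_first_record(src_path, dst_path)
  File.open(src_path, 'rb') do |inf|
    skip_first_record(inf)
    File.open(dst_path, 'wb') do |outf|
      # The rest is copied as raw bytes, without splitting it into lines.
      IO.copy_stream(inf, outf)
    end
  end
end
```

Calling `drop_first_record('original.csv', 'new.csv')` would leave the original untouched; replacing it afterwards (for example with `File.rename`) is a separate step.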
Guffa is right that a header can contain line breaks, but that's not very common, so if you're OK with ignoring that edge case, you can use:
    # Copy every line except the first into a new file.
    File.open('new.csv', 'w') do |outf|
      File.open('original.csv') do |inf|
        inf.each_line.with_index do |line, i|
          outf.write(line) unless i == 0  # skip the first line
        end
      end
    end
If this is too slow for you, let me know and we'll rewrite this to use block reading instead of going through the whole file line by line.
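For reference, a block-reading variant might look something like the sketch below. It makes the same simplifying assumption as the code above (the first line contains no embedded line breaks), and the method name `drop_first_line` and the 1 MiB default chunk size are arbitrary choices for illustration, not tuned values.

```ruby
CHUNK_SIZE = 1024 * 1024 # 1 MiB per read; an arbitrary choice

# Copies src_path to dst_path without its first line,
# reading the remainder in fixed-size blocks instead of line by line.
def drop_first_line(src_path, dst_path, chunk_size = CHUNK_SIZE)
  File.open(src_path, 'rb') do |inf|
    File.open(dst_path, 'wb') do |outf|
      inf.gets # read and discard the first line
      # IO#read with a length returns nil at end of file, which ends the loop.
      while (chunk = inf.read(chunk_size))
        outf.write(chunk)
      end
    end
  end
end
```

Only the first line is scanned for a line break; everything after it is moved as opaque blocks, so memory use stays bounded by the chunk size regardless of how large the file is.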