OK, so I'm trying to delete lines from a text file with Java. The way I currently do this is to keep track of a line number and take an index as input; the index is the line I want deleted. Each time I read a new line of data, I increment the line count. When the line count equals the index, I don't write that line to the temporary file. This works, but what if I'm working with huge files and have to worry about memory constraints? How can I do this with... file markers? For example, place the file marker on the line I want to delete, then delete that line? Or is that just too much work?
You could use NIO to delete the region of the file that corresponds to that line.
EDIT: added some hints.
By creating a FileChannel and using a Buffer, you could open the file and erase the required line by pushing over it the content that comes after.
Unfortunately, I must confess my knowledge of NIO stops approximately here...
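To make the FileChannel idea a bit more concrete, here is a minimal sketch of "pushing the following content over the line". It assumes you already know the byte offsets of the line (start inclusive, end exclusive, including its line terminator); the class name RegionEraser and the method deleteRegion are made up for illustration, not part of any library.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class RegionEraser {
    // Hypothetical helper: removes the bytes in [start, end) by shifting
    // everything after 'end' to the left, then truncating the file.
    public static void deleteRegion(Path file, long start, long end) throws IOException {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid region");
        }
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ByteBuffer buffer = ByteBuffer.allocate(8192); // fixed memory use
            long readPos = end;    // next byte to copy
            long writePos = start; // where the next copied byte goes
            // Positional reads and writes leave the channel's own position alone.
            while (channel.read(buffer, readPos) > 0) {
                readPos += buffer.position();
                buffer.flip();
                while (buffer.hasRemaining()) {
                    writePos += channel.write(buffer, writePos);
                }
                buffer.clear();
            }
            // Drop the now-duplicated tail.
            channel.truncate(writePos);
        }
    }
}
```

Only one 8 KB buffer is held in memory regardless of file size. The write position always trails the read position, so unread data is never overwritten. Note that this modifies the file in place, so an interruption halfway through leaves it in a mixed state.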
You could use a random access file. Keep a pointer to the byte you are reading and another for the byte you are writing. Fill a buffer with data and count the lines as you read it. If you have nothing to delete, reset the channel to the write pointer and output the buffer, then reset the channel to the read pointer. If you find a line to delete, output the buffer up to that point at the write index, then advance the read pointer until you find the end of the line, and then output the remainder of your buffer (refilling the buffer as necessary). Repeat for each line to be deleted. When you reach the end of the file, truncate it at the write pointer.
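A rough sketch of the two-pointer approach with java.io.RandomAccessFile might look like the following. It deletes a single line by 0-based index and treats '\n' as the line terminator; the class and method names are invented for the example, and it simplifies the description slightly by compacting each buffer before writing it back.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class InPlaceLineRemover {
    // Hypothetical helper: removes the line with the given 0-based index in place.
    public static void removeLine(String path, long indexToDelete) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            byte[] buffer = new byte[8192];
            long readPos = 0;    // next byte to read
            long writePos = 0;   // next byte to write
            long lineNumber = 0; // index of the line the current byte belongs to
            while (true) {
                raf.seek(readPos);
                int n = raf.read(buffer);
                if (n <= 0) {
                    break; // end of file
                }
                readPos += n;
                // Compact the bytes we want to keep to the front of the buffer.
                int kept = 0;
                for (int i = 0; i < n; i++) {
                    byte b = buffer[i];
                    if (lineNumber != indexToDelete) {
                        buffer[kept++] = b;
                    }
                    if (b == '\n') {
                        lineNumber++;
                    }
                }
                // Skip the write while nothing has been removed yet.
                boolean unchanged = (writePos == readPos - n) && (kept == n);
                if (!unchanged) {
                    raf.seek(writePos);
                    raf.write(buffer, 0, kept);
                }
                writePos += kept;
            }
            raf.setLength(writePos); // cut off the leftover tail
        }
    }
}
```

Because it works on raw bytes, this is safe for UTF-8 (the byte 0x0A never appears inside a multi-byte sequence), and a "\r\n" terminator is removed along with the line. To delete several lines, the comparison against indexToDelete could be replaced with a lookup in a set of indices.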
Ideally, I would use an ETL tool to perform this kind of batch work. Assuming you do not have access to such a tool, I would recommend gzipping the file first and then reading it using java.util.zip.
Here is a good tutorial on how to do it.
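In case the link is unavailable, here is a small sketch of what reading a gzipped file line by line with java.util.zip could look like, writing a new gzipped copy without the unwanted line. The names GzipLineFilter and copyWithoutLine are hypothetical, and UTF-8 text is assumed.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipLineFilter {
    // Hypothetical helper: copies a gzipped text file, skipping one 0-based line.
    public static void copyWithoutLine(Path in, Path out, long indexToDelete)
            throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                 new GZIPInputStream(Files.newInputStream(in)), StandardCharsets.UTF_8));
             BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                 new GZIPOutputStream(Files.newOutputStream(out)), StandardCharsets.UTF_8))) {
            String line;
            long lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                if (lineNumber != indexToDelete) {
                    writer.write(line);
                    writer.write('\n'); // output always uses '\n' terminators
                }
                lineNumber++;
            }
        } // closing the writer also finishes the gzip stream
    }
}
```

Both streams decompress and compress on the fly, so only a small window of the data is in memory at any time. Compression saves disk space and I/O, but the whole file still has to be read and rewritten.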
Hope this helps!
Don't keep the file in memory. Just read it one line at a time and write it out to the temporary file one line at a time, skipping the line that needs to be deleted.
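Something along these lines should do it; this is only a sketch, with a made-up class name (LineRemover) and a 0-based line index, and it assumes the file is UTF-8 text.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class LineRemover {
    // Hypothetical helper: rewrites the file without the line at indexToDelete.
    public static void removeLine(Path file, long indexToDelete) throws IOException {
        // Create the temp file next to the original so the final move stays
        // on the same file system.
        Path dir = file.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "lines", ".tmp");
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            long lineNumber = 0;
            // Only the current line is held in memory at any time.
            while ((line = reader.readLine()) != null) {
                if (lineNumber != indexToDelete) {
                    writer.write(line);
                    writer.newLine(); // platform line separator
                }
                lineNumber++;
            }
        }
        // Replace the original with the filtered copy.
        Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Memory use depends on the longest line, not on the file size, so this is fine for huge files as long as there is enough disk space for the temporary copy. One caveat: newLine() writes the platform separator, so line endings may get normalized.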