Hi,

I have a very large text file, and my iPad program needs to occasionally replace a single line of data (newline: '\n'). How can I do this efficiently? Writing the entire file at once is not an option.

I've been looking into NSFileHandle, and it looks like it will do what I want; right now I'm having trouble figuring out how to find line X, and then replace it with the data from a string. I think after that's done, all I need to do is call synchronizeFile, right?

I appreciate your help!

A: 

You cannot really do this without writing the whole file. You could seek to the beginning of the line and write the new line there, but first you’d have to find the offset of that line. If you don’t already have it, that means reading the file from the beginning up to that line. Then you could write the new line, but only if it is exactly the same length as the original. If it is longer, it will overwrite the beginning of the next line, because there is no way to insert data into the middle of a file. If it is shorter, the end of the old line will remain. The same-length requirement is also tricky: it means the same length in bytes, and depending on the character encoding, some characters require more bytes than others.
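A minimal sketch of the same-length overwrite, written in plain C with stdio (usable from an iOS app) rather than with NSFileHandle. The names `find_line` and `overwrite_line` are hypothetical, lines are counted from 0, and the sketch assumes `'\n'` line endings as in the question. It refuses to write when the byte lengths differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scan from the start of the file and return the byte
   offset where line `target` (0-based) begins, or -1 if there is no such
   line. On success *len receives the line's byte length, excluding '\n'. */
static long find_line(FILE *f, long target, long *len)
{
    assert(f != NULL && len != NULL && target >= 0);
    rewind(f);
    long line = 0, start = 0, pos = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n') {
            if (line == target) { *len = pos - start; return start; }
            line++;
            start = pos + 1;
        }
        pos++;
    }
    /* A final line that has no trailing newline. */
    if (line == target && pos > start) { *len = pos - start; return start; }
    return -1;
}

/* Overwrite line `target` in place. Returns 0 on success, -1 if the line
   is missing, the byte lengths differ, or an I/O call fails. */
static int overwrite_line(const char *path, long target, const char *text)
{
    assert(path != NULL && text != NULL);
    FILE *f = fopen(path, "r+b");   /* read and write, no truncation */
    if (!f) return -1;
    long len = 0;
    long off = find_line(f, target, &len);
    int ok = off >= 0
          && (size_t)len == strlen(text)          /* same length in bytes */
          && fseek(f, off, SEEK_SET) == 0
          && fwrite(text, 1, (size_t)len, f) == (size_t)len;
    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}
```

A call such as `overwrite_line("data.csv", 41, "new,row,text")` (the path is a placeholder) would touch only the bytes of that one line. The scan still reads the file up to the target line, so caching line offsets would be needed to avoid that cost on repeated updates.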

If you really need to do this and have it work in just about every case, you’d have to follow these steps:

  1. Read the entire file to find out the offset of the line you’re interested in
  2. Seek to the offset of the line
  3. Write the new line
  4. Write out the rest of the file you read in step 1, and truncate the file at the new end if it has become shorter.

This algorithm will work no matter how long or short the lines are or how they are encoded. But it will probably be more expensive than just writing out the whole file, especially if you have it in memory anyway.
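The steps above can be sketched in C as follows. This is an illustration under stated assumptions, not a tuned implementation: `replace_line` is a hypothetical name, lines are 0-based, the new line is always terminated with `'\n'`, only the part of the file after the old line is held in memory, and POSIX `ftruncate` is used to cut off leftover bytes when the file shrinks.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>   /* ftruncate (POSIX) */

/* Hypothetical helper: replace line `target` (0-based) of the file at `path`
   with `text` (given without a trailing newline), whatever the two lengths
   are. Returns 0 on success, -1 on any failure. */
static int replace_line(const char *path, long target, const char *text)
{
    assert(path != NULL && text != NULL && target >= 0);
    FILE *f = fopen(path, "r+b");
    if (!f) return -1;

    /* Step 1: find where the target line starts and where the next begins. */
    long line = 0, start = 0, next = -1, pos = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        pos++;
        if (c == '\n') {
            if (line == target) { next = pos; break; }
            line++;
            start = pos;
        }
    }
    if (next < 0) {                       /* hit end of file first */
        if (line != target || pos == start) { fclose(f); return -1; }
        next = pos;                       /* last line had no '\n' */
    }

    /* Save everything that follows the old line. */
    long size = -1;
    if (fseek(f, 0, SEEK_END) == 0) size = ftell(f);
    if (size < next) { fclose(f); return -1; }
    size_t tail_len = (size_t)(size - next);
    char *tail = malloc(tail_len ? tail_len : 1);
    int ok = tail != NULL
          && fseek(f, next, SEEK_SET) == 0
          && fread(tail, 1, tail_len, f) == tail_len;

    /* Steps 2-4: seek back, write the new line, then the saved tail. */
    size_t new_len = strlen(text);
    long new_size = start + (long)new_len + 1 + (long)tail_len;
    ok = ok && fseek(f, start, SEEK_SET) == 0
            && fwrite(text, 1, new_len, f) == new_len
            && fputc('\n', f) != EOF
            && fwrite(tail, 1, tail_len, f) == tail_len
            && fflush(f) == 0
            && ftruncate(fileno(f), new_size) == 0;  /* drop leftover bytes */
    free(tail);
    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}
```

Note that everything from the changed line to the end of the file is rewritten, so the cost grows the closer the line is to the start of the file. A crash in the middle of the write can also leave the file half-updated, which writing to a temporary file and renaming it would avoid.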

Have you actually verified that writing out the whole file is not acceptable, or are you doing premature optimization here? If your text file is really that big, you should consider a database such as SQLite, or Core Data.

Sven
A database would be nice, but unfortunately it won't work; the app is designed to update a massive csv database. I suppose it would be possible for the app to convert the csv file to a database for normal operation, and convert the database back to csv on command, but converting the array back to string format for writing is expensive for the size of the csv file, and could lead to errors on the part of people retrieving the csv file from the app.
JoBu1324
Converting the csv database to an array is actually relatively inexpensive - I found a decent C library to do most of the work pretty fast. It takes about 1 second to open the file and convert it to a multidimensional array. Using `-stringByAppendingString:` or `-stringByAppendingFormat:` to recompose the csv file is the expensive part, and it takes at least a minute (I've never bothered to find out how long it runs; 1 second might be acceptable, but 2 seconds would be too long). Would converting the NSStrings into C strings or NSData make composing the csv file faster? Any other ideas?
JoBu1324
Do you really use `stringByAppendingFormat:` to rebuild the whole CSV file as a single string? That has to be slow, because the whole string you’re appending to has to be copied every single time. Using an `NSMutableString` instead should speed things up, especially if you initialize it using `initWithCapacity:` to reserve enough space. Instead of building the whole contents as a single string, it might even be faster to write the file line by line.
Sven
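To illustrate the line-by-line suggestion, here is a rough sketch in C that streams rows to disk through a buffered `FILE *` instead of first building one huge string. The function name `write_csv` and the flat `fields` array layout are assumptions made for this example, and it assumes the fields need no CSV quoting or escaping.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write `nrows` rows of `ncols` fields each to `path`,
   one CSV line at a time. `fields` is a flat row-major array of C strings.
   Returns 0 on success, -1 on failure. */
static int write_csv(const char *path, const char *const *fields,
                     size_t nrows, size_t ncols)
{
    assert(path != NULL && fields != NULL);
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    /* Ask stdio for a larger buffer so many small writes are batched. */
    setvbuf(f, NULL, _IOFBF, 1 << 16);
    int ok = 1;
    for (size_t r = 0; r < nrows && ok; r++) {
        for (size_t c = 0; c < ncols && ok; c++) {
            if (c > 0 && fputc(',', f) == EOF) ok = 0;
            if (ok && fputs(fields[r * ncols + c], f) == EOF) ok = 0;
        }
        if (ok && fputc('\n', f) == EOF) ok = 0;
    }
    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}
```

Each field is copied once into the stdio buffer, so the total work is proportional to the size of the output, whereas repeated appends to an immutable string copy the ever-growing result on every step.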
By "write the file line by line", do you mean truncating the file and writing each csv line to disk as I re-compose it? Why would that be faster than writing the entire file to disk at once?
JoBu1324
Thanks for the attention, Sven. I've spent some time streamlining the save process and I now write in the background, so I've been able to alleviate the problem. Thanks for your answer - it was informative!
JoBu1324