How does a large text file viewer work?

I'm assuming that:

  • Threading is used to handle the file
  • The TextBox is updated line by line
  • Effective memory handling is used

Are these assumptions correct? If someone were to develop their own, what are the musts and don'ts?

I'm looking to implement one using a DataGrid instead of a TextBox

I'm comfortable with C++ and Python. I'll probably use Qt/PyQt.

EDIT

The files I have are usually between 1.5 and 2 GB. I'm looking at both editing and viewing these files.

+6  A: 

I believe that the trick is not loading the entire file into memory, but using seek and such to just load the part which is viewed (possibly with a block before and after to handle a bit of scrolling). Perhaps even using memory-mapped buffers, though I have no experience with those.
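A minimal sketch of that idea in Python, assuming the viewer only needs the bytes for the visible region. The function names `read_window` and `read_window_mmap` are made up for illustration; the first uses plain `seek`, the second uses the standard `mmap` module.

```python
import mmap

def read_window(path, offset, size):
    """Return up to `size` bytes starting at byte `offset`.

    Only the requested window is read; the rest of the file stays on disk.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)

def read_window_mmap(path, offset, size):
    """Same result via a read-only memory map.

    The OS pages in only the parts that are actually touched.
    Note: mmap raises ValueError for an empty file.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + size]
```

A viewer would call one of these with the offset of the first visible line, probably padded by a block on each side so that small scrolls do not hit the disk again.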

Do realize that modifying a large file (fast) is different from just viewing it. You might need to copy the gigabytes of data surrounding the edit to a new file, which may be slow.
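To make the cost concrete, here is a rough sketch of an insert done by streaming the file into a new one. `insert_bytes` is a hypothetical helper, not part of any library; it assumes there is enough free disk space for a second copy of the file.

```python
import os
import shutil
import tempfile

def insert_bytes(path, offset, data, chunk_size=1 << 20):
    """Insert `data` at byte `offset` by rewriting the file through a temp copy."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            # Copy everything before the insertion point, chunk by chunk.
            remaining = offset
            while remaining > 0:
                chunk = src.read(min(chunk_size, remaining))
                if not chunk:
                    break  # offset is past end of file: append instead
                dst.write(chunk)
                remaining -= len(chunk)
            dst.write(data)
            # Copy everything after the insertion point.
            shutil.copyfileobj(src, dst, chunk_size)
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Every insert touches the whole file, so for a 2 GB file even a one-byte insert means writing about 2 GB. That is the reason editors tend to batch changes and only rewrite on save.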

extraneon
+4  A: 

In Kernighan and Plauger's classic (antique?) book "Software Tools in Pascal" they cover the development and design choices of a version of ed(1) and note

"A warning: edit is a big program (excluding contributions from translit, find, and change); at 950 lines, it is fifty percent bigger than anything else in this book."

And they (literally) didn't even have string types to use. Since they note that the file to be edited may exist on tape which doesn't support arbitrary writes in the middle, they had to keep an index of line positions in memory and work with a scratch file to store changes, deletions and additions, merging the whole together upon a "save" command. They, like you, were concerned about memory constraining the size of their editable file.
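A loose Python sketch of that structure, not the book's code: an in-memory index of line start offsets, pending changes kept apart from the original file, and a merge on save. The class name `IndexedFile` is invented here, and a plain dict stands in for the scratch file to keep the example short.

```python
class IndexedFile:
    """Line index over a file on disk, with edits merged only on save."""

    def __init__(self, path):
        self.path = path
        self.offsets = [0]  # offsets[n] is the byte where line n starts
        self.changes = {}   # line number -> replacement bytes (scratch stand-in)
        with open(path, "rb") as f:
            pos = 0
            for line in f:  # one pass; only the offsets are kept in memory
                pos += len(line)
                self.offsets.append(pos)

    def line_count(self):
        return len(self.offsets) - 1

    def get_line(self, n):
        """Return line n (0-based), preferring a pending change."""
        if not 0 <= n < self.line_count():
            raise IndexError(n)
        if n in self.changes:
            return self.changes[n]
        with open(self.path, "rb") as f:
            f.seek(self.offsets[n])
            return f.read(self.offsets[n + 1] - self.offsets[n])

    def set_line(self, n, data):
        """Record a replacement for line n without touching the file."""
        if not 0 <= n < self.line_count():
            raise IndexError(n)
        self.changes[n] = data

    def save(self, out_path):
        """Merge original lines and changes into out_path (must differ from path)."""
        with open(self.path, "rb") as src, open(out_path, "wb") as out:
            for n in range(self.line_count()):
                if n in self.changes:
                    out.write(self.changes[n])
                else:
                    src.seek(self.offsets[n])
                    out.write(src.read(self.offsets[n + 1] - self.offsets[n]))
```

This handles replacing a line only; inserting or deleting lines would need the index to map to either the original file or the scratch store. For files with very many lines, a compact container such as `array.array("Q")` would use less memory for the offsets than a list.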

The general structure of this approach is preserved in the GNU ed project, particularly in buffer.c

msw
I know that you can overwrite data in a file on disk, but I don't think inserting into the middle of a file is supported on disk either.
extraneon
It is of historical interest only, but since you asked... You are correct, most disk file systems will let you *overwrite* stuff in the middle of a file but not insert. Because of the mechanics of some tape encodings, writing in the middle doesn't overwrite the data there, it borks the entire file.
msw