Say we have opened a text file. Is it possible to define a pointer to a character in the file? If it is - will the following characters in the file appear in memory in the same order they appear in the file?

The reason I ask:
I need to process a text file. I read one line at a time, and there are certain strings I want to keep. The buffer I read into is overwritten on every read, so I can't keep a pointer into it. On the other hand, I don't want to waste space by defining an array and using strcpy to copy the characters from the text file into it.

I actually want to access a file as though it were an in-memory array.

Edit:

I can use only C standard library functions. But thanks for the other suggestions, anyway.

+3  A: 

What result are you trying to achieve?

If you want to access a file as though it were an in-memory array, you can do this using a memory mapped file.

mmap is the POSIX function to do this.
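
For illustration, here is a minimal sketch of what that could look like on a POSIX system. The helper names map_file and unmap_file are made up for this example, and error handling is kept to the bare minimum. Within the mapping, the bytes appear in the same order as in the file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps the whole file read-only and returns a pointer
   to its first byte, or NULL on failure. *len receives the file size. */
const char *map_file(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {  /* mmap rejects length 0 */
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                       /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return p;
}

/* Releases a mapping obtained from map_file(). */
void unmap_file(const char *p, size_t len)
{
    munmap((void *)p, len);
}
```

Note that the mapped bytes are not NUL-terminated, so string functions such as strlen must not run past len.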

therefromhere
I need something which is OS independent.
Leif Ericson
The Apache Portable Runtime provides functions which are OS independent. apr_mmap_create() does the equivalent of mmap, for instance.
Avi
Well, you can wrap mmap on POSIX and MapViewOfFile on Windows in your own abstraction, covering the most popular platforms. There are probably such wrappers out there already (maybe boost?)
EFraim
Thanks. It is a school assignment. I can only use C standard library.
Leif Ericson
+1  A: 

As I see it, unless you are willing to use mmap as therefromhere suggested, your options are to copy the strings (at the cost of some space) or record the byte-positions in the original file and then re-read from those locations. I'd certainly opt for the former unless you have a very good reason not to.
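
A rough sketch of the byte-position approach using only the C standard library, assuming lines fit in a fixed 256-byte buffer. The helper names find_lines and line_at are invented for this example. The values returned by ftell are only passed back to fseek with SEEK_SET, which is the use the standard permits for text streams.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: re-reads the line that starts at offset `pos`
   into buf (at most size-1 characters) and strips the trailing newline.
   Returns buf, or NULL on failure. */
char *line_at(FILE *fp, long pos, char *buf, size_t size)
{
    if (fseek(fp, pos, SEEK_SET) != 0)
        return NULL;
    if (fgets(buf, (int)size, fp) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}

/* Hypothetical helper: scans the file once and stores the start offset of
   every line containing `needle` in offsets[], up to max entries.
   Returns the number of offsets stored. */
size_t find_lines(FILE *fp, const char *needle, long *offsets, size_t max)
{
    char buf[256];              /* assumed upper bound on line length */
    size_t n = 0;

    rewind(fp);
    for (;;) {
        long pos = ftell(fp);   /* offset of the line about to be read */
        if (pos < 0 || fgets(buf, sizeof buf, fp) == NULL)
            break;
        if (n < max && strstr(buf, needle) != NULL)
            offsets[n++] = pos;
    }
    return n;
}
```

This stores one long per kept line instead of the text itself, at the price of a seek and a read every time the string is needed again.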

Draemon
I already thought about recording the offset and using fseek, but thanks.
Leif Ericson
It's better, but it has a cost of O(n/2), i.e. O(n), if the file is n bytes long.
Leif Ericson
Then you're just going to have to copy the strings
Draemon
Yep. I think so :)
Leif Ericson
+1  A: 

The short answer is 'no'. As therefromhere said, you can use the POSIX function mmap or the Win32 function CreateFileMapping. However, AFAIK, if the file can change, it may change at any moment while you are reading it, so the best solution is probably to strcpy the strings you need.

Edit: There's no file mapping in the C standard library. So your options are now 1) keeping the offset of each string in the file (slow access, but wastes little space), or 2) copying the strings (a relatively slow copy, then instant access to them, but uses more space).

However, if you know which strings you must keep right after reading them, you can switch to a fresh buffer only in those cases and keep reusing the current buffer otherwise.
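
One way to read that last suggestion, as a sketch only: read each line into a heap buffer, and when a line has to be kept, hand the buffer over and allocate a new one instead of copying. The name keep_lines, the 256-byte line limit, and the "line contains needle" test are all assumptions made for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_MAX_LEN 256   /* assumed upper bound on line length */

/* Hypothetical helper: reads fp line by line. A line containing `needle`
   is kept by handing its buffer over to kept[] and allocating a fresh one
   for the next read; any other line's buffer is simply reused.
   Returns the number of kept lines; the caller frees each kept[i]. */
size_t keep_lines(FILE *fp, const char *needle, char **kept, size_t max)
{
    size_t n = 0;
    char *buf = malloc(LINE_MAX_LEN);

    while (buf != NULL && n < max && fgets(buf, LINE_MAX_LEN, fp) != NULL) {
        if (strstr(buf, needle) != NULL) {
            buf[strcspn(buf, "\n")] = '\0';
            kept[n++] = buf;               /* keep this buffer as-is */
            buf = malloc(LINE_MAX_LEN);    /* fresh buffer for the next line */
        }
    }
    free(buf);   /* free(NULL) is a no-op */
    return n;
}
```

This avoids the strcpy, but every kept line occupies a full-size buffer, so it can use more memory than copying just the needed characters unless each kept buffer is shrunk with realloc.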

Jaime Pardos
A: 

I don't understand the requirements - you want to keep some strings accessible by pointer (i.e. in memory), but you don't want to 'waste space' by allocating any memory to the task?

Is it the strcpy performance you object to? If so, why not allocate a new buffer every time you read a line that you want to keep? Obviously you'd need to track the previous buffers with the saved lines as well...

JimG
No, I don't. I only need the buffer that holds the line currently being processed, and if there is a string in it that I want to keep, I should strcpy it into allocated space. There is no need to allocate a new buffer for each line.
Leif Ericson
Well yeah, that's exactly what you should do, but that's what you discounted as "wasting space" in your question. It didn't really make much sense.
JimG
It would have made sense if it had been possible to define a pointer such as I described.
Leif Ericson
+6  A: 
  1. You could duplicate the lines you read, but this isn't acceptable because you think this will suck up too much memory.

  2. You could use mmap() to create a memory-mapped file, but this isn't acceptable because you want something that is OS independent.

  3. You could keep a dynamic record of the file positions, but this isn't acceptable for reasons you haven't really elucidated.

  4. Finally, you could simulate memory-mapped files by slurping the file into one huge buffer, but how this would be a valid solution when (1) isn't is beyond me.

These are, in sum, all the possible solutions to your problem. However, none of them are satisfactory because your requirements are too restrictive.

The answer to your question is that there is no answer.
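
For reference, option 4 can be done with nothing but the C standard library. The following is only a sketch: the name slurp and the 4096-byte starting capacity are arbitrary choices for this example. Once the file is in the buffer, pointers into it stay valid until it is freed, and the bytes are in file order.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file into one NUL-terminated heap
   buffer, growing it as needed. Returns NULL on failure; on success *len
   receives the number of bytes read and the caller frees the buffer. */
char *slurp(const char *path, size_t *len)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    size_t cap = 4096, used = 0;
    char *data = malloc(cap);

    while (data != NULL) {
        /* Leave one byte free for the terminating NUL. */
        used += fread(data + used, 1, cap - 1 - used, fp);
        if (used < cap - 1)
            break;                   /* short read: end of file or error */

        cap *= 2;
        char *bigger = realloc(data, cap);
        if (bigger == NULL)
            free(data);
        data = bigger;
    }

    if (data != NULL && ferror(fp)) {
        free(data);
        data = NULL;
    }
    fclose(fp);

    if (data != NULL) {
        data[used] = '\0';
        *len = used;
    }
    return data;
}
```

The memory cost is the size of the whole file, which is why it is hard to reconcile with rejecting option 1.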

Cirno de Bergerac
There is an answer: with the C standard library, I cannot treat an opened file like an in-memory array.
Leif Ericson
Define "treat an opened file like an in-memory array". I don't see how seek+read doesn't meet these criteria.
Draemon