Hello everyone!

I am writing a C program that produces a large output file. To increase readability, I would like to collect certain kinds of output at certain points in the file rather than have it be scattered randomly about.

Consider a file like:

log
log
(a)

output
output
output(b)

Say the program is currently writing the line at (b). Is there a particularly elegant way in C to achieve the effect of moving to point (a), adding a line of output and then resuming normal output at (b)?

I know I could achieve this effect using standard shell tools such as csplit to break the file at the specified point, append output to the first half and then cat it back together. However, this application must be cross-platform so I can't count on having a shell available.

Any suggestions would be most helpful!

A: 

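With fgetpos you can store a position in a file and jump back to it at any time with fsetpos (ftell and fseek work as a pair in the same way). Note that writing at the restored position overwrites what is already there; it does not insert.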
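
A minimal sketch of that idea, assuming the space for the late line can be reserved in advance as a fixed-width placeholder. The helper names reserve_slot and fill_slot are made up for illustration; only fgetpos, fsetpos and fprintf are standard C.

```c
#include <stdio.h>

/* Hypothetical helper: remembers the current position in `slot` and writes
 * a placeholder line of `width` spaces there, to be overwritten later. */
int reserve_slot(FILE *fp, fpos_t *slot, int width)
{
    if (fgetpos(fp, slot) != 0)
        return -1;
    if (fprintf(fp, "%*s\n", width, "") < 0)
        return -1;
    return 0;
}

/* Hypothetical helper: jumps back to a reserved slot, overwrites it with
 * `text`, then returns to where normal output left off. Text longer than
 * `width` is truncated, because writing past the slot would clobber the
 * output that follows it. */
int fill_slot(FILE *fp, const fpos_t *slot, int width, const char *text)
{
    fpos_t resume;

    if (fgetpos(fp, &resume) != 0)      /* remember point (b) */
        return -1;
    if (fsetpos(fp, slot) != 0)         /* jump back to point (a) */
        return -1;
    if (fprintf(fp, "%-*.*s", width, width, text) < 0)
        return -1;
    return fsetpos(fp, &resume);        /* resume output at (b) */
}
```

This only works when an upper bound on the size of the late output is known when point (a) is written; unused space stays in the file as padding.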

Victor
A: 

C or C++? If C++, you can use seekp() to position the output pointer. This will only enable you to overwrite, though.

Donnie DeBoer
Unfortunately we're not using C++. Thanks for the pointer, though. I will definitely check it out if I come across the same problem in C++!
Sharpie
+6  A: 

The only way to do what you're describing in a single file would be to reserve all the space you were going to need for the "log" entries up front, which I'm guessing you can't do because you don't know how big they're going to be.

You can't just insert into a file, shifting the contents after the insertion point along to make room. It's just not a concept that common filesystems support. You'd need to physically read and re-write all the "output" pieces in order to insert a "log" piece, which would require increasingly large amounts of work as the file grew.

Your best bet would be to write two separate files, then join them together at the end.
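
A rough sketch of the two-file approach in portable C, assuming the two parts are collected in streams from tmpfile() and joined once at the end. The helper names append_stream and join_parts are invented for this example, and it is worth checking that tmpfile() behaves acceptably on every platform you target.

```c
#include <stdio.h>

/* Hypothetical helper: appends the entire contents of `src` to `dst`.
 * Returns 0 on success, -1 on a read or write error. */
int append_stream(FILE *dst, FILE *src)
{
    char buf[4096];
    size_t n;

    rewind(src);                        /* read the part from its start */
    while ((n = fread(buf, 1, sizeof buf, src)) > 0) {
        if (fwrite(buf, 1, n, dst) != n)
            return -1;
    }
    return ferror(src) ? -1 : 0;
}

/* Hypothetical helper: creates the file at `path` containing the "log"
 * part followed by the "output" part. */
int join_parts(const char *path, FILE *log_part, FILE *out_part)
{
    /* tmpfile() streams are binary, so copy the bytes verbatim. */
    FILE *final = fopen(path, "wb");
    int rc;

    if (final == NULL)
        return -1;
    rc = append_stream(final, log_part);
    if (rc == 0)
        rc = append_stream(final, out_part);
    if (fclose(final) != 0)
        rc = -1;
    return rc;
}
```

During the run, the program would write each line to whichever of the two streams it belongs to with ordinary fprintf calls, and call join_parts once when it is finished. The total work is a single extra copy of the data, regardless of how often the two kinds of output alternate.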

Edit following Sharpie's comment: Since the output is a set of commands for a program, does that mean you can safely assume that it won't ever be more than a couple of MB big, and just build it in memory before writing it all out in one go?

RichieHindle
Our biggest test run currently produces an input file that is 2.1 MB; I would guess it could easily go an order of magnitude higher. It looks like writing a collection of temporary files and splicing them together may be the way to go.
Sharpie
+1 for "write 2 files and join them at the end"
AShelly
I'd vote for the splicing, unless there is some way to determine up front how long the file is going to be.
NoMoreZealots
+2  A: 

It is a very unusual requirement.

One way would be to prefix each of the two kinds of output lines with a distinguishing string and write them all into the same file. When you want a specific set of outputs, grep for them in the combined file.

output.txt...
MARK1: log
MARK2: output
MARK1: log
MARK2: output
MARK2: output

grep "^MARK1:" output.txt
grep "^MARK2:" output.txt


Eventually you are going to need the file laid out the way you originally wanted it.
For that, you will need to chop off the prefix strings.
That is very easy if the prefixes all match a simple regex.
Something like:

sed 's/^MARK.: //' output.txt > filtered.txt
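
Since the question rules out relying on a shell, the same filtering could be done in C. The following is only a sketch: the function name extract_marked is made up, and it assumes the prefix is much shorter than the 1024-byte line buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies every line of `in` that starts with `prefix`
 * to `out`, with the prefix stripped. Returns 0 on success, -1 on error. */
int extract_marked(FILE *in, FILE *out, const char *prefix)
{
    char line[1024];
    size_t plen = strlen(prefix);
    int at_line_start = 1;   /* is the next chunk the beginning of a line? */
    int copying = 0;         /* does the current line carry the prefix? */

    while (fgets(line, sizeof line, in) != NULL) {
        const char *text = line;

        if (at_line_start) {
            copying = (strncmp(line, prefix, plen) == 0);
            if (copying)
                text += plen;            /* strip the marker */
        }
        if (copying && fputs(text, out) == EOF)
            return -1;
        /* A chunk without a newline means the same line continues. */
        at_line_start = (strchr(line, '\n') != NULL);
    }
    return ferror(in) ? -1 : 0;
}
```

Calling it once with "MARK1: " and then once more (after rewinding the input) with "MARK2: " on the same output stream would produce the grouped file in two passes.
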
nik
+4  A: 

You cannot do this directly, for a very simple reason:

There is no way to insert bytes in the middle of a file. You can only overwrite a piece of a file or append to the end of a file.

If you want to insert into the middle of a file, you have two options:

  • Read all the bytes after the position where you want to insert into memory. Then write the piece that you want to insert, then write all the bytes that you cached in memory. The result will be that bytes are inserted at the given position, at the cost of an arbitrarily large block of memory to hold the piece that you have to "push back".
  • Alternatively, you write everything to a new file. Copy bytes from the source file into the new file. When you reach the point where you want to insert your new data, write the new line. Then continue copying from the original file until you're done. This works without requiring a possibly large chunk of memory, but requires a temporary file in which you store the combined result.

...

  • Alternatively alternatively, don't write your output directly to a file, but keep everything in memory, and write it ordered to the output file once you're done collecting log lines.
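
The keep-it-in-memory option might look roughly like this. TextBuf and its two functions are hypothetical names for a simple growable buffer, and the sketch assumes the whole output fits comfortably in memory.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable text buffer; initialise with {0}. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} TextBuf;

/* Appends `line` to the buffer, growing it as needed.
 * Returns 0 on success, -1 if memory runs out. */
int textbuf_append(TextBuf *b, const char *line)
{
    size_t n = strlen(line);

    if (n == 0)
        return 0;
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 256;
        char *p;

        while (cap < b->len + n)
            cap *= 2;                   /* double until the line fits */
        p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, line, n);
    b->len += n;
    return 0;
}

/* Writes the buffered text to `fp` and releases the memory. */
int textbuf_flush(TextBuf *b, FILE *fp)
{
    int rc = 0;

    if (b->len > 0 && fwrite(b->data, 1, b->len, fp) != b->len)
        rc = -1;
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
    return rc;
}
```

The program would keep one TextBuf per kind of output, append lines to the matching buffer as they are produced, and flush the buffers to the real file in the desired order at the very end.
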
rix0rrr
I agree especially on the "keep a list of log lines in memory before committing to a file" point.
RaphaelSP