Hi, I need to write a circular file in c++. The program has to write lines in a file and when the code reaches a maximum number of lines, it must overwrite the lines in the beginning of the file.
Anyone have any idea? Thanks
Unfortunately you can't truncate the file at the beginning, or overwrite lines there with longer or shorter ones, without rewriting the entire thing.
I've just thought of a new approach that might do the trick for you...
You could include a small header in your file with the following structure.
Edit: Rubbish, I've just described a variant of a circular buffer!
Header Fields

Bytes 00 - 07 (long)
- Total (current) number of lines written to the file.

Bytes 08 - 15 (long)
- Pointer to the start of the "actual" first line of your file. This will initially be the byte just after the header ends, but will change later when data gets overwritten.

Bytes 16 - 23 (long)
- Length of the "end section" of the file. Again, this will initially be zero, but will change later when data gets overwritten.

Read Algorithm (Pseudocode)

Reads the entire file.

1. Read the header field that points to the start of the "actual" first line
2. Read the header field that specifies the length of the "end section"
3. Read every line from the first-line pointer until the end of the file
4. Seek to the byte just after the end of the header
5. Read every line until the "end section" has been fully read
Write Algorithm (Pseudocode)

Writes an arbitrary number of new lines to the file.

1. Read the header field that contains the total no. of lines in the file
2. If (line count) + (no. of new lines) <= (maximum no. of lines):
   - Append the new lines to the end of the file
   - Increment the line-count header field by (no. of new lines)
3. Else:
   - Append as many lines as possible (up to the maximum) to the end of the file
   - Beginning at the pointer to the first line (in the header field), read as many lines as still need to be written
   - Find the total byte count of the lines just read
   - Set the header field that points to the first line to the next byte in the stream
   - Keep writing the new lines to the end of the file, one at a time, until the byte count of the remaining lines is less than the byte count of the lines at the beginning of the file (this condition may be true immediately, in which case you don't need to write any more)
   - Write the remaining new lines to the start of the file (starting at the byte after the header)
   - Set the header field that contains the length of the "end section" to the number of bytes just written after the header
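For concreteness, here's a minimal sketch of that header as a C++ struct, plus a helper to read it back; the field names, fixed-width types, and helper are my own illustrative choices, not part of the scheme above.

#include <cstdint>
#include <fstream>

// Hypothetical layout for the 24-byte header described above.
struct CircularFileHeader {
    std::uint64_t lineCount;      // bytes 00-07: total lines currently in the file
    std::uint64_t firstLinePos;   // bytes 08-15: offset of the logical first line
    std::uint64_t endSectionLen;  // bytes 16-23: length of the "end section"
};

// Load the header from the start of the file. Assumes the file was written
// on the same platform, so endianness and struct layout match.
bool readHeader(std::fstream& f, CircularFileHeader& h) {
    f.seekg(0, std::ios::beg);
    f.read(reinterpret_cast<char*>(&h), sizeof h);
    return static_cast<bool>(f);
}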
Not a terribly simple algorithm, I fully admit! I nonetheless think it's quite elegant in a way. Let me know if any of that isn't clear, of course. Hopefully it should do precisely what you want now.
Now, if your lines are guaranteed to be of constant length (in bytes), you could easily enough just seek back to the appropriate point and overwrite existing data. This seems a rather unlikely situation, however. If you don't mind imposing the restriction that your lines have a maximum length, and additionally padding each line you write out to that maximum length, then that could make matters easy for you. Still, it has its disadvantages, such as greatly increasing file size under certain circumstances (i.e. when most lines are much shorter than the maximum length). It all depends on the situation whether this is acceptable or not...
Finally, you may instead want to look at utilising an existing logging system, depending on your exact purpose.
That's going to be tricky, since file I/O works with bytes as the underlying unit of storage, not lines.
I mean you could just fseek() back to the beginning and clobber the earlier data, but I have a hunch that's not what you want.
The usual way to handle logging without the file exploding in size is to use rolling log files: roll them once a day or similar, and keep only the N latest files.
For instance, every day you create a new logfile with the filename `application_2009_05_20.log`, and start writing to it, always appending.
Once you have 14 days worth of logfiles, you start deleting the oldest.
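A rough sketch of that scheme, assuming C++17 <filesystem>; the filename pattern follows the example above, and all names are illustrative:

#include <algorithm>
#include <cstdio>
#include <ctime>
#include <filesystem>
#include <string>
#include <vector>

// Build today's log name, e.g. "application_2009_05_20.log".
std::string todaysLogName() {
    std::time_t t = std::time(nullptr);
    std::tm tm = *std::localtime(&t);
    char name[64];
    std::snprintf(name, sizeof name, "application_%04d_%02d_%02d.log",
                  tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday);
    return name;
}

// Keep only the newest maxFiles logs; the dated names sort oldest-first.
void pruneOldLogs(const std::filesystem::path& dir, std::size_t maxFiles) {
    std::vector<std::filesystem::path> logs;
    for (const auto& entry : std::filesystem::directory_iterator(dir))
        if (entry.path().extension() == ".log")
            logs.push_back(entry.path());
    std::sort(logs.begin(), logs.end());
    while (logs.size() > maxFiles) {
        std::filesystem::remove(logs.front());
        logs.erase(logs.begin());
    }
}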
Since files are byte-oriented, and you need a line-oriented service, you have two choices:
- implement a line-oriented wrapper around the file, or
- switch to some line-oriented device. Just off the top of my head: SQLite has some nice C++ wrappers available (see the sketch below).
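As a loose sketch of the SQLite route, using only the core C API (the schema, table, and column names here are assumptions for illustration):

#include <sqlite3.h>
#include <string>

// Assumed one-time schema:
//   CREATE TABLE IF NOT EXISTS lines(id INTEGER PRIMARY KEY AUTOINCREMENT, text TEXT);

void addLine(sqlite3* db, const std::string& text, long maxLines) {
    // Insert via a prepared statement to avoid quoting problems.
    sqlite3_stmt* stmt = nullptr;
    sqlite3_prepare_v2(db, "INSERT INTO lines(text) VALUES (?1)", -1, &stmt, nullptr);
    sqlite3_bind_text(stmt, 1, text.c_str(), -1, SQLITE_TRANSIENT);
    sqlite3_step(stmt);
    sqlite3_finalize(stmt);

    // Keep only the newest maxLines rows, which gives the circular behaviour.
    std::string trim =
        "DELETE FROM lines WHERE id <= (SELECT MAX(id) FROM lines) - " +
        std::to_string(maxLines);
    sqlite3_exec(db, trim.c_str(), nullptr, nullptr, nullptr);
}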
Simple solution:
This solution is designed to offer a constant file length rather than a constant number of lines within the file. The number of lines will vary over time depending on line length. This solution makes it more difficult to seek to specific line numbers quickly, though you could stick some indicator data at the top or bottom of the file to make this easier.
"Clever" Solution (variation on solution above):
Use the same trick that is sometimes used for deques: explicitly wrap around from the end of the file back to the beginning, but keep track of where the logical beginning/end of the file is. You could write an unwrap utility to convert this file to a standard one when you wished to read it with a program that did not support the format. This solution is REALLY easy to implement, but I like the version above better.
Ugly Solution:
When adding lines, add a moderate amount of padding to every line you add. Each time you wish to add a new line, overwrite just enough of the oldest lines (padding included) to make room for it, padding the new line so it exactly fills the space it reclaims. Note that this will work pretty poorly unless your lines are pretty consistent about length. A simpler variant is to guarantee lines are of constant length (but provide some way to create multi-line "lines" in case you exceed that length); see the sketch below.
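Here's a minimal sketch of that constant-length variant, assuming every line is padded to a fixed record size so old lines can be overwritten in place (the constants and names are mine):

#include <cstddef>
#include <fstream>
#include <string>

const std::size_t RECORD_LEN = 128;   // bytes per line, including the trailing '\n' (assumed)
const std::size_t MAX_LINES  = 1000;  // capacity of the circular file (assumed)

// writeIndex must be persisted somewhere (a small header, or a side file).
// The file should be opened with std::ios::in | std::ios::out | std::ios::binary.
void writeLine(std::fstream& f, std::size_t& writeIndex, std::string line) {
    line.resize(RECORD_LEN - 1, ' ');  // pad (or truncate) to the fixed length
    line.push_back('\n');
    f.seekp(static_cast<std::streamoff>(writeIndex * RECORD_LEN));
    f.write(line.data(), static_cast<std::streamsize>(RECORD_LEN));
    writeIndex = (writeIndex + 1) % MAX_LINES;  // wrap back to the start
}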
I've seen this done by keeping the current write position for the file somewhere. When you need to add a line, you seek to the position, write the line, and update the position in an atomic fashion. If you overflow, you seek to zero before you write the line. We do this today for size-constrained circular log files. Doing it on a line-constrained basis is a little odd, but it could probably be done in a similar fashion. Our write loop looks something like:
logFile.lockForWrite();
currentPosition = logFile.getWritePosition();
logFile.seek(currentPosition);
for each line in lineBuffer {
    if ((currentPosition + line.length()) > logFile.getMaxSize()) {
        currentPosition = 0;
        logFile.seek(0);
    }
    logFile.write(line);
    currentPosition += line.length();
}
logFile.setWritePosition(currentPosition);
logFile.unlock();
The tricky part is in maintaining the current write position and finding some way to coordinate reading the file (e.g., with the tail utility) while your application is writing to it. Your reader utility has to keep track of the write position as well, so its read loop becomes:
lastPosition = logFile.getWritePosition();
while (!killed) {
    logFile.wait();
    logFile.lockForRead();
    newPosition = logFile.getWritePosition();
    logFile.seek(lastPosition);
    newLine = logFile.readFrom(lastPosition, (newPosition - lastPosition));
    lastPosition = newPosition;
    logFile.unlock();
}
This isn't in any particular language - it's just pseudocode, but the idea is there. Of course, I left handling all of the interesting edge cases to the reader.
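For what it's worth, here's one way that write loop might look in real C++ (no locking shown); I'm assuming the write position is persisted in the first 8 bytes of the file itself, and that the file has been pre-created at its full size:

#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

const std::uint64_t kHeader  = sizeof(std::uint64_t);  // position field at offset 0
const std::uint64_t kMaxSize = 1 << 20;                // 1 MB data region (assumed)

void writeLines(std::fstream& f, const std::vector<std::string>& lineBuffer) {
    std::uint64_t pos = 0;
    f.seekg(0);
    f.read(reinterpret_cast<char*>(&pos), sizeof pos);  // load the write position
    for (const std::string& line : lineBuffer) {
        if (pos + line.size() > kMaxSize) pos = 0;      // overflow: wrap to the start
        f.seekp(static_cast<std::streamoff>(kHeader + pos));
        f.write(line.data(), static_cast<std::streamsize>(line.size()));
        pos += line.size();
    }
    f.seekp(0);
    f.write(reinterpret_cast<const char*>(&pos), sizeof pos);  // persist the position
}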
With all of that said... I agree with the other opinions. Don't do this unless you have a really good reason. It sounds like a great idea, but standard text tools (grep, tail, perl, etc.) won't cope with a file whose logical beginning moves around inside it. Overall, you will be better off using an existing logging package that allows for configurable log file management. Take a look at Apache's log4cxx or Poco's Poco::Logger.
if the files need to be text files:
This is very problematic with varying line lengths. If your first two lines are 80 characters each, how do you overwrite one of them with a 100-character line?
If the new line should replace the first line, this would result in a file insert, which is a very expensive operation (basically, the entire remainder of the file needs to be read and written). You really don't want to do that for all but minimal amounts of data.
If this is for logging purposes, use rolling log files - e.g. one per day (as suggested by lassevek). I made it even simpler: when the file size exceeds a limit, the old file is renamed to .bak (any existing .bak is deleted), and writing starts anew. With a 1 MB limit, this preserves e.g. the last 1 MB, while never occupying more than 2 MB.
You could employ a similar mechanism with two or more files. Basically, move the "rollover" to files, rather than to lines.
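A rough sketch of that rename-on-overflow scheme, assuming C++17 <filesystem> (the 1 MB limit is the example from above; error handling omitted):

#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>

const std::uintmax_t MAX_BYTES = 1 << 20;  // 1 MB limit, as in the example above

void appendLine(const std::filesystem::path& log, const std::string& line) {
    namespace fs = std::filesystem;
    if (fs::exists(log) && fs::file_size(log) > MAX_BYTES) {
        fs::path bak = log;
        bak.replace_extension(".bak");
        fs::remove(bak);       // delete the old .bak, if any
        fs::rename(log, bak);  // the current file becomes the backup
    }
    std::ofstream(log, std::ios::app) << line << '\n';
}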
if the file may be in a proprietary format:
Use a basic DB engine (like SQLite as suggested), or another structured storage mechanism.
Easy workaround:
You could use log4cxx with a RollingFileAppender to write this information into a log file. The RollingFileAppender will handle rolling over the log file when it reaches a certain size. I don't think it's exactly what you want, but it's fairly simple -- maybe it will do.
Just create a mapping of the file of the required size (CreateFileMapping or mmap), write the lines in the buffer and start over when the maximum number is reached.
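A hedged POSIX sketch of that idea (CreateFileMapping/MapViewOfFile would be the Windows equivalent); error handling is omitted and all names are my own:

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstring>
#include <string>

const std::size_t FILE_SIZE = 1 << 20;  // fixed mapping size (assumed)

int main() {
    int fd = open("circular.log", O_RDWR | O_CREAT, 0644);
    ftruncate(fd, FILE_SIZE);  // fix the file length up front
    char* base = static_cast<char*>(
        mmap(nullptr, FILE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
    std::size_t pos = 0;  // in real code, persist this between runs
    auto writeLine = [&](const std::string& line) {
        if (pos + line.size() + 1 > FILE_SIZE) pos = 0;  // wrap at the end
        std::memcpy(base + pos, line.data(), line.size());
        base[pos + line.size()] = '\n';
        pos += line.size() + 1;
    };
    writeLine("hello");
    munmap(base, FILE_SIZE);
    close(fd);
    return 0;
}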
Use a circular buffer and write the buffer to a file for each add.
Here is a small and simple solution. It is a simple circular buffer of strings, and each time you add a string it writes the entire buffer to the file. (Of course this means you incur the cost of writing every string for a single add operation, so it is only suitable for a small number of strings.)
Simple implementation of circular buffer with output to a file:
#include <stdio.h>
#include <string.h>
#include <stdint.h>

// GLOBALS ( final implementation should not use globals )
#define MAX_CHARS_PER_LINE (1024)
#define MAX_ITEMS_IN_CIRCULARBUF (4) // must be power of two
char lineCircBuf[MAX_ITEMS_IN_CIRCULARBUF][MAX_CHARS_PER_LINE];
int lineCircBuf_add = 0;
int lineCircBuf_rmv = 0; // not being used right now
uint32_t lineCircBuf_mask = MAX_ITEMS_IN_CIRCULARBUF-1;
char FILENAME[] = "lineCircBuf.txt";
int addLine(char * str) {
    FILE * ofp = NULL;
    int i;
    // Error checking: require room for the terminating NUL
    if( strlen(str) >= MAX_CHARS_PER_LINE ) {
        return -1; // failure
    }
    // Copy string into circular buffer (fits, including NUL, per check above)
    strncpy( &(lineCircBuf[lineCircBuf_add][0]),
             str,
             MAX_CHARS_PER_LINE );
    lineCircBuf_add = ( lineCircBuf_add + 1 ) & lineCircBuf_mask;
    // Rewrite the whole file from the buffer
    ofp = fopen(FILENAME,"w");
    if( ofp == NULL ) {
        return -1; // failure
    }
    for( i = 0; i < MAX_ITEMS_IN_CIRCULARBUF-1; i++ ) {
        fprintf( ofp, "%s\n", lineCircBuf[i] );
    }
    fprintf( ofp, "%s", lineCircBuf[i] ); // no newline on the last line b/c we only want N lines in the file
    fclose(ofp); // close after each rewrite rather than leaking the handle
    return 0; // success
}
int removeLine(int index) {
    (void)index;  // not implemented yet
    return -1;    // failure until implemented
}
void unitTest() {
    int i;
    // Dummy text to demonstrate adding string lines
    char lines[5][MAX_CHARS_PER_LINE] = {
        "Hello world.",
        "Hello world AGAIN.",
        "The world is interesting so far!",
        "The world is not interesting anymore...",
        "Goodbye world."
    };
    // Add lines to circular buffer
    for( i = 0; i < sizeof(lines)/sizeof(lines[0]); i++ ) {
        addLine(&(lines[i][0]));
    }
}
int main() {
    unitTest();
    return 0;
}
So in the above example we had 5 lines of input and our buffer was only 4 lines long. Therefore the output should have only 4 lines, and the first line should be overwritten by the last line, "Goodbye world". Sure enough, the first line of the output does contain "Goodbye world":
Goodbye world.
Hello world AGAIN.
The world is interesting so far!
The world is not interesting anymore...
If you want to generate this file for input to another application, I think your best bet would be to log directly to a relational database (SQL Server, MySQL, whatever...) and then periodically generate the file as needed from the logged data.
To get around the variable-size problem, you will probably end up with an indirection and allocation scheme. This would consist of an indirection block with a fixed number of 'pointers' into the file, and one 'next-to-be-written' pointer that wraps around at N. But the main trick would be in adding the indirection.
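As a loose sketch, with sizes and names of my own choosing, such an indirection block might look like:

#include <cstddef>
#include <cstdint>

const std::size_t N = 1024;  // fixed number of line slots (assumed)

// Hypothetical on-disk indirection block: one entry per stored line, plus a
// cursor telling us which slot the next write overwrites.
struct IndirectionBlock {
    std::uint64_t offset[N];  // byte offset of each line's data in the file
    std::uint32_t length[N];  // byte length of each line
    std::uint32_t next;       // next slot to (over)write; wraps around at N
};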