tags:

views:

81

answers:

2

hey there, I'm reading lines off of text file and I'm wondering if this is a good way to go?. I had to write the function numberoflines to decrease the number_of_lines variable by one because within the while loop, for every line it read it adds 2 to the number_of_lines variable.

#include <iostream>
#include <fstream>
using namespace std;

int number_of_lines = 0;

void numberoflines();
int main(){
    string line;
    ifstream myfile("textexample.txt");

    if(myfile.is_open()){
        while(!myfile.eof()){
            getline(myfile,line);
            cout<< line << endl;
            number_of_lines++;
        }
        myfile.close();
    }
    numberoflines();

}

void numberoflines(){
    number_of_lines--;
    cout<<"number of lines in text file: " << number_of_lines << endl;
}

is there any other easier better way?

+1  A: 

I think your question is, "why am I getting one more line than there is in the file?"

Imagine a file:

line 1
line 2
line 3

The file may be represented in ASCII like this:

line 1\nline 2\nline 3\n

(Where \n is byte 0x10.)

Now let's see what happens before and after each getline call:

Before 1: line 1\nline 2\nline 3\n
  Stream: ^
After 1:  line 1\nline 2\nline 3\n
  Stream:         ^

Before 2: line 1\nline 2\nline 3\n
  Stream:         ^
After 2:  line 1\nline 2\nline 3\n
  Stream:                 ^

Before 2: line 1\nline 2\nline 3\n
  Stream:                 ^
After 2:  line 1\nline 2\nline 3\n
  Stream:                         ^

Now, you'd think the stream would mark eof to indicate the end of the file, right? Nope! This is because getline sets eof if the end-of-file marker is reached "during it's operation". Because getline terminates when it reaches \n, the end-of-file marker isn't read, and eof isn't flagged. Thus, myfile.eof() returns false, and the loop goes through another iteration:

Before 3: line 1\nline 2\nline 3\n
  Stream:                         ^
After 3:  line 1\nline 2\nline 3\n
  Stream:                         ^ EOF

How do you fix this? Instead of checking for eof(), see if .peek() returns EOF:

while(myfile.peek() != EOF){
    getline ...

You can also check the return value of getline (implicitly casting to bool):

while(getline(myfile,line)){
    cout<< ...
strager
The return value of `getline` resolves to `false` in a bool context when it fails; *i.e.*, it tried to read past the EOF.
greyfade
@greyfade, This is correct, but that doesn't change anything I've said. I did add an example of using that at the end, though.
strager
+6  A: 

Your hack of decrementing the count at the end is exactly that -- a hack.

Far better to write your loop correctly in the first place, so it doesn't count the last line twice.

int main() { 
    int number_of_lines = 0;
    std::string line;
    std::ifstream myfile("textexample.txt");

    while (std::getline(myfile, line))
        ++number_of_lines;
    std::cout << "Number of lines in text file: " << number_of_lines;
    return 0;
}

Personally, I think in this case, C-style code is perfectly acceptable:

int main() {
    unsigned int number_of_lines = 0;
    FILE *infile = fopen("textexample.txt", "r");
    int ch;

    while (EOF != (ch=getc(infile)))
        if ('\n' == ch)
            ++number_of_lines;
    printf("%u\n", number_of_lines);
    return 0;
}

Edit: Of course, C++ will also let you do something a bit similar:

int main() {
    std::ifstream myfile("textexample.txt");

    // new lines will be skipped unless we stop it from happening:    
    myfile.unsetf(std::ios_base::skipws);

    // count the newlines with an algorithm specialized for counting:
    unsigned line_count = std::count(
        std::istream_iterator<char>(myfile),
        std::istream_iterator<char>(), 
        '\n');

    std::cout << "Lines: " << line_count << "\n";
    return 0;
}
Jerry Coffin
Lest it go unnoticed: thanks for the correction @strager.
Jerry Coffin
+1 for the C-style code. I would suggest doing block reads rather than using `fgetc`, since a function call to read every character incurs a rather high overhead.
casablanca
@casablanca: oops -- should have been `getc`, which is typically implemented as a macro.
Jerry Coffin