I have the following function:

void writeResults(FILE* fp, FILE* fpw, Vector w, int size) {
    Vector x;

    while (1) {
        char line[MAX_DIMENSION];  // MAX_DIMENSION is 200

        if (fgets(line, MAX_DIMENSION, fp) == NULL) {  // EOF
            return;
        }
        else {
            int i = 0;
            while (line[i] != '\0') {
                printf("% d %c\n", i, line[i]);  // print to check it
                i++;
            }
        }
    }
}

The file it reads contains:

1,1
2,2

However when I print each char until '\0' I get this output:

 0 1  
 1 ,  
 2 1  
 3   
 4   
 0 2  
 1 ,  
 2 2  
 3   
 4   

Does anyone have a clue as to why it reads the extra chars at indices 3 and 4? (There are no extra spaces in the file.)

Note: the file was opened in the following way:

FILE* fp = fopen(fileIn, "r");
if (fp == NULL) {
    perror("Couldn't open File");
    exit(errno);
}
+1  A: 

Carriage return, line feed - on Windows?

It would help if we knew how you opened the file. If you opened it as a text file, then you should not be seeing the two extra characters - just one for the newline. However, if you open it as a binary file, it should indeed read both the CR and the LF.

If you are on Linux, as indicated in the comments, then we have more diagnostic tools available. Perhaps the simplest to start with is 'od -c file'; this will show you exactly what is in the file. Note that if the file was ever on a Windows box, it could still have CRLF line endings. If you use 'vim', it might tell you the file type is '[dos]'.

Alternatively, you could print the characters as integers (as well as characters):

printf("%d (%2d) %c\n", i, line[i], line[i]);

You should see 49 for '1', 50 for '2', 44 for ',', 10 for newline (LF, '\n'), and something else too - that is the mystery (but it would show 13 for CR).
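
To make the invisible characters easier to spot, the same idea can be wrapped in a small helper. This is only a sketch: `describe_char` is a made-up name, not something from the question, and it assumes an ASCII-compatible character set so that CR is 13 and LF is 10.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: format "index (code) symbol" into out.
   CR and LF are written as the escapes \r and \n so they are visible;
   any other non-printable character is shown as '?'. */
static void describe_char(int i, char c, char *out, size_t outsize)
{
    unsigned char uc = (unsigned char)c;

    if (uc == '\r')
        snprintf(out, outsize, "%d (%3d) \\r", i, uc);
    else if (uc == '\n')
        snprintf(out, outsize, "%d (%3d) \\n", i, uc);
    else if (isprint(uc))
        snprintf(out, outsize, "%d (%3d) %c", i, uc, uc);
    else
        snprintf(out, outsize, "%d (%3d) ?", i, uc);
}
```

Called for every index of a line that fgets read from a file with CRLF endings, it would label index 3 as \r and index 4 as \n instead of printing what looks like two blanks.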


CR is the \r character in C source. It was used to indicate that the print head should move back to the start of the line (printer carriage returned to start of line); the LF or line feed scrolled the paper up a line. Windows and MS-DOS both use the CRLF (curr-liff) sequence to indicate an end of line; Unix always used just LF, aka newline or NL; MacOS 9 and earlier used just CR.

Jonathan Leffler
I don't really know what a carriage return is... I'm running on Linux.
rob
OK - on Linux, it is much less likely to be the issue. I'll add some suggestions to my answer.
Jonathan Leffler
Updated the question with how the stream is opened... however, you are all right. Thanks for the answers.
rob
You are right: 1 , 1 \r \n 2 , 2 \r \n is what I got from od -c. Is there a way to bypass it? A function that won't read this from the file?
rob
@Mike: there's no easy way to avoid reading the characters. It is, however, entirely legitimate to simply ignore them. That is what most people would do. Or simply zap the line ending (newline or, in this case, CRLF) at the end of each line before further processing. (Perl has a `chomp` function that does this for a trailing newline.)
Jonathan Leffler
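
A minimal sketch of the "zap the line ending" idea from the comment above, assuming the line was just read by fgets into a writable buffer. `strip_line_ending` is a hypothetical helper name, not part of the question's code; `strcspn` is the standard function from <string.h>.

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first CR or LF, so that
   "1,1\r\n" and "1,1\n" both become "1,1". A string with no line
   ending is left unchanged, because strcspn then returns its length
   and the assignment just rewrites the existing terminator. */
static void strip_line_ending(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';
}
```

In the question's loop, this would be called right after fgets succeeds, before walking the buffer; the inner loop then only sees the characters at indices 0 to 2.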
rob
@Mike: it depends on how much control you have over the source of the data, and/or how much you trust the source. But basically, yes - you only need to worry about CR and LF line endings in the ISO 8859-1 or 8859-15 code sets, or in Unicode (UTF-8). And you only need to worry about the CR (13, control-M, etc) if the data is coming from or via a Windows-based machine, which it appears to be.
Jonathan Leffler
+1  A: 

Don't print with %c; print with %d and you will see each character's ASCII code. You will find that the extra characters are carriage return and line feed: 13 and 10.

refer to http://www.asciitable.com/

Keith Nicholas
+1  A: 

I think your input file contains:

1,1
[SPACESPACE]
2,2
[SPACESPACE]

so the first time, fgets reads:

line{'1',',','1',' ',' '}

and the second time it reads:

line{'2',',','2',' ',' '}

That's why you are getting the output you described.

Sadat