ansaurus

Question

Answer 1

+1 A:

Carriage return, line feed - on Windows?

It would help if we knew how you opened the file. On Windows, if you opened it as a text file, the C library translates each CRLF pair into a single newline, so you should see just one extra character for the newline, not two. However, if you opened it as a binary file, it should indeed read both the CR and the LF.

If you are on Linux, as indicated in the comments, then we have more diagnostic tools available. Perhaps the simplest to start with is 'od -c file'; this will show you exactly what is in the file. Note that if the file was ever on a Windows box, it could still have CRLF line endings. If you use 'vim', it might tell you the file type is '[dos]'.

Alternatively, you could print the characters as integers (as well as characters):

printf("%d (%2d) %c\n", i, line[i], line[i]);

You should see 49 for '1', 50 for '2', 44 for ',', 10 for newline (LF, '\n'), and something else too - that is the mystery (but it would show 13 for CR).
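The one-line printf above can be wrapped into a small helper. This is only a sketch: the name `dump_line` is made up for illustration, and it prints CR and LF as escapes rather than with %c so that a stray CR cannot move the cursor and hide itself on the terminal.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the original answer): print every byte of
   `line` as index, decimal code, and a readable form.
   Returns the number of CR (13) bytes found. */
int dump_line(const char *line, FILE *out)
{
    assert(line != NULL && out != NULL);
    int cr_count = 0;
    for (size_t i = 0; line[i] != '\0'; i++) {
        unsigned char c = (unsigned char)line[i];
        if (c == '\r') {
            cr_count++;
            fprintf(out, "%zu (%3d) \\r\n", i, c);   /* carriage return */
        } else if (c == '\n') {
            fprintf(out, "%zu (%3d) \\n\n", i, c);   /* line feed */
        } else {
            fprintf(out, "%zu (%3d) %c\n", i, c, c);
        }
    }
    return cr_count;
}
```

Calling `dump_line(line, stdout)` after each successful fgets() would list a code 13 entry on every line that came from a CRLF file.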

CR is the \r character in C source. It was used to indicate that the print head should move back to the start of the line (printer carriage returned to start of line); the LF or line feed scrolled the paper up a line. Windows and MS-DOS both use the CRLF (curr-liff) sequence to indicate an end of line; Unix always used just LF, aka newline or NL; MacOS 9 and earlier used just CR.

Jonathan Leffler 2010-08-21 21:28:20

I don't really know what a carriage return is... I'm running on Linux.

rob 2010-08-21 21:29:41

OK - on Linux, it is much less likely to be the issue. I'll add some suggestions to my answer.

Jonathan Leffler 2010-08-21 21:32:06

Updated the question with the code that opens the stream... however, you are all right. Thanks for the answers.

rob 2010-08-21 21:38:29

You are right: `1 , 1 \r \n 2 , 2 \r \n` is what I got from od -c. Is there a way to bypass it? A function that won't read this from the file?

rob 2010-08-21 21:45:49

@Mike: there's no easy way to avoid reading the characters. It is however entirely legitimate to simply ignore them. That is what most people would do. Or simply zap the line ending: newline or, in this case, CRLF at the end of each line before further processing. (Perl has a `chomp` statement that does exactly this.)
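A minimal sketch of zapping the line ending in C, assuming the line was read with fgets() into a writable buffer. The function name `chomp` is borrowed from Perl for illustration only; it is not a standard C function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: strip any trailing CR and LF characters from `line`
   in place, so both "1,1\n" and "1,1\r\n" become "1,1".
   Returns the new length of the string. */
size_t chomp(char *line)
{
    assert(line != NULL);
    size_t len = strlen(line);
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r')) {
        line[--len] = '\0';
    }
    return len;
}
```

It would typically be called right after each successful `fgets(line, sizeof line, fp)`, before the line is parsed any further.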

Jonathan Leffler 2010-08-21 21:48:42

rob 2010-08-21 21:52:43

@Mike: it depends on how much control you have over the source of the data, and/or how much you trust the source. But basically, yes - you only need to worry about CR and LF line endings in the ISO 8859-1 or 8859-15 code sets, or in Unicode (UTF-8). And you only need to worry about the CR (13, control-M, etc) if the data is coming from or via a Windows-based machine, which it appears to be.

Jonathan Leffler 2010-08-21 21:57:54

Answer 2

+1 A:

Don't print with %c; print with %d and you will see each character's ASCII code. You will find that the extra characters are carriage return and line feed: 13 and 10.

Refer to http://www.asciitable.com/

Keith Nicholas 2010-08-21 21:33:42

Answer 3

+1 A:

I think your input file contains:

1,1
[SPACESPACE]
2,2
[SPACESPACE]

So the first time, fgets reads:

line{'1',',','1','',''}

and the second time it reads:

line{'2',',','2','',''}

That's why you are getting the output you described.

Sadat 2010-08-21 21:36:24


fgets reads too many chars than exists
