I'm writing a program which is supposed to read two strings that can contain line breaks and various other characters. Therefore, I'm using EOF (Ctrl-Z or Ctrl-D) to end the string.

This works fine for the first variable, but not for the second: apparently something is stuck in the input buffer, and the user doesn't get to type anything.

I tried to clear the buffer with while (getchar() != '\n'); and several similar variations, but nothing seems to help. Every attempt results in an infinite loop, and without clearing the buffer, entering the second variable is impossible.

The characters for both variables are read in a loop like this: while((c = getchar()) != EOF), which would suggest that EOF is what is stuck in my buffer. Or does it affect the behavior of the program in some other way? Is there something wrong with the logic I'm using?

I'm starting to get a bit desperate after struggling with this for hours.

[edit: added code below]

[edit 2: clearerr() seems to make this EOF solution work after all.

It seems to run in its original form, as I intended, under Linux; I was trying it on Windows yesterday.]

code:

#include <stdio.h>
#include <string.h>

int main(void)
{
    int x = 0;
    int c;
    char a[100];
    char b[100];

    printf("Enter a: ");
    while((c = getchar()) != EOF)
    {
        a[x] = c;
        x++;
    }
    a[x] = '\0';
    x = 0;

    /*while (getchar() != '\n'); - the non-working loop*/

    printf("\nEnter b: ");
    while((c = getchar()) != EOF)
    {
        b[x] = c;
        x++;
    }
    b[x] = '\0';

    printf("\n\nResults:\na: %s\n", a);
    printf("b: %s\n", b);

    return(0);
}

[edit 3:]

Dynamic memory issue:

My program is also supposed to handle strings longer than 100 characters. Originally I intended to solve that by dynamic memory allocation, but when I had problems with the infinite loop described above and memory-related crashes I left it out and switched over to char[100].

I think what I tried was generally something like this:

while((c = getchar()) != EOF)
{
  a = malloc(sizeof(char));
  a[x] = c;
  x++;
}

Is that a possible (or sensible) way to do that? I'm trying to allocate more memory for every character that's being handled there, individually. With code like that (this example probably contains syntax errors) I experienced crashes, so it looks to me like malloc might not be the right function here, or I'm using it wrong. Supposing it's even possible.
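
A note on the snippet above: calling malloc(sizeof(char)) inside the loop returns a fresh one-byte block on every iteration and abandons the previous one, so a[x] indexes past the end of the block as soon as x is 1 or more. A common alternative is to keep one buffer and grow it with realloc. The following is only a minimal sketch of that idea; the helper name read_all, its parameters, and the starting capacity of 16 are assumptions for illustration, not part of the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads every character from `in` until EOF into a
 * heap buffer that grows as needed. Returns a NUL-terminated string that
 * the caller must free(), or NULL if an allocation fails. If `len_out`
 * is not NULL, it receives the number of characters read. */
char *read_all(FILE *in, size_t *len_out)
{
    assert(in != NULL);

    size_t cap = 16;            /* current capacity, doubled when full */
    size_t len = 0;             /* characters stored so far */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int c;
    while ((c = getc(in)) != EOF) {
        if (len + 1 >= cap) {   /* always keep one byte for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

Called as char *a = read_all(stdin, NULL); this would replace the fixed char a[100]; the result of realloc goes into a temporary first so the old buffer is not lost if the call fails.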

+8  A: 

After you have received EOF from the terminal, you will not receive any additional data. There is no way of un-EOF-ing the input - the end of the file is, well, the end.

So you should define that each variable is input on a separate line, and have users press Enter instead of EOF. You still need to check whether you have received EOF, because that means that the user actually typed EOF, and you won't see anything else - in this case, you need to break out of the loop and print an error message.

Martin v. Löwis
Okay, so EOF can't be used like I intended. Thanks. Being able to add several lines into the same variable is quite important here; is there any sensible way to do that?
Arcthae
There are several conventions: a) an empty line (double Enter) terminates the input; this should work fine unless your multi-line input should also allow for empty lines. b) some stop character (often ".", e.g. in SMTP) ends the input; the assumption is that this is unlikely to occur in real text.
Martin v. Löwis
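Convention (b) from the comment above could look roughly like the following. This is a minimal sketch, not a definitive implementation: the helper name read_until_dot, the fixed 256-byte line buffer, and the choice to skip lines that do not fit in the destination are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends lines from `in` to `dst` until a line
 * consisting only of "." is read, or until EOF. Returns 1 if the stop
 * line was seen, 0 if the input ended first. Lines that do not fit in
 * `dst` are skipped, and lines longer than the 256-byte chunk buffer
 * are not handled specially in this sketch. */
int read_until_dot(FILE *in, char *dst, size_t size)
{
    assert(in != NULL && dst != NULL && size > 0);

    char line[256];
    size_t used = 0;
    dst[0] = '\0';

    while (fgets(line, sizeof line, in) != NULL) {
        /* The stop line may or may not end with a newline. */
        if (strcmp(line, ".\n") == 0 || strcmp(line, ".") == 0)
            return 1;
        size_t n = strlen(line);
        if (used + n < size) {            /* append only if it fits */
            memcpy(dst + used, line, n + 1);
            used += n;
        }
    }
    return 0;
}
```

Calling it twice on stdin would fill the two variables in turn, with empty lines allowed inside each one.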
You could do something ... tricky ... and prevent normal EOF behavior (an EOF is still an EOF, but a Control-D need not send it, for instance). This is beyond the scope of C though.
pst
Two line breaks to terminate the input and a stop character both seem like somewhat hard ways to do this. Two line breaks could be part of the actual input, and so could a stop character, unless you pick one that wouldn't occur in the input... Those might be too hard for the users to type in, though. Maybe I'll need to rethink this. Thanks again.
Arcthae
A: 

What you are trying is fundamentally impossible with EOF.

Although it behaves like one in some ways, EOF is not a character in the stream but an environment-defined macro representing the end of the stream. I haven't seen your code, but I gather what you're doing is something like this:

while ((c=getchar()) != EOF) {
    // do something
}
while ((c=getchar()) != EOF) {
    // do something else
}

When you type the EOF character the first time, to end the first string, the stream is irrevocably closed. That is, the status of the stream is that it is closed.

Thus, the contents of the second while loop are never run.

Benji XVI
Added some code now. I saw some programs using it so I tried to use it here without knowing its true nature.
Arcthae
Yes, so your code is pretty much identical to what I expected. If this program only needs to run on the command line, my suggestion is to separate the strings with the null character. On my system (OS X) this is entered with Ctrl-@ (i.e. Ctrl-Shift-2 on this keyboard).
Benji XVI
A: 
David
Thanks for pointing it out. How do you enter null in the program?
Arcthae
The null character is just a byte with the value `0`, but it's usually entered using the escape code `\0`, which you can use inside string literals and character literals (although note that C will automatically put a null character at the end of any string that you define using a literal, because it uses this to signal the end of the string).
David
+1  A: 

EOF isn't a character - it's a special value that the input functions return to indicate a condition, that the "end of file" on that input stream has been reached. As Martin v. Löwis says, once that "end of file" condition occurs, it means that no more input will be available on that stream.

The confusion arises because:

  • Many terminal types recognize a special keystroke to signal "end of file" when the "file" is an interactive terminal (e.g. Ctrl-Z or Ctrl-D); and
  • The EOF value is one of the values that can be returned by the getchar() family of functions.

You will need to use an actual character value to separate the inputs - the ASCII nul character '\0' might be a good choice, if that can't appear as a valid value within the inputs themselves.

caf
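The separator idea from the answer above can be sketched as follows. This is a minimal sketch under stated assumptions: the helper name read_field and its parameters are hypothetical, and it assumes the terminal can actually deliver the chosen delimiter byte (for '\0', often Ctrl-@, which depends on the system).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies characters from `in` into `dst` until the
 * delimiter `delim` or EOF is read. Characters that do not fit in `dst`
 * are read and discarded. The result is always NUL-terminated.
 * Returns 1 if the delimiter was seen, 0 if the input ended first. */
int read_field(FILE *in, int delim, char *dst, size_t size)
{
    assert(in != NULL && dst != NULL && size > 0);

    size_t n = 0;
    int c;
    while ((c = getc(in)) != EOF && c != delim) {
        if (n + 1 < size)          /* keep one byte for '\0' */
            dst[n++] = (char)c;
    }
    dst[n] = '\0';
    return c == delim;
}
```

With this shape, read_field(stdin, '\0', a, sizeof a) followed by read_field(stdin, '\0', b, sizeof b) would read the two strings, and a return value of 0 tells the caller that the input ended before the separator arrived.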
Yep, I used it without actually knowing the details. Thank you, the nul character would be handy indeed, as it could be used to terminate the string as well, but when I tried several alternatives, I don't think I was able to produce it. How do you get it?
Arcthae
Arcthae: That depends on your system, but Ctrl-@ works on many terminals.
caf
A: 

Rather than stopping reading input at EOF -- which isn't a character -- stop at ENTER.

while((c = getchar()) != '\n')
{
    if (c == EOF)
        break; /* oops, something wrong, input terminated too soon! */
    a[x] = c;
    x++;
}

EOF is a signal that the input terminated. You're almost guaranteed that all inputs from the user end with '\n': that's the last key the user types!!!


Edit: you can still use Ctrl-D and clearerr() to reset the input stream.

#include <stdio.h>

int main(void) {
  char a[100], b[100];
  int c, k;

  printf("Enter a: "); fflush(stdout);
  k = 0;
  while ((k < 100) && ((c = getchar()) != EOF)) {
    a[k++] = c;
  }
  a[k] = 0;

  clearerr(stdin);

  printf("Enter b: "); fflush(stdout);
  k = 0;
  while ((k < 100) && ((c = getchar()) != EOF)) {
    b[k++] = c;
  }
  b[k] = 0;

  printf("a is [%s]; b is [%s]\n", a, b);
  return 0;
}
$ ./a.out
Enter a: two
lines (Ctrl+D right after the next ENTER)
Enter b: three
lines
now (ENTER + Ctrl+D)
a is [two
lines (Ctrl+D right after the next ENTER)
]; b is [three
lines
now (ENTER + Ctrl+D)
]
$
pmg
The point is to be able to enter input with line breaks, though. As was pointed out, I can try to find another character to use the way I intended to use EOF here, or maybe I should come up with a new approach to the problem. Your example of stopping at Enter could work too, but you'd need another loop and still a way to determine when the user is switching over to the second variable, which allows line breaks as well.
Arcthae
Hmmm ... `clearerr()` resets the stream :)
pmg
Thanks! So there is a way to do this the way I thought after all. :D Is resetting the whole stream kind of like shooting flies, though?
Arcthae
`getchar()` returns `EOF` when it tries to read after the end of file or if there was an error. `clearerr()` tells the program to ignore the end of file or error. Usually there's no point ignoring the signal: if the file ended, it won't magically have new data after ignoring `EOF` and if there was an error (network breaks, media removed, ...) `clearerr()` won't magically correct the error. For your specific problem, this should work -- but users cannot redirect input.
pmg
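The distinction drawn in the comment above (end of file versus a read error) can be checked with feof() and ferror() before deciding whether clearerr() makes sense. A minimal sketch, assuming a hypothetical helper named drain_and_reset that is not part of the answer's code:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads and counts characters from `in` until
 * getc() returns EOF. If that EOF came from a read error, the error
 * indicator is left set and -1 is returned. Otherwise the end-of-file
 * indicator is cleared so that a later read may try again, and the
 * number of characters read is returned. */
int drain_and_reset(FILE *in)
{
    assert(in != NULL);

    int c;
    int count = 0;
    while ((c = getc(in)) != EOF)
        count++;

    if (ferror(in))
        return -1;     /* a real error: clearerr() would not repair it */

    clearerr(in);      /* plain end of file: reset the indicator */
    return count;
}
```

On an interactive terminal the reset lets the next read wait for new keystrokes; on a redirected file the next read simply reports end of file again, which matches the caveat about redirected input.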
Okay. Thanks, I think this is indeed perfect for my program. Btw, I see you test if k is below 100 in your code. Originally I intended to solve that by dynamic memory allocation, but when I had problems with that infinite loop and memory-related crashes I left it out and switched over to char[100]. I tried to post example code of what I attempted with the memory allocation here, but kind of failed, so I edited my question once again. (Or should I make a new question? I'm kind of new to the site and its general policy.)
Arcthae
I think it's better to make a new question. And my snippet should really test for `k < 99` instead to allow for the following terminating zero.
pmg
A: 

I ran the code on my Linux box; here is the result:

Enter a: qwer
asdf<Ctrl-D><Ctrl-D>
Enter b: 123
456<Ctrl-D><Ctrl-D>

Results:
a: qwer
asdf
b: 123
456

Two Ctrl-Ds were needed because the terminal input buffer was not empty.

sambowry
Oh... This is odd. I tried it now on Linux myself and it works (with one `Ctrl-D` per variable). Yesterday I was compiling on Windows and it worked completely differently.
Arcthae
If you end the last line with a <Return>, you need only one <Ctrl-D>.
sambowry
A: 

How do you enter null in the program?

You can implement the -print0 function using:

putchar(0);

This will print an ASCII nul character '\0' to stdout.

hansu
I formatted the question a bit badly. I'm reading input from the user's keyboard, and I'm trying to get two strings that can contain line breaks into two variables, so I need something that the user can end each string with. I originally used EOF, and some people told me to switch to using the null character, but I wasn't able to produce that character with my keyboard, probably because of the environment (Windows). I tried `Ctrl-@` and pretty much everything else but didn't seem to get \0.
Arcthae