Hello all,

I'm trying to do something like

read -d EOF stdin

for word in $stdin; do stuff; done

where I want to replace 'EOF' with an actual representation of the end-of-file character.

Edit: Thanks for the answers, that was indeed what I was trying to do. I actually had a facepalm moment when I saw stdin=$(cat) lol

Just for kicks though, how would you go about matching something like a C-d (or C-v, M-v, etc.), basically just a character combined with Control, Alt, Shift, whatever, in bash?

+3  A: 

Two things...

The EOF character is represented by C-d (or C-v C-d if you want to type it), but to do what you're trying, it's better to do this:

while read line; do stuff "${line}"; done
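
A slightly more defensive version of that loop, as a sketch: `IFS=` and `-r` keep leading whitespace and backslashes intact, and the `|| [ -n "$line" ]` part also picks up a last line that has no trailing newline. The function name `process_lines` and the bracketed output are placeholders for `stuff`, not anything from the question.

```shell
#!/usr/bin/env bash
# Sketch: read stdin line by line until EOF.
# process_lines is a made-up name; the printf stands in for: stuff "${line}"
process_lines() {
    local line
    # read returns non-zero at EOF; the -n test keeps a final unterminated line.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '[%s]\n' "$line"
    done
}
```

The loop ends when `read` reports end of input, so no end-of-file byte has to be named anywhere.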

Daniel Nadasi
EOF is not a control-d. Control-d is just the most common setting for the keystroke to signal EOF.
Darron
+5  A: 

There isn't really an end-of-file character. When you press Ctrl-d or a similar key, the terminal driver signals to the reading application that the end of file has been reached, by returning an invalid value. The operating system does the same when you have reached the end of a file. This is done by using an integer instead of a byte (so you have a much wider range than 0..255, e.g. -2^31 .. 2^31-1 for a 32-bit int) and returning an out-of-range value, usually -1. But there is no character that would represent EOF, because its whole purpose is to not be a character. If you want to read everything from stdin, up until the end of file, try

stdin=$(cat)
for word in $stdin; do stuff; done

That will, however, read the whole standard input into the variable. You can get away with allocating memory for only one line by having read split the words of each line into an array:

while read -r -a array; do 
    for word in "${array[@]}"; do 
        stuff;
    done
done
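
The same loop as a runnable sketch, with `printf` standing in for `stuff` and wrapped in a function whose name, `print_words`, is only an illustration:

```shell
#!/usr/bin/env bash
# Sketch: print each whitespace-separated word of stdin on its own line.
# print_words is a hypothetical name; printf stands in for "stuff".
print_words() {
    local -a words
    local word
    # read -a splits one line on IFS into the array; the loop stops at EOF.
    while read -r -a words; do
        for word in "${words[@]}"; do
            printf '%s\n' "$word"
        done
    done
}
```

One difference from the unquoted `for word in $stdin` form: the words are not subject to pathname expansion here, so a word such as `*` stays a literal asterisk.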
Johannes Schaub - litb
In Unix, EOF *is* a character. See http://en.wikipedia.org/wiki/End-of-transmission_character
ashawley
you are wrong, while the article is right. it says "EOT is used to signal an end-of-file" and my answer says "When you press ... the terminal driver signals to the reading application that the end of file has been reached". That signalling could be done by EOT, but it's not "the EOF character"
Johannes Schaub - litb
rather, it's *one* way to trigger an EOF read. another way is for an operating system when the end of file has been reached... or for your toaster when there has not been any user interaction... you clearly mix up different concepts.
Johannes Schaub - litb
EOT/EOF is a "control character" in ASCII. See <http://en.wikipedia.org/wiki/ASCII>. Where is a reference of this "terminal driver" behavior you describe? It's new to me.
ashawley
the terminal driver is a piece of software that implements an interface consisting of read/write and so on, like any other driver. it's writing to/reading from a terminal or an emulator thereof ( http://en.wikipedia.org/wiki/Computer_terminal )
Johannes Schaub - litb
the EOT you refer to is only one way to trigger an EOF to the reading program. it's not "the eof character". as said, eof isn't a character. you can change what character triggers an EOF by saying "stty eof A" (or any other character). then, pressing "A" on the terminal will signal "EOF".
Johannes Schaub - litb
but that does not at all mean that now "A is eof". i was never in the situation where i had to write a terminal emulator or a driver for one, but i of course know the basic stuff about it. the driver, for example, usually implements the echoing (displaying characters that were pressed).
Johannes Schaub - litb
i see where you're confused. i can agree with you that true EOF is not a character. but it's not the terminal that manages it. what you're describing is still ^D (0x04). i think i've found the answer.
ashawley
Handling of the "true EOF" is done at a lower level by the operating system--on Unix-ish systems, this is the standard library. See <http://en.wikipedia.org/wiki/C_file_input/output>.
ashawley
So, there are two EOF values. In the context of Bash (or terminals as you tried to explain), there *is* an EOF character. Denying its existence confuses the question. Thanks for the discussion. I don't program in C, so I often need to remind myself how this works.
ashawley
i'm not confused about it. as you see in my answer i know what an eof is, and as you see in my other answers i know about C input/output functions :) as you see in this answer, i also know that the operating system can signal an eof when reading past the end of a file.
Johannes Schaub - litb
and EOF has no value. it's a condition. like "network communication ended", or "transmission ended", or "end of file reached" or whatever. anyway i still don't see the problem with my answer. i think i will never understand the problem with it :) please read it again, maybe you overlooked something
Johannes Schaub - litb
if it really contains any serious error or misinformation, we will find out some day. also like Darron comments on the answer of Daniel, he would probably also comment on the answer of yours. C-d is only a very common way to trigger an EOF condition. cheers :)
Johannes Schaub - litb
I guess my criticism isn't that you're wrong or confused, just that you mention a definition of EOF that doesn't apply in shell programming.
ashawley
+2  A: 

litb & Daniel are right, so I will just answer your "Just for kicks" question: Bash (like any command-line unix program in general) only sees characters as bytes. So you cannot match Alt-v; you will match whatever bytes are sent to you by the UI (pseudo-tty) that interprets these keypresses from the user. It can even be unix signals, not bytes at all. It will depend on the terminal program used, the user settings and all kinds of things, so I would advise you not to try to match them.

But if you know that your terminal sends C-v as the byte number 22 (0x16), you can use things like:

if test "$char" = '^V'; then...

by entering a real ^V char under your editor (C-q C-v under emacs, C-v C-v under an xterm , ...), not the two chars ^ and V
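
If pasting a raw control character into a script is awkward, bash's `$'...'` quoting can spell the same byte with an escape. A minimal sketch, still assuming the terminal delivers C-v as byte 0x16; the helper name `is_ctrl_v` is made up for the example:

```shell
#!/usr/bin/env bash
# Sketch: compare against byte 0x16 using bash's $'...' quoting,
# so no raw control character has to appear in the script.
# is_ctrl_v is a hypothetical helper name.
is_ctrl_v() {
    [ "$1" = $'\x16' ]
}
```

It would be used as `if is_ctrl_v "$char"; then ...`, where `$char` holds whatever single byte was read. The two-character string `^V` does not match.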

Colas Nahaboo
A: 

To find what a control character is, run

$ cat | od -b
^D
0000000 004 012
0000002

I typed ^V^D after issuing the command, then RET and another ^D (unquoted). The result is that EOF is octal 004.

Combining that result with read(1):

$ read -d "$(echo -e '\004')" stdin
foo
bar quuz^Hx
^D
$ echo "$stdin"
foo
bar quux
$ for word in $stdin; do echo $word; done
foo
bar
quux

Yes, I typed ^H above for backspace to see if read(1) did the right thing. It does.
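
The delimiter can also be written with bash's `$'...'` quoting instead of `echo -e`. A sketch wrapped in a hypothetical function, `read_until_eot`; with piped input the delimiter is simply the first 0x04 byte found in the data, and anything after it is left unread:

```shell
#!/usr/bin/env bash
# Sketch: collect stdin up to the first 0x04 byte, or up to EOF if none appears.
# read_until_eot is a hypothetical name.
read_until_eot() {
    local text
    # -d uses the first character of its argument as the delimiter.
    # read returns non-zero if EOF arrives before the delimiter,
    # but text is still filled in, hence the "|| true".
    read -r -d $'\004' text || true
    printf '%s\n' "$text"
}
```

For example, `printf 'foo\nbar quux\004ignored' | read_until_eot` prints the two lines before the 0x04 byte and drops `ignored`.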

ashawley
EOT is 0x04. EOF is nothing. just do ^D only on the terminal, and you see od displays no byte, precisely *because* EOF is no character but just a symbolic number. if it were as you say, how could we ever read a binary file containing 0x04??
Johannes Schaub - litb
I tried it, and it stops reading the file. At the user level, EOF is 0x04. At the programmer's level it's not a character, it's a value. You wouldn't be using read with a binary file, anyway. You should be programming in C.
ashawley