Why does the following...

c=0; for i in $'1\n2\n3\n4'; do echo iteration $c :$i:; c=$[c+1]; done

print out...

iteration 0 :1 2 3 4:

and not

iteration 0 :1:
iteration 1 :2:
iteration 2 :3:
iteration 3 :4:

From what I understand, the $'STRING' syntax should allow me to specify a string with escape characters. Shouldn't "\n" be interpreted as a newline so that the for loop echoes four times, once for each line? Instead, it seems as if the newline is interpreted as a space character.

I took unwind's suggestion and tried setting $IFS. The results were the same.

IFS=$'\n'; c=0; for i in $'1\n2\n3\n4'; do echo iteration $c :$i:; c=$[c+1]; done; unset IFS;

iteration 0 :1 2 3 4:

William Pursell says in a comment that this did not work because IFS was being set to newline, but the following did not work either.

IFS=' '; c=0; for i in '1 2 3 4'; do echo iteration $c :$i:; c=$[c+1]; done; unset IFS;

iteration 0 :1 2 3 4:

Using IFS=' ' on a newline-separated string made things even messier...

IFS=' '; c=0; for i in $'1\n2\n3\n4'; do echo iteration $c :$i:; c=$[c+1]; done; unset IFS;

iteration 0 :1
2
3
4:

Setting IFS to '\n' rather than $'\n' had the same effect as IFS=' '...

IFS='\n'; c=0; for i in $'1\n2\n3\n4'; do echo iteration $c :$i:; c=$[c+1]; done; unset IFS;

iteration 0 :1
2
3
4:

There's only one iteration, but the newline is visible in the echo for some reason.

What did work is first storing the string in a variable then looping over the contents of the variable (without having to set IFS):

c=0; v=$'1\n2\n3\n4'; for i in $v; do echo iteration $c :$i:; c=$[c+1]; done

iteration 0 :1:
iteration 1 :2:
iteration 2 :3:
iteration 3 :4:

Which still does not explain why there is this problem.

Is there a pattern here? Is this the expected behavior of IFS as defined in unwind's link?

unwind's link states... "The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting."

I guess that explains why string literals are never split for for-loop iteration, no matter which escape characters are used. Splitting only happens when the literal is first assigned to a variable and that variable is then expanded unquoted. I guess the same goes for command substitution.

Examples:

Result of command substitution is split

c=0; for i in `echo $'1\n2\n3\n4'`; do echo iteration $c :$i:; c=$[c+1]; done

iteration 0 :1:
iteration 1 :2:
iteration 2 :3:
iteration 3 :4:

The portion of the string that came from an expansion is split; the rest is not.

c=0; v=$'1 \n\t2\t3 4'; for i in $v$'\n5\n6'; do echo iteration $c :$i:; c=$[c+1]; done

iteration 0 :1:
iteration 1 :2:
iteration 2 :3:
iteration 3 :4 5 6:

When expansion happens in double quotes, no splitting occurs.

c=0; v=$'1\n2\n3 4'; for i in "$v"; do echo iteration $c :$i:; c=$[c+1]; done

iteration 0 :1 2 3 4:

Any sequence of SPACE, TAB, and NEWLINE characters is used as a delimiter for splitting.

c=0; v=$'1 2\t3 \t\n4'; for i in $v; do echo iteration $c :$i:; c=$[c+1]; done

iteration 0 :1:
iteration 1 :2:
iteration 2 :3:
iteration 3 :4:

I will accept unwind's answer as his link yields the answer to my question.

I have no clue why the behavior of echo within the for loop changes with the value of IFS.
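
A likely explanation for that last point, as a minimal sketch rather than a definitive answer: the unquoted $i in the echo command is itself a parameter expansion, so the shell splits it using the current IFS before echo runs. The variable names s, default_ifs, and space_ifs below are only illustrative.

```shell
#!/usr/bin/env bash
# s holds a single word containing three real newline characters.
s=$'1\n2\n3\n4'

# Default IFS (space, tab, newline): the unquoted $s is split into four
# arguments, and echo joins its arguments with single spaces.
unset IFS
default_ifs=$(echo :$s:)

# IFS set to a space only: newlines are no longer delimiters, so echo
# receives one argument that still contains the newlines.
IFS=' '
space_ifs=$(echo :$s:)
unset IFS

printf '%s\n' "$default_ifs"   # one line:  :1 2 3 4:
printf '%s\n' "$space_ifs"     # four lines, from ":1" to "4:"
```

Quoting the expansion, as in echo ":$i:", keeps the newlines regardless of the IFS setting.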

EDIT: extended to clarify.

+3  A: 

Change your $IFS setting to change how bash splits text into words.

unwind
I've tried the following and it failed. I've edited my question to reflect this. IFS=$'\n'; for i in $'1\n2\n3\n4'; do echo $i; done; unset IFS;
EMPraptor
The problem is that IFS contains newline. Try IFS=' '
William Pursell
It puzzles me as to why IFS=' ' works when the string is $'1\n2\n3\n4' and not when it is '1 2 3 4'.
EMPraptor
AAAAAHHHH!!! IFS=' ' does NOT work on $'1\n2\n3\n4'. It somehow causes \n to become visible in the echo. See revised question for details.
EMPraptor
I accepted your answer because the link to specification of IFS contained the answer I sought.
EMPraptor
@EMPraptor: Glad it helped, although I confess I didn't quite realize the answer would be that complicated. That might be because I didn't know about the $'' construct.
unwind
+1  A: 

Two reasons:

  1. Your for loop loops only once: there is only one element to loop on, which is the $'1\n2\n3\n4' string. If you want to loop four times, you have to change $IFS, as suggested by unwind.

  2. The unquoted $i in the echo command is split by the shell into four arguments separated by newlines; echo then displays all of its arguments separated by spaces. If you don't want the string to be split before echo sees it, put it in double quotes, as in echo "$i".
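
The splitting in point 2 can be observed by counting arguments. In this sketch, which assumes the default IFS, the split is performed by the shell when it expands the unquoted variable, before the command runs. count_args is a hypothetical helper used only for illustration.

```shell
#!/usr/bin/env bash
unset IFS   # assumption: start from the default IFS (space, tab, newline)

# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

i=$'1\n2\n3\n4'

unquoted=$(count_args $i)    # the shell splits $i on IFS before the call
quoted=$(count_args "$i")    # double quotes suppress word splitting

echo "unquoted: $unquoted argument(s)"   # unquoted: 4 argument(s)
echo "quoted: $quoted argument(s)"       # quoted: 1 argument(s)
```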

Edit, after question edit:

  • I tried changing $IFS: it worked, but I used export IFS='\n'

  • In your second case, bash expands $v in the for command and splits it into four arguments separated by newlines. If you want to reproduce your first problem, just use for f in "$v" instead of for f in $v.

mouviciel
I've extended my question after unsuccessfully trying unwind's suggestion. I found the for loop does iterate over the same string as long as it is first stored in a variable - without messing with IFS.
EMPraptor
IFS='\n' also works. What doesn't work is IFS=$'\n'. I'm very very confused right now.
EMPraptor
I took unwind's and your suggestion to print more details, and it turns out IFS='\n' and export IFS='\n' only cause echo to print \n as newlines. Setting IFS this way did not cause the for loop to iterate 4 times.
EMPraptor
+3  A: 

Bash doesn't do word splitting on quoted strings in this context. For example:

$ for i in "a b c d"; do echo $i; done
a b c d

$ for i in a b c d; do echo $i; done
a
b
c
d

$ var="a b c d"; for i in "$var"; do echo $i; done
a b c d

$ var="a b c d"; for i in $var; do echo $i; done
a
b
c
d

In a comment, you stated "IFS='\n' also works. What doesn't work is IFS=$'\n'. I'm very very confused right now."

In IFS='\n', you're setting the separators (plural) to the two characters backslash and "n", not to a newline. Inserting an "X" in the middle of a "\n" shows what happens: the backslash and the "n" each act as a separator when the unquoted $i is expanded.

$ IFS='\n'; for i in $'a\Xnb\nc\n'; do echo $i; done; rrifs
a X b
c
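
To make the difference between the two quoting forms concrete, here is a minimal sketch. The variable names and the sample word "banana" are only illustrative.

```shell
#!/usr/bin/env bash
literal='\n'     # single quotes: a backslash followed by the letter n
newline=$'\n'    # ANSI-C quoting: one newline character

literal_len=${#literal}
newline_len=${#newline}
echo "single-quoted length: $literal_len"   # single-quoted length: 2
echo "ANSI-C quoted length: $newline_len"   # ANSI-C quoted length: 1

# With IFS set to backslash and n, every letter n acts as a field separator.
word=banana
IFS='\n'
set -- $word     # unquoted expansion, so it is split on the current IFS
unset IFS
field_count=$#
echo "fields: $field_count ($1 $2 $3)"      # fields: 3 (ba a a)
```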

Edit 2 (in response to the comment):

It sees '\n' in IFS as two characters (backslash and "n", not a newline). In $'a\Xnb\nc\n', "\X" is not a recognized escape, so the backslash stays literal, while each "\n" becomes a real newline. The for loop sees the quoted string as a single word rather than words delimited by $IFS, so it iterates once. The unquoted $i in echo is then split on the backslash and the "n", and the real newlines pass through to the output untouched.

Try these for further comparison:

$ c=0; for i in "a\nb\nc\n"; do echo -e "iteration $c :$i:"; c=$[c+1]; done
iteration 0 :a
b
c
:

$ c=0; for i in "a\nb\nc\n"; do echo "iteration $c :$i:"; c=$[c+1]; done
iteration 0 :a\nb\nc\n:

$ c=0; for i in a\\nb\\nc\\n; do echo -e "iteration $c :$i:"; c=$[c+1]; done
iteration 0 :a
b
c
:

$ c=0; for i in a\\nb\\nc\\n; do echo "iteration $c :$i:"; c=$[c+1]; done
iteration 0 :a\nb\nc\n:

Setting IFS has no effect on the above.

This works (note that $var is unquoted in the for statement):

$ var=$'a\nb\nc\n'
$ saveIFS="$IFS"   # it's important to save and restore $IFS
$ IFS=$'\n'        # set $IFS to a newline using $'\n' (not '\n')
$ c=0; for i in $var; do echo -e "iteration $c :$i:"; c=$[c+1]; done
iteration 0 :a:
iteration 1 :b:
iteration 2 :c:
$ IFS="$saveIFS"
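
One possible way to avoid the manual save and restore, sketched here under the assumption that the loop can live in a function: declare IFS as local, so the change ends when the function returns. split_lines is a hypothetical name, not something from the thread.

```shell
#!/usr/bin/env bash
# split_lines is a hypothetical helper: IFS is changed only inside the function.
split_lines() {
    local IFS=$'\n'   # scoped to this function; the caller's IFS is untouched
    local c=0 i
    for i in $1; do   # unquoted on purpose so the argument is split on newlines
        echo "iteration $c :$i:"
        c=$((c+1))
    done
}

var=$'a\nb\nc\n'
split_lines "$var"
# iteration 0 :a:
# iteration 1 :b:
# iteration 2 :c:
```

Note that the unquoted expansion is still subject to filename expansion, so this sketch assumes the lines contain no glob characters.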
Dennis Williamson
Try the following and let me know what you think is happening: IFS='\n'; c=0; for i in $'a\Xnb\nc\n'; do echo iteration $c :$i:; c=$[c+1]; done
EMPraptor
A: 

try

c=0; for i in $'1\\n2\\n3\\n4'; do echo -e iteration $c :$i:; c=$[c+1]; done

The extra backslashes preserve the escapes for the newlines, and echo -e tells echo to expand the escapes.
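
For reference, a small sketch of what that string actually contains and how often the loop body runs. The variable names are illustrative.

```shell
#!/usr/bin/env bash
# Inside $'...', each \\ becomes one literal backslash, so the result
# contains backslash-n pairs and no real newline characters.
s=$'1\\n2\\n3\\n4'
len=${#s}
echo "length: $len"       # length: 10

# The literal is still a single word, so the loop body runs once;
# echo -e would only change how that one word is displayed.
c=0
for i in $'1\\n2\\n3\\n4'; do
    c=$((c+1))
done
echo "iterations: $c"     # iterations: 1
```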

Matt Willtrout
Sorry. Didn't work for me. And it shouldn't work, since the bash specification says it won't.
EMPraptor