views:

31753

answers:

8

How to split string based on delimiter in bash?

I have this string stored in a variable:

IN="[email protected];[email protected]"

Now I would like to split the string on the ';' delimiter so that I have:

ADDR1="[email protected]"
ADDR2="[email protected]"

I don't necessarily need separate ADDR1 and ADDR2 variables; if they are elements of an array, that's even better.

Edit: After suggestions from the answers below, I ended up with the following, which is what I was after:

#!/usr/bin/env bash

IN="[email protected];[email protected]"

arr=$(echo "$IN" | tr ";" "\n")

for x in $arr
do
    echo "> [$x]"
done

output:

> [[email protected]]
> [[email protected]]

Edit2: There was a solution that involved setting IFS to ';', but I'm not sure what happened to that answer. How do you reset IFS back to its default?

Edit3: Regarding the IFS solution: I tried this and it works. I save the old IFS and then restore it:

IN="[email protected];[email protected]"

OIFS=$IFS
IFS=';'
arr2=$IN
for x in $arr2
do
    echo "> [$x]"
done

IFS=$OIFS

Btw, when I tried

arr2=($IN)

I only got the first string when printing it in the loop; without the parentheses around $IN it works.
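
A minimal sketch of why the parenthesized form seemed to fail, assuming bash and made-up sample addresses: arr2=($IN) does create a real array, but a bare $arr2 expands only to its first element. Looping over "${arr2[@]}" visits every element.

```shell
#!/usr/bin/env bash
# Hypothetical sample addresses, used only for illustration.
IN="alice@example.com;bob@example.com"

OIFS=$IFS
IFS=';'
arr2=($IN)        # unquoted on purpose: $IN is split on ';' into array elements
IFS=$OIFS

# A bare $arr2 means ${arr2[0]}, so "for x in $arr2" would loop only once.
# "${arr2[@]}" expands to every element, one word per element:
for x in "${arr2[@]}"
do
    echo "> [$x]"
done
```

With arr2=$IN (no parentheses) the variable is a plain string, and the split happens later, when the unquoted $arr2 is expanded in the for loop while IFS is still ';'.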

+8  A: 

If you don't mind processing them immediately, I like to do this:

for i in $(echo $IN | tr ";" "\n")
do
  # process
done

You could use this kind of loop to initialize an array, but there's probably an easier way to do it. Hope this helps, though.

Chris Lutz
I tried using IFS=';' ADDR=($IN), but I'm not sure how IFS behaves afterwards, so I removed my answer :( Giving you +1 though, since I like it.
Johannes Schaub - litb
You should have kept the IFS answer. It taught me something I didn't know, and it definitely made an array, whereas this just makes a cheap substitute.
Chris Lutz
I see. Yeah, I find that by doing these silly experiments I learn new things each time I try to answer something. I've edited my answer based on #bash IRC feedback and undeleted it :)
Johannes Schaub - litb
-1, you're obviously not aware of word splitting, because it's introducing two bugs in your code. One is that you don't quote $IN, and the other is that you pretend a newline is the only delimiter used in word splitting. You are iterating over every WORD in IN, not every line, and DEFINITELY not every element delimited by a semicolon, though it may have the side effect of looking like it works.
lhunath
I would listen to you more closely, lhunath, and understand that you're right, if you weren't being as much of a jerk about it. While this is certainly not perfect (and I know that - I've seen word splitting before), if you know you're working with a list of semicolon-separated email addresses, good enough is often better than technically correct. I upvoted the IFS answer (and even recommended that it be undeleted) because it's a better answer, but for the OP's problem this was good enough. There's a reason for the phrase "good enough."
Chris Lutz
You could change it to echo "$IN" | tr ';' '\n' | while read -r ADDY; do # process "$ADDY"; done to make him happy, I think :) Note that this will fork, so you can't change outer variables from within the loop (that's why I used the <<< "$IN" syntax).
Johannes Schaub - litb
@Chris: People being satisfied with "good enough" is the reason why 99.9% of all bash scripts in existence are a danger to anyone using them because of race conditions, bugs and security issues. "Good enough" is not good enough, and definitely not recommended (and any advice given is a recommendation to the one asking).
lhunath
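
To make the word-splitting point above concrete, here is a small sketch, assuming bash and a made-up input whose first entry contains spaces. It counts iterations for the unquoted for loop and for a line-by-line while read loop; the here-string keeps the while loop in the current shell so the counter survives.

```shell
#!/usr/bin/env bash
# Hypothetical input; the first entry contains spaces.
IN="Alice Smith <alice@example.com>;bob@example.com"

# Unquoted command substitution: the result is split on spaces, tabs and
# newlines, so the first entry is broken into three words.
words=0
for i in $(echo "$IN" | tr ';' '\n')
do
    words=$((words + 1))
done

# Line-by-line read: one iteration per ';'-separated entry.
entries=0
while IFS= read -r addy
do
    entries=$((entries + 1))
done <<< "$(echo "$IN" | tr ';' '\n')"

echo "for loop saw $words words, while loop saw $entries entries"
# prints: for loop saw 4 words, while loop saw 2 entries
```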
+3  A: 
echo "[email protected];[email protected]" | sed -e 's/;/\n/g'
[email protected]
[email protected]
lothar
I think this is good as well, I took the first suggestion using tr
stefanB
+7  A: 

You can set the IFS variable and then let read parse the string into an array. When the assignment is placed in front of a command, it takes effect only in that single command's environment (here, read's). read then splits the input according to the value of IFS into an array, which we can iterate over.

IFS=';' read -ra ADDR <<< "$IN"
for i in "${ADDR[@]}"; do 
    # process "$i"
done

This parses one line of items separated by ; and stores them in an array. To process the whole of $IN, one line of ;-separated input at a time:

while IFS=';' read -ra ADDR; do
    for i in "${ADDR[@]}"; do
        # process "$i"
    done
done <<< "$IN"
Johannes Schaub - litb
This is probably the best way. How long will IFS persist in its current value? Can it mess up my code by being set when it shouldn't be, and how can I reset it when I'm done with it?
Chris Lutz
Now, after the fix was applied, only for the duration of the read command :)
Johannes Schaub - litb
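
A quick way to check the scoping for yourself, as a sketch assuming bash and made-up sample addresses: save IFS, run the read with the prefixed assignment, and compare afterwards.

```shell
#!/usr/bin/env bash
IN="alice@example.com;bob@example.com"   # hypothetical sample data

before=$IFS
IFS=';' read -ra ADDR <<< "$IN"          # IFS=';' applies to this read only

if [ "$IFS" = "$before" ]; then
    echo "IFS is unchanged after read"
fi
echo "ADDR has ${#ADDR[@]} elements"
# prints:
# IFS is unchanged after read
# ADDR has 2 elements
```

By contrast, IFS=';' ADDR=($IN) is two plain assignments with no command after them, so there IFS stays changed until you set it back.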
I knew there was a way with arrays, just couldn't remember what it was. I like setting IFS, but I'm not sure about redirecting from $IN and going through read just to populate the array. Isn't just restoring IFS easier? Anyway, +1 for the IFS suggestion, thanks.
stefanB
I didn't like this saved="$IFS"; IFS=';'; ADDR=($IN); IFS="$saved" mess. :)
Johannes Schaub - litb
You can read everything at once without using a while loop: read -r -d '' -a addr <<< "$in" # The -d '' is key here: it tells read not to stop at the first newline (which is the default -d) but to continue until EOF or a NUL byte (which only occurs in binary data).
lhunath
lhunath, ah, nice idea :) However, when I say "-d ''", it always adds a linefeed as the last element of the array. I don't know why that is :(
Johannes Schaub - litb
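
The stray linefeed comes from the here-string: bash appends a newline to it, and with -d '' read no longer stops there, so the newline is kept as data. A sketch assuming bash and made-up sample addresses; feeding the input through printf '%s' with process substitution is one possible workaround, not the only one.

```shell
#!/usr/bin/env bash
IN="alice@example.com;bob@example.com"   # hypothetical sample data

# Here-string: the appended newline ends up inside the last element.
# read returns non-zero because it reaches EOF before any NUL delimiter,
# hence the "|| true".
IFS=';' read -r -d '' -a with_nl <<< "$IN" || true

# printf '%s' writes no trailing newline, so the last element stays clean.
IFS=';' read -r -d '' -a without_nl < <(printf '%s' "$IN") || true

printf 'here-string: [%s]\n' "${with_nl[1]}"      # closing bracket lands on the next line
printf 'printf:      [%s]\n' "${without_nl[1]}"
```

If the input ends with a ; then the newline shows up as an extra final element of its own instead of being glued to the last address.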
+3  A: 

How about this approach:

IN="[email protected];[email protected]" 
set -- "$IN" 
IFS=";"; declare -a Array=($*) 
echo "${Array[@]}" 
echo "${Array[0]}" 
echo "${Array[1]}"

Taken from:

antonolsen.com/2006/04/10/bash-split-a-string-without-cut-or-awk/

+1, nice, I like that it does not use any external tools
stefanB
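
One caveat with the snippet above: IFS=";" is a standalone assignment, so it stays in effect for the rest of the script. A possible way around that, sketched here with a hypothetical helper name (split_semicolons) and made-up sample addresses, is to do the split inside a function with a local IFS.

```shell
#!/usr/bin/env bash
# split_semicolons is a hypothetical helper, not part of the original answer.
split_semicolons() {
    local IFS=';'     # restored automatically when the function returns
    Array=($1)        # unquoted on purpose so the value is split on ';'
}

IN="alice@example.com;bob@example.com"   # hypothetical sample data
before=$IFS
split_semicolons "$IN"

echo "${Array[0]}"
echo "${Array[1]}"
if [ "$IFS" = "$before" ]; then
    echo "IFS restored"
fi
```

Like the original, this still relies on an unquoted expansion, so an element containing *, ? or [ would also be subject to filename expansion.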
A: 

How about this one-liner, if you're not using arrays:

IFS=';' read -r ADDR1 ADDR2 <<< "$IN"
Darron
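
One thing to be aware of with fixed variable names, shown in a sketch that assumes bash and three made-up addresses: when there are more fields than variables, the last variable receives the rest of the line, delimiters included.

```shell
#!/usr/bin/env bash
# Hypothetical input with three entries but only two target variables.
IN="alice@example.com;bob@example.com;carol@example.com"

IFS=';' read -r ADDR1 ADDR2 <<< "$IN"

echo "ADDR1=$ADDR1"
echo "ADDR2=$ADDR2"
# prints:
# ADDR1=alice@example.com
# ADDR2=bob@example.com;carol@example.com
```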
A: 

You may also define the array directly and loop over it:

dirList=(
some
list
of
elements
)

for i in "${dirList[@]}"; do
...
done
dmilith
A: 

This works:

echo "one two three four" | { read FIRST REST ; echo "{$FIRST} {$REST}"; }

But this does not:

echo "one two three four" | read FIRST REST ; echo "{$FIRST} {$REST}"; 

How is this possible?

Andor
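
The difference is that bash, by default, runs each part of a pipeline in a subshell. In the first form the braces put read and echo in the same subshell; in the second, read sets the variables in a subshell that exits before the echo runs in the parent shell. A sketch assuming bash with default options (lastpipe not enabled), with a here-string as one way to keep read in the current shell:

```shell
#!/usr/bin/env bash
FIRST="" REST=""

# read runs in a subshell here, so the parent's variables stay empty.
echo "one two three four" | read -r FIRST REST
after_pipe="{$FIRST} {$REST}"
echo "after pipe:        $after_pipe"

# A here-string keeps read in the current shell, so the values persist.
read -r FIRST REST <<< "one two three four"
after_here="{$FIRST} {$REST}"
echo "after here-string: $after_here"
# prints:
# after pipe:        {} {}
# after here-string: {one} {two three four}
```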