Hi, I have put together this one-liner that prints each word in a file on its own line:

sed -e 's/[^a-zA-Z]/\n/g' test_input | grep -v "^$"

If test_input contains "My bike is fast and clean", the one-liner's output will be:
My
bike
is
fast
and
clean

What I need now is a variant that prints every pair of adjacent words (2-word terms) in the text, like this (still in Bash):
My bike
bike is
is fast
fast and
and clean

Would you know how to do it?

+2  A: 

Pipe your word file to this script's standard input.

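#!/bin/bash
# Print each adjacent pair of words; expects one word per line on stdin.
last_word=""
while read -r word
do
  # Quote the variable so the test does not break while it is still empty.
  if [ -n "$last_word" ] ; then
      echo "$last_word $word"
  fi
  last_word=$word
done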
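
As a rough sketch of how the loop above could be wired into a full pipeline, the version below wraps it in a function and feeds it one word per line. The function name `print_pairs` is made up for illustration, and `tr -cs` is swapped in for the sed one-liner from the question as a more portable way to split on non-letters.

```shell
#!/bin/bash
# print_pairs is a hypothetical helper name, not part of the original answer.
# It reads one word per line on stdin and prints each adjacent pair.
print_pairs() {
  local last_word="" word
  while read -r word; do
    if [ -n "$last_word" ]; then
      echo "$last_word $word"
    fi
    last_word=$word
  done
}

# tr -cs turns every run of non-letters into a single newline (used here
# instead of the sed one-liner); grep drops any empty line that remains.
printf 'My bike is fast and clean\n' | tr -cs 'a-zA-Z' '\n' | grep -v '^$' | print_pairs
```

With a single-word input the loop never has a previous word to pair with, so it prints nothing.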
mobrule
A: 

This probably requires GNU sed and there's probably a simpler way:

sed 's/[[:blank:]]*\<\(\w\+\)\>/\1 \1\n/g; s/[^ ]* \([^\n]*\)\n\([^ ]*\)/\1 \2\n/g; s/ \n//; s/\n[^ ]\+$//' inputfile
Dennis Williamson
+1  A: 

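This also works if test.dat holds one word per line (head -n -1 needs GNU head; -d ' ' joins the columns with a space instead of paste's default tab):

paste -d ' ' <(head -n -1 test.dat) <(tail -n +2 test.dat)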
fgm
A: 

Append this to your command:

| awk '(PREV!="") {printf "%s %s\n", PREV, $1} {PREV=$1}'
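
For reference, the complete pipeline might look like the sketch below. It assumes GNU sed (where `\n` in the replacement inserts a newline); the function name `word_pairs` and the temporary input file are only there to make the example self-contained.

```shell
#!/bin/bash
# word_pairs is a hypothetical wrapper name; $1 is the input file.
# Assumes GNU sed, where \n in the replacement text means a newline.
word_pairs() {
  sed -e 's/[^a-zA-Z]/\n/g' "$1" | grep -v '^$' |
    awk '(PREV!="") {printf "%s %s\n", PREV, $1} {PREV=$1}'
}

# Stand-in for test_input from the question.
input=$(mktemp)
printf 'My bike is fast and clean\n' > "$input"
word_pairs "$input"
rm -f "$input"
```

Because awk sees one word per line here, PREV simply carries the previous line's word forward, and the first word is skipped since PREV starts out empty.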
depesz
+1  A: 

Use awk for this; nothing else is needed.

$ echo "My bike is fast and clean" | awk '{for(i=1;i<NF;i++){printf "%s %s\n",$i,$(i+1) } }'
My bike
bike is
is fast
fast and
and clean
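
Note that the loop above pairs words within each input line only. If the text spans several lines and pairs should also cross line breaks, one possible variant is to remember the last word seen, as in this sketch. The function name `bigrams` and the variable `prev` are illustrative, and words are split on whitespace only, so punctuation stays attached to its word.

```shell
#!/bin/bash
# bigrams is a hypothetical helper name. It prints every adjacent word pair,
# including pairs that straddle a line break, by carrying prev across lines.
bigrams() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if (prev != "") printf "%s %s\n", prev, $i
      prev = $i
    }
  }' "$@"
}

# The pair "is fast" spans the line break in this two-line input.
printf 'My bike is\nfast and clean\n' | bigrams
```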
ghostdog74
sweet and simple:)
Vijay Sarathi