I have data that looks like this (FASTA format). Note that it comes in blocks of two lines: a ">" header and the sequence.

>SRR018006
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGN
>SRR018006
ACCCGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

What I want to do is append some text (e.g. "foo") to each ">" header, yielding:

>SRR018006-foo
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGN
>SRR018006-foo
ACCCGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

Is there a way to do that using sed? Preferably modifying the original file in place.

+4  A: 

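This will do what you're looking for, editing the file in place (GNU sed). Note that -i and -e are kept separate, since -ie would be read as -i with a backup suffix of "e":

sed -i -e 's/^\(>.*\)/\1-foo/' file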
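
If you want to keep a copy of the original while editing in place, here is a minimal sketch. It assumes the suffix is attached directly to -i (the -i.bak form, which as far as I know both GNU and BSD sed accept). The file name reads.fa and the scratch directory are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so no real data is touched
# (assumption: mktemp -d is available on your system).
workdir=$(mktemp -d)
fasta="$workdir/reads.fa"   # hypothetical file name

# Recreate the sample from the question.
printf '%s\n' \
  '>SRR018006' \
  'NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGN' \
  '>SRR018006' \
  'ACCCGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC' > "$fasta"

# Append "-foo" to header lines; the untouched original is saved as reads.fa.bak.
sed -i.bak -e 's/^\(>.*\)/\1-foo/' "$fasta"

# Show the edited file.
cat "$fasta"
```

If the result looks wrong, the original can be restored by moving reads.fa.bak back over reads.fa.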
EmFi
+3  A: 

Judging from your previous post, you are also experienced with awk, so here's an awk solution:

# awk '/^>/{print $0"-foo";next}1' file
>SRR018006-foo
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGN
>SRR018006-foo
ACCCGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

# awk '/^>/{print $0"-foo";next}1' file > temp
# mv temp file
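
A slightly more defensive variant of the temp-file step, as a sketch only: it assumes mktemp is available, and the file name reads.fa and the scratch directory are hypothetical. Chaining with && means the original is only replaced if awk exited successfully.

```shell
#!/bin/sh
# Scratch directory and sample file (both hypothetical, for illustration only).
workdir=$(mktemp -d)
fasta="$workdir/reads.fa"

printf '%s\n' \
  '>SRR018006' \
  'NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGN' \
  '>SRR018006' \
  'ACCCGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC' > "$fasta"

# Write to a uniquely named temp file in the same directory,
# then replace the original only if awk succeeded.
tmp=$(mktemp "$workdir/reads.XXXXXX")
awk '/^>/{print $0 "-foo"; next} 1' "$fasta" > "$tmp" && mv "$tmp" "$fasta"

# Show the edited file.
cat "$fasta"
```

Keeping the temp file in the same directory as the target means the mv is a rename on the same filesystem rather than a copy.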

If you insist on sed (with GNU sed, add -i to edit the file in place):

# sed -e '/^>/s/$/-foo/' file
ghostdog74
Despite your negative tone, I'd say your suggestion with sed is much cleaner than the other two using awk, and probably much more efficient.
Idelic