I've been trying to implement a bash script that reads from WordNet's online database, and I have been wondering if there is a way to remove a variety of lines from a text file with one command.

Example FileDump:

**** Noun ****
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
**** Verb ****
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
**** Adjective ****
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"

I just need to remove the lines which describe aspects of grammar e.g.

**** Noun ****
**** Verb ****
**** Adjective ****

So that I have a clean file with only definitions of the words:

(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"

The * symbols around the grammatical terms are tripping me up in sed.

+4  A: 

If you want to select whole lines from a file based just on the content of those lines, grep is probably the most suitable tool available. However, some characters, such as your stars, have special meanings to grep, so need to be "escaped" with a backslash. This will print just the lines starting with four stars and a space:

grep "^\*\*\*\* " textfile

However, you want to keep the lines which don't match that, so you need the -v option for grep which does just that: prints the lines which don't match the pattern.

grep -v "^\*\*\*\* " textfile

That should give you what you want.
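
For use inside a script, the same filter can be wrapped in a small function. This is only a minimal sketch: the name `strip_headers` is made up for illustration, and the sample lines are abbreviated from the question. Single quotes are used so the shell leaves the backslashes alone, and passing `"$@"` lets the function read either named files or standard input.

```shell
#!/bin/sh
# strip_headers: hypothetical helper name, not part of any tool.
# Prints every input line that does NOT start with four stars and a space.
strip_headers() {
    # -v inverts the match; ^ anchors the pattern to the start of the line.
    # With no file arguments, grep reads standard input.
    grep -v '^\*\*\*\* ' "$@"
}

# Demo on a here-document (lines abbreviated from the question).
strip_headers <<'EOF'
**** Noun ****
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting)
**** Verb ****
(v)run (move fast by using one's feet)
EOF
```

Running it should print only the two definition lines. To filter a file instead, call `strip_headers textfile > cleanfile`.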

Tim
Your explanation and second piece of code both helped me and achieved the outcome.
+1  A: 
 sed 's/^*.*//g' test | grep .
Vijay Sarathi
Yours achieved the solution as well, but you were just a tad too late for the tick. Both answers achieve the outcome of the question.
+2  A: 
sed '/^\*\{4\} .* \*\{4\}$/d'

or a bit looser

sed '/^*\{4\}/d'
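
To see how the two patterns differ, here is a small sketch that runs both on the same input. The function names `drop_strict` and `drop_loose` and the line `**** unterminated` are invented for the demonstration, and the star in the looser pattern is written escaped here purely for consistency with the first one.

```shell
#!/bin/sh
# Hypothetical helper names wrapping the two sed patterns.
# Strict: the line must start with "**** " AND end with " ****".
drop_strict() { sed '/^\*\{4\} .* \*\{4\}$/d' "$@"; }
# Loose: any line that starts with four stars is deleted.
drop_loose()  { sed '/^\*\{4\}/d' "$@"; }

# Sample input; the second line is an invented edge case
# that has leading stars but no trailing ones.
input='**** Noun ****
**** unterminated
(n)hello (an expression of greeting)'

echo "strict:"
printf '%s\n' "$input" | drop_strict
echo "loose:"
printf '%s\n' "$input" | drop_loose
```

The strict version keeps the `**** unterminated` line because it lacks the closing stars, while the loose version removes it. For the WordNet dump in the question, where every header has stars on both sides, either should give the same result.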
pixelbeat
Your command worked as well.
A: 
# awk '!/^\*\*+/' file
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"
ghostdog74