I suppose 'awk' is one tool for the job, though I think 'sed' is simpler for this particular operation. The specification is a bit vague. The simple version is:
- Find the first line containing a given word.
- Delete that line and all following lines.
For that, I'd use 'sed':
sed '/word/,$d' file
The more complex version is:
- Find the first line containing a given word.
- Delete the text on that line from the word onwards.
- Delete all subsequent lines of text.
I'd probably still use 'sed':
sed -n '1,/word/{s/word.*//;p;}' file
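This inverts the logic. With '-n', nothing is printed by default; for line 1 through the first line containing the word, it performs a substitution (which does nothing until the line containing the word) and then prints the line. One caveat: the end of a range is only checked from the line after its start, so if the word occurs on line 1, the range runs on to the next line containing it.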
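To see the behaviour concretely, here is a small sketch run against made-up input. The file names, the sample lines and the word 'stop' are all assumptions for illustration. The last sed command is an alternative formulation (not one of the commands above) that truncates, prints and quits at the first match, so it should also cope with the word appearing on line 1:

```shell
# Hypothetical sample data in a scratch directory; 'stop' stands in for the word.
demo=$(mktemp -d)
printf '%s\n' 'alpha beta' 'gamma stop delta' 'epsilon' 'stop again' > "$demo/mid.txt"
printf '%s\n' 'keep stop here' 'middle' 'stop again' 'tail' > "$demo/first.txt"

# Range form: print line 1 through the first match, truncating at the word.
sed -n '1,/stop/{s/stop.*//;p;}' "$demo/mid.txt" > "$demo/range_mid.out"

# Same command when the word is on line 1: the range runs on to line 3.
sed -n '1,/stop/{s/stop.*//;p;}' "$demo/first.txt" > "$demo/range_first.out"

# Alternative (an assumption, not from the answer): truncate, print and quit
# at the first match, wherever it occurs.
sed -n -e '/stop/{s/stop.*//;p;q;}' -e 'p' "$demo/first.txt" > "$demo/quit_first.out"

cat "$demo/range_mid.out"     # 'alpha beta', then 'gamma ' (trailing space kept)
cat "$demo/range_first.out"   # three lines, not one
cat "$demo/quit_first.out"    # 'keep ' only
```

The trailing space on the truncated line is left in place; if that matters, the substitution pattern could be widened to ' *stop.*'.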
Can it be done in 'awk'? Yes, but not quite as trivially, because 'awk' automatically splits input lines into fields and because you have to use functions to do substitutions.
awk '/word/ { if (found == 0) {
        # First line with word
        sub("word.*", "")
        print $0;
        found = 1
    }
}
{ if (found == 0) print $0; }' file
(Edited: change 'delete' to 'found' since 'delete' is a reserved word in 'awk'.)
In all these examples, the truncated version of the input file is written to standard output. To modify the file in situ, you either need to use Perl or Python or a similar language, or capture the output in a temporary file which you copy over the original once the command has completed. (If you try 'script file >file', the shell truncates the file before the script reads it, so you process an empty file.)
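The temporary-file approach might look like the following sketch. The file name, its contents and the use of 'mktemp' (widely available, but not required by POSIX) are assumptions for illustration:

```shell
# Hypothetical file to truncate, created in a scratch directory.
demo=$(mktemp -d)
file="$demo/notes.txt"
printf '%s\n' 'one' 'two' 'three word four' 'five' > "$file"

# Write the truncated text to a temporary file first...
tmp=$(mktemp "$demo/notes.XXXXXX")
if sed '/word/,$d' "$file" > "$tmp"; then
    # ...and copy it over the original only if sed succeeded.
    cp "$tmp" "$file"
fi
rm -f "$tmp"

cat "$file"    # the lines before the first one containing 'word'
```

Copying rather than moving keeps the original file's permissions and any hard links intact, at the cost of not being atomic.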
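There are various early-exit optimizations that could be applied to the sed and awk scripts so that they stop reading once the word has been found. For example, this quits at the first matching line, though note that 'q' prints that line before exiting: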
sed '/word/q' file
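As a sketch of how early exit could be fitted to the two versions described earlier (these variants are suggestions rather than commands from the original answer, and the sample file and the word 'stop' are made up):

```shell
# Hypothetical sample input.
demo=$(mktemp -d)
printf '%s\n' 'alpha beta' 'gamma stop delta' 'epsilon' > "$demo/in.txt"

# Simple version in sed: with -n, quit before the matching line is printed.
sed -n '/stop/q;p' "$demo/in.txt" > "$demo/sed_simple.out"

# Simple version in awk: stop reading at the first match.
awk '/stop/ { exit } { print }' "$demo/in.txt" > "$demo/awk_simple.out"

# Complex version in awk: truncate the matching line, print it, then stop.
awk '/stop/ { sub(/stop.*/, ""); print; exit } { print }' "$demo/in.txt" > "$demo/awk_trunc.out"

cat "$demo/sed_simple.out" "$demo/awk_simple.out" "$demo/awk_trunc.out"
```

Because 'exit' ends the awk program at the first match, the 'found' flag is no longer needed in these variants.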
And, if you assume the use of the GNU versions of awk or sed, there are non-standard extensions that can help with in-situ modification of the file: the '-i' option in GNU sed and the '-i inplace' extension in GNU awk (version 4.1 or later).
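A minimal sketch of the GNU route, guarded so that it is skipped where the GNU tools are not installed. The file names and contents are hypothetical, and the awk program is a shortened form of the 'found' logic shown earlier. BSD and macOS sed also have '-i', but expect a backup-suffix argument after it (for example -i ''):

```shell
# Hypothetical files to edit in place.
demo=$(mktemp -d)
printf '%s\n' 'one' 'two' 'three word four' 'five' > "$demo/a.txt"
cp "$demo/a.txt" "$demo/b.txt"

# GNU sed: -i rewrites the named file instead of writing to standard output.
if sed --version >/dev/null 2>&1; then
    sed -i '/word/,$d' "$demo/a.txt"
fi

# GNU awk 4.1 or later: the bundled 'inplace' extension does the same job.
if command -v gawk >/dev/null 2>&1; then
    gawk -i inplace '/word/ { found = 1 } !found' "$demo/b.txt"
fi
```

Both tools write to a temporary file and replace the original behind the scenes, so this is a convenience rather than a different mechanism.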