With Perl regular expressions:
$ echo 'this is a long string that needs to be shortened' \
| perl -pe 's/^(.{15}).+/$1.../'
this is a long ...
The easiest way to think about a regular expression is as a pattern that needs to be matched. In this case the pattern begins with the start of the line:
^
(Note that / is an arbitrary separator. Other characters could be used instead.) The ^ is the symbol that represents the start of the line in a regex. Next the regex matches any character:
^.
A . is the regex symbol for any character (except a newline, by default). But we want to match the first 15 characters:
^.{15}
There are several quantifiers that express repetition. The most common is *, which signifies 0 or more. A + indicates 1 or more, and {15} represents exactly 15. (The {...} notation is more general: * could be written {0,} and + is the same as {1,}.) Now we need to capture the first 15 characters so that we can use them later:
^(.{15})
Everything between ( and ) is captured and placed in a special variable called $1 (written \1 inside the pattern itself, and in tools such as sed). The second chunk captured would be placed in $2, and so on. Finally, you need to match to the end of the line so that you can throw that part away:
^(.{15}).+
I initially used *, but as another person pointed out, that probably isn't what is wanted when the string is exactly 15 characters long:
$ echo 'this is a long ' \
| perl -pe 's/^(.{15}).*/$1.../'
this is a long ...
Using a + means the pattern will not match unless there is a 16th character to replace, so a string of exactly 15 characters is left unchanged.
The second half of the statement is the replacement, which is what gets printed in place of the matched text:
$1...
The $1 variable that we captured earlier is used, and the dots are literal . characters on this side of the substitution. Generally, everything except variables (and backslash escapes) is literal on the right side of a substitution statement.