This is still a loop, since it uses the branch command in sed:
find -depth -type d |sed 'h; :b; $b; N; /^\(.*\)\/.*\n\1$/ { g; bb }; $ {x; b}; P; D'
Based on a script in info sed (the uniq work-alike).
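To see the effect without depending on the order in which find happens to walk a real tree, here is a small sketch that feeds the same sed program a hand-written listing. The directory names are made up for illustration, the lines are in the child-before-parent order that find -depth produces, and GNU sed is assumed (the one-line form may not parse in other seds).

```shell
# Hypothetical listing, in the child-before-parent order of find -depth.
listing='./a/b/c
./a/b
./a/d
./a
./e
.'

# Same sed program as the one-liner above (assumes GNU sed).
leaves=$(printf '%s\n' "$listing" |
    sed 'h; :b; $b; N; /^\(.*\)\/.*\n\1$/ { g; bb }; $ {x; b}; P; D')

printf '%s\n' "$leaves"
```

Only the directories with no subdirectories survive (./a/b/c, ./a/d and ./e); every line that is a parent of the line kept before it is dropped.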
Edit: Here is the sed script broken out with comments (copied from info sed and modified):
# copy the pattern space to the hold space
h
# label for branch (goto) command
:b
# on the last line ($) goto the end of
# the script (b with no label), print and exit
$b
# append the next line to the pattern space (it now contains line1\nline2)
N
# if the pattern space is line1 (a path containing a slash and whatever comes
# after it), followed by a newline, followed by a copy of the part of line1
# before that slash; in other words, line2 is line1 with the last dir removed
# (or, once a parent has already been skipped, with several trailing dirs removed)
# see below for the regex
/^\(.*\)\/.*\n\1$/ {
# Undo the effect of the N command by copying the hold space
# (which still contains line1) back to the pattern space
g
# branch to label b (line2 is dropped; line1 is kept and is compared with the next input line)
bb
}
# If the `N' command had added the last line, print and exit:
# swap the hold space and pattern space (so the saved line1 is back in the
# pattern space) and goto the end (b without a label), which prints it
$ { x; b }
# The lines are different; print the first and go
# back working on the second.
# print up to the first newline of the pattern space
P
# delete up to the first newline in the pattern space; the remainder, if any,
# will become line1, and the script restarts from the top without reading input
D
Here is what the regex is doing:
/    - start a pattern
^    - matches the beginning of the line (here, of the pattern space)
\(   - start a capture group (back reference subexpression)
.*   - zero or more (*) of any character (.)
\)   - end capture group
\/   - a slash (/) (escaped with \)
.*   - zero or more of any character
\n   - a newline
\1   - a copy of the back reference (which in this case is whatever was between the beginning of the line and a slash; the greedy .* tries the last slash first and backs up to an earlier one if needed)
$    - matches the end of the line (here, of the pattern space)
/    - end the pattern
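The regex can also be tried on its own. The sketch below joins two lines with N and, if the pattern matches, prints what the capture group held. The helper name check and the sample paths are made up for illustration and are not part of the script above.

```shell
# check LINE1 LINE2: print the text captured by \( \) if the regex matches
# the two joined lines, print nothing otherwise.
# (check is a hypothetical helper, used only for this illustration.)
check() {
    printf '%s\n%s\n' "$1" "$2" |
        sed -n 'N; s/^\(.*\)\/.*\n\1$/\1/p'
}

check ./a/b ./a      # line2 is the parent of line1: prints ./a
check ./a/b/c ./a    # line2 is the grandparent: also prints ./a
check ./a/b ./a/c    # siblings: no match, prints nothing
```

The second call shows that the group does not have to stop at the last slash: the regex backs up until the captured prefix equals line2.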