On a Linux machine I would like to traverse a folder hierarchy and get a list of all of the distinct file extensions within it.
What would be the best way to achieve this from a shell?
On a Linux machine I would like to traverse a folder hierarchy and get a list of all of the distinct file extensions within it.
What would be the best way to achieve this from a shell?
Try this (not sure if it's the best way, but it works):
find . -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u
It work as following:
Reursive version:
find . -type f | sed -e 's/.*\.//' | sed -e 's/.*\///' | sort -u
If you want totals (how may times the extension was seen):
find . -type f | sed -e 's/.*\.//' | sed -e 's/.*\///' | sort | uniq -c | sort -rn
Non-recursive (single folder):
for f in *.*; do printf "%s\n" "${f##*.}"; done | sort -u
I've based this upon this forum post, credit should go there.
Find everythin with a dot and show only the suffix.
find . -type f -name "*.*" | awk -F. '{print $NF}' | sort -u
if you know all suffix have 3 characters then
find . -type f -name "*.???" | awk -F. '{print $NF}' | sort -u
or with sed shows all suffixes with one to four characters. Change {1,4} to the range of characters you are expecting in the suffix.
find . -type f | sed -n 's/.*\.\(.\{1,4\}\)$/\1/p'| sort -u
Since there's already another solution which uses Perl:
If you have Python installed you could also do (from the shell):
python -c "import os;e=set();[[e.add(os.path.splitext(f)[-1]) for f in fn]for _,_,fn in os.walk('/home')];print '\n'.join(e)"
None of the replies so far deal with filenames with newlines properly (except for ChristopheD's, which just came in as I was typing this). The following is not a shell one-liner, but works, and is reasonably fast.
import os, sys
def names(roots):
for root in roots:
for a, b, basenames in os.walk(root):
for basename in basenames:
yield basename
sufs = set(os.path.splitext(x)[1] for x in names(sys.argv[1:]))
for suf in sufs:
if suf:
print suf
Powershell: dir -recurse | select-object extension -unique
thanks to http://kevin-berridge.blogspot.com/2007/11/windows-powershell.html