I have a directory called "images" filled with about one million images. Yep.

I want to write a shell command to rename all of those images into the following format:

original: filename.jpg
new: /f/i/l/filename.jpg

Any suggestions?

Thanks,
Dan

+2  A: 

You can generate the new file name using, e.g., sed:

$ echo "test.jpg" | sed -e 's/^\(\(.\)\(.\)\(.\).*\)$/\2\/\3\/\4\/\1/'
t/e/s/test.jpg

So, you can do something like this (assuming all the directories are already created):

for f in *; do
   mv -i "$f" "$(echo "$f" | sed -e 's/^\(\(.\)\(.\)\(.\).*\)$/\2\/\3\/\4\/\1/')"
done

or, if your shell doesn't support the `$(...)` syntax:

for f in *; do
   mv -i "$f" "`echo "$f" | sed -e 's/^\(\(.\)\(.\)\(.\).*\)$/\2\/\3\/\4\/\1/'`"
done
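If the directories aren't already created, a variant along the same lines can make them on the fly with mkdir -p (a sketch, untested at the million-file scale; the -f test skips subdirectories, and names shorter than three characters are left where they are because the sed pattern won't match them):

```shell
for f in *; do
   [ -f "$f" ] || continue                       # skip subdirectories
   new="$(echo "$f" | sed -e 's/^\(\(.\)\(.\)\(.\).*\)$/\2\/\3\/\4\/\1/')"
   [ "$new" = "$f" ] && continue                 # name too short to split
   mkdir -p "$(dirname "$new")"                  # e.g. t/e/s for test.jpg
   mv -i "$f" "$new"
done
```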

However, considering the number of files, you may just want to use perl as that's a lot of sed and mv processes to spawn:

#!/usr/bin/perl -w
use strict;

# warning: untested
opendir DIR, "." or die "opendir: $!";
my @files = readdir(DIR); # can't change dir while reading: read in advance
closedir DIR;
foreach my $f (@files) {
    (my $new_name = $f) =~ s!^((.)(.)(.).*)$!$2/$3/$4/$1!;
    -e $new_name and die "$new_name already exists";
    rename($f, $new_name);
}

That Perl rename() is limited to moves within the same filesystem, though you can use File::Copy::move to get around that.

derobert
thanks for that solution, it is an interesting approach
Dan
oh, I notice one thing that testing would have found: there needs to be an "is this a file?" test so it doesn't move the directories. Fairly easy to fix (e.g., `-f $f or next;` at the top of the Perl foreach loop; similar in the shell loop)
derobert
+1  A: 

I suggest a short python script. Most shell tools will balk at that much input (though xargs may do the trick). Will update with example in a sec.

#!/usr/bin/python
import os, shutil

src_dir = '/src/dir'
dest_dir = '/dest/dir'

for fn in os.listdir(src_dir):
  if len(fn) < 3 or not os.path.isfile(os.path.join(src_dir, fn)):
    continue  # skip directories and names too short to split
  target = os.path.join(dest_dir, fn[0], fn[1], fn[2])
  if not os.path.isdir(target):
    os.makedirs(target)  # makedirs raises if the directory already exists
  shutil.copyfile(os.path.join(src_dir, fn), os.path.join(target, fn))
SpliFF
thanks, looks like a wonderful solution. I need to wait for the files to transfer to my new server before I can try it out (ETA 50 hours lol)
Dan
+3  A: 
for i in *.*; do mkdir -p ${i:0:1}/${i:1:1}/${i:2:1}/; mv $i ${i:0:1}/${i:1:1}/${i:2:1}/; done;

The ${i:0:1}/${i:1:1}/${i:2:1} part could probably be a variable, or shorter or different, but the command above gets the job done. You'll probably face performance issues, but if you really want to use it, narrow *.* to fewer matches (a*.*, b*.*, or whatever fits).
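The narrowing idea might be sketched as an outer loop over prefixes (the prefix list here is an assumption; extend it to cover digits and the rest of the alphabet, and note that ${i:0:1} is still a bash-ism):

```shell
for p in a b c; do                       # extend the prefix list as needed
  for i in "$p"*.*; do
    [ -e "$i" ] || continue              # glob matched nothing
    mkdir -p "${i:0:1}/${i:1:1}/${i:2:1}"
    mv "$i" "${i:0:1}/${i:1:1}/${i:2:1}/"
  done
done
```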

edit: added a $ before i for mv, as noted by Dan

inerte
FYI, the `${i:0:1}` syntax is a bash-ism, which is probably OK on Linux, but just in case...
derobert
If there are a few directories in the folder, will this loop include them as well?
Dan
Needed one correction:for i in *.*; do mkdir -p ${i:0:1}/${i:1:1}/${i:2:1}/; mv $i ${i:0:1}/${i:1:1}/${i:2:1}/; done;
Dan
Only directories with dots in them!
Chris Huang-Leaver
+2  A: 

You can do it as a bash script:

#!/bin/bash

base=base

mkdir -p $base/shorts

for n in *
do
    if [ ${#n} -lt 3 ]
    then
        mv $n $base/shorts
    else
        dir=$base/${n:0:1}/${n:1:1}/${n:2:1}
        mkdir -p $dir
        mv $n $dir
    fi
done

Needless to say, you might need to worry about spaces in filenames; names shorter than three characters end up in $base/shorts.

notnoop
very nice solution, thank you
Dan
A: 

Any of the proposed solutions which use a wildcard syntax in the shell will likely fail due to the sheer number of files you have. Of the current proposed solutions, the perl one is probably the best.

However, you can easily adapt any of the shell script methods to deal with any number of files thus:

ls -1 | \
while read filename
do
  # insert the loop body of your preference here, operating on "filename"
done
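For instance, pairing that loop with a body that creates the target directory might look like this (a sketch; IFS= and read -r keep unusual filenames intact, and the single-character extraction uses cut to stay portable beyond bash):

```shell
ls -1 | while IFS= read -r filename; do
  [ -f "$filename" ] || continue         # skip directories
  a=$(printf '%s' "$filename" | cut -c1)
  b=$(printf '%s' "$filename" | cut -c2)
  c=$(printf '%s' "$filename" | cut -c3)
  mkdir -p "$a/$b/$c"
  mv "$filename" "$a/$b/$c/"
done
```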

I would still use perl, but if you're limited to only having simple unix tools around, then combining one of the above shell solutions with a loop like I've shown should get you there. It'll be slow, though.

Chris Cleeland
Wildcard syntax should be fine: it's a shell built-in, and on purpose it's not being passed on the command line to a program (otherwise, surely, the command line would be too long). for i in `seq 1 1000000` works, for example.
derobert
I just tested: using `for f in *` works just fine with 1,000,000 files. Slow, but it works.
derobert
thanks for your commentary, it was helpful as I am very new to shell scripting
Dan
@derobert: thanks for testing that out and confirming that it *does* work. This is apparently a case where lessons learned in The Old Days are no longer necessarily true; bash has apparently improved in that respect. I know for a fact that it failed in various ways under the Bourne shell, but that was back in the late '80s/early '90s, when I first made the mistake while writing a script to do some maintenance on NetNews directories.
Chris Cleeland