
I have a few directories that contain a lot of files. As some of them are approaching 600k files, they have become a major pain to handle. Just listing the files is slowly becoming a major bottleneck in the applications processing them.

The files are named like this: id_date1_date2.gz. I've decided to split each directory into several smaller ones, based on the first part of the filename, the "id".

Since the same id may show up in a large number of files, and the same id already shows up in several directories, I need to keep track of which file ids have been copied, and from which dirs. Otherwise I'd end up doing the same copying an insane number of times, or missing id X when copying from dir Y because it was already copied from dir Z.

I've written a script to accomplish this, with some debugging left in:

#!/bin/bash
find /marketdata -maxdepth 2 -type d | grep "[0-9]\.[0-9][0-9][0-9]$" | sort | #head -n2 | tail -n1 |
while read baseDir; do
    cd $baseDir;
    echo $baseDir > tmpFile;
    find . -type f | grep -v "\.\/\." | #sort | head -n4 |
    while read file; do
        name=$(awk 'BEGIN {print substr("'"$file"'", 3,index("'"$file"'", "_")-3 )}');

        dirkey=${baseDir//[\/,.]/_}"_"$name;
        if [ "${copied[$dirkey]}" != "true" ]; then
            echo "Copying $baseDir/$name with:";
            echo mkdir -p $(sed 's/data/data4/' tmpFile)/$name;
            #mkdir -p $(sed 's/data/data4/' tmpFile)/$name;
            oldName=$baseDir/$name"_*";
            echo cp $oldName "$(sed 's/data/data4/' tmpFile)/$name/";
            #cp $oldName "$(sed 's/data/data4/' tmpFile)/$name/";
            echo "Setting $dirkey to true";
            copied[$dirkey]="true";
        else
            echo "$dirkey: ${copied[$dirkey]}"
            sleep 1
        fi
    done;

    rm tmpFile;
done

The problem here is that the value of every key in copied seems to become true after the very first copy, so my handling of bash arrays is probably the issue.
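In hindsight, here's a minimal illustration of the likely cause (the key names are made up): without declare -A, bash treats copied as an indexed array and evaluates each subscript as an arithmetic expression, and an unset identifier evaluates to 0, so every key lands in the same slot.

unset copied
copied[_marketdata_a_1_234_id1]="true"     # subscript is an unset name, evaluates to 0
echo "${copied[_marketdata_a_1_234_id2]}"  # different key, same slot 0: prints "true"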

Some progress: I tried writing each key to a file, and on each iteration I read that file back in instead. This is obviously really ugly, but it looks like it accomplishes my goal. It could become extremely slow once I've processed a few thousand ids. Will update later.
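Roughly, the workaround looks like this (the file location is illustrative, and I check membership with grep rather than re-reading the file into an array):

keyFile=/tmp/copied_keys                  # illustrative location
touch "$keyFile"
if ! grep -qxF "$dirkey" "$keyFile"; then
    # ... perform the copy as before ...
    echo "$dirkey" >> "$keyFile"
fi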

For someone else who may find this in the future, here's the final script:

declare -A copied

find /marketdata -maxdepth 2 -type d -name "[0-9].[0-9][0-9][0-9]" | sort | #head -n3 | tail -n1 |
while read -r baseDir; do
    cd "$baseDir" || continue;
    find . -type f | grep -v "\.\/\." | sort | #head -n100 |
    while read -r file; do
        length=$(expr index "$file" "_");   # 1-based position of the first "_"
        name=${file:2:$((length - 3))};     # the id between "./" and the first "_"

        dirkey=${baseDir//[\/,.]/_}"_"$name;
        if [ "${copied[$dirkey]}" != "true" ]; then
            echo "Copying ${baseDir}/${name} to ${baseDir//data/data4}/$name";
            mkdir -p "${baseDir//data/data4}/$name";
            oldName="${baseDir}/${name}_*";
            cp -n $oldName "${baseDir//data/data4}/${name}/";   # unquoted so the glob expands
            copied[$dirkey]="true";
        fi
    done;
done

No awk, no sed, better quoting, no temporary files written to disk, and less grep. I'm not sure whether the dirkey hack is still necessary now that the associative array is working properly, and I don't entirely understand why I need the oldName var.
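As far as I can tell, oldName only exists so that the glob stays unquoted when cp expands it; quoting the literal parts and leaving the * bare achieves the same thing without the intermediate variable:

cp -n "${baseDir}/${name}_"* "${baseDir//data/data4}/${name}/"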

A: 

The -n option to cp is very useful in situations like this: it means you don't have to worry about whether a file already exists in the destination.

-n, --no-clobber
    do not overwrite an existing file (overrides a previous -i option)

This basically makes the duplicate-work case you describe go away: you can split your concerns into moving all the files, and moving only those files that haven't been moved before.
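For example, re-running the same copy becomes harmless (paths are illustrative):

cp -n /marketdata/a/1.234/id1_* /marketdata4/a/1.234/id1/   # a second run skips files that already exist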

Paul Rubel
Thanks, I've added that to the script. While this does improve the situation, it still seems rather ugly: cp still has to do several thousand checks on individual files. Then again, I don't know whether doing the checking in bash would be much faster.
Claes
+1  A: 

If the value in $dirkey contains alphabetic characters, you'll have to use an associative array, which isn't available before Bash 4. If you're using Bash 4 and the keys are alphanumeric rather than purely numeric, add the following at the top of your script:

declare -A copied

Additional comments:

You're using parameter expansion in some places and sed in others; parameter expansion could be used in (perhaps) all cases.
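For instance, the sed calls in the first script can become expansions on the variable itself (newDir is just an illustrative name):

newDir=$(sed 's/data/data4/' tmpFile)   # before: forks sed and reads a temp file
newDir=${baseDir/data/data4}            # after: pure parameter expansion, no subprocess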

Instead of quoting like $var"literal"$var, I would recommend "${var}literal${var}"; in cases where the literal cannot be ambiguously interpreted as part of the variable name, you can omit the braces: "literal$var".

Use variable passing with awk instead of the complex "'" quoting: awk -v awkvar="$shellvar" '{print awkvar}'.
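Applied to the extraction in the question, that would look something like:

name=$(awk -v f="$file" 'BEGIN { print substr(f, 3, index(f, "_") - 3) }')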

Calling external executables in a loop can slow things down quite a lot, especially when each call handles only one value (or line of data) at a time. The sed commands that I mentioned are examples of this. Your awk command may also be convertible to parameter expansion form.
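Something like this (a sketch) extracts the id with no external processes at all:

name=${file#./}     # strip the leading "./"
name=${name%%_*}    # keep everything before the first "_"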

GNU find has a regex feature that you could use instead of grep.
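A sketch of what that could look like; note that -regex matches against the entire path, which is why a trailing $ isn't needed:

find /marketdata -maxdepth 2 -type d -regextype posix-extended \
    -regex '.*/[0-9]\.[0-9]{3}'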

All variables that contain filenames should be quoted.

Dennis Williamson
I use bash 4 but hadn't used associative arrays before; the declare was new to me and seems to have made all the difference, thank you! I will correct my variables to use proper quoting. I actually tried -v with awk initially, but it failed to work for reasons I couldn't figure out. I will see about replacing that ugly sed too. I wasn't aware I could use regex with find; I can't quite get it to work either, but if I drop the $, -name accepts my regex. Your post has been most informative. Thank you again.
Claes