The National Park Service's Natural Sounds Program collects multiple terabytes of data each year measuring soundscapes. In your opinion, what is the best available scripting language to manage massive amounts of files and file types? We would like to easily design and run efficient, user-friendly scripts to search for and retrieve or create copies of files that may be located in different directories according to a single static hierarchy. The OS will most likely be Windows. Thanks!

+4  A: 

Use the one your developers are most familiar with. The productivity gains you'll get from that will almost certainly beat out any advantages that one language may have over another.

Eric Petroelje
+1 go with what you know. It often works out best.
S.Lott
A: 

I don't really know what you are going to be looking for in a scripting language, but Eric is right that you should use something all your developers are familiar with. However, if you don't have developers (yet) and are designing the project (and team) from the ground up, I would suggest C++ or .NET (C# or VB).

While C++ offers more power and better performance, C# and VB.NET offer faster development. Despite .NET's development-speed advantage, I would think that for massive amounts of files and file types, you will get the best overall satisfaction from C++. In my opinion, the best user-friendly design requires very little user input other than clicking buttons or selecting options from a list.

dboarman
+1  A: 

Use Python. It's easy to learn. Everyone can easily convert.

The size of the files doesn't much matter when you're searching directories or searching for metadata outside the files. Even so, you rarely need to read an entire sound sample file to strip off the metadata.
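As a rough illustration of the point above, here is a minimal sketch that searches a directory tree by file name only (no file contents are read during the search) and copies the matches elsewhere while keeping the same relative hierarchy. The function names `find_files` and `copy_matches` and the `*.wav` pattern are assumptions made for the example, not anything from the Natural Sounds Program.

```python
import shutil
from pathlib import Path


def find_files(root, pattern="*.wav"):
    """Yield every file under root whose name matches the glob pattern."""
    for path in Path(root).rglob(pattern):
        if path.is_file():
            yield path


def copy_matches(root, dest, pattern="*.wav"):
    """Copy matching files into dest, keeping each path relative to root."""
    root, dest = Path(root), Path(dest)
    copied = []
    for src in find_files(root, pattern):
        target = dest / src.relative_to(root)
        # Recreate the source directory layout under dest as needed.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # copy2 also tries to preserve timestamps
        copied.append(target)
    return copied
```

The sketch assumes `dest` lies outside `root`; otherwise the search could pick up files it has just copied.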

Also, if you're doing this frequently, you might want to consider:

  1. Extract all the metadata to a relational database.

  2. Use the relational database as a complex "index" to the sound sample files.

Each file add or change would be done through an application that synchronizes file changes with database updates, ensuring that the database index actually matches the filesystem.

The bulk of your searches might become SQL queries.
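A minimal sketch of the database-as-index idea, using Python's built-in `sqlite3` module as a stand-in for whatever relational database you would actually choose. The table name `files`, its columns, and the helper names `build_index` and `find_by_extension` are all hypothetical; real metadata (site, date, sample rate, and so on) would be added as further columns.

```python
import sqlite3
from pathlib import Path


def build_index(root, db_path=":memory:"):
    """Scan root and record one row per file in a SQLite table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " path TEXT PRIMARY KEY,"
        " ext TEXT,"
        " size_bytes INTEGER,"
        " mtime REAL)"
    )
    rows = []
    for p in Path(root).rglob("*"):
        if p.is_file():
            st = p.stat()  # filesystem metadata only; contents are not read
            rows.append((str(p), p.suffix.lower(), st.st_size, st.st_mtime))
    # INSERT OR REPLACE lets a re-scan refresh rows for files that changed.
    conn.executemany("INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def find_by_extension(conn, ext):
    """Return indexed paths with the given extension, e.g. '.wav'."""
    cur = conn.execute(
        "SELECT path FROM files WHERE ext = ? ORDER BY path", (ext.lower(),)
    )
    return [row[0] for row in cur]
```

A search then becomes a query such as `find_by_extension(conn, ".wav")` rather than a walk over the whole tree. Note that this sketch never deletes rows for files that have been removed from disk; keeping the index in sync is the job of the application described above.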

S.Lott