I don't have much experience with regexes and I wanted to rectify that. I decided to build an application that takes a directory name and scans all the files in it. The files all have an increasing serial number but differ subtly in their names, for example: episode01.mp4, episode_02.mp4, episod03.mp4, episode04.rmvb, etc.

The application should scan the directory, find the number in each file name, and rename the file, along with its extension, to a common format (episode01.mp4, episode02.mp4, episode03.mp4, episode04.rmvb, etc.).

I have the following code:

Dictionary<string, string> renameDictionary = new Dictionary<string,string>();
DirectoryInfo dInfo = new DirectoryInfo(path);
string newFormat = "Episode{0}.{1}";
Regex regex = new Regex(@".*?(?<no>\d+).*?\.(?<ext>.*)"); // look for a number (before the .) and an extension: *(\d+)*.*
foreach (var file in dInfo.GetFiles())
{
  string fileName = file.Name;
  var match = regex.Match(fileName);
  if (match != null)
  {
    GroupCollection gc = match.Groups;
    //Console.WriteLine("Number : {0}, Extension : {2} found  in {1}.", gc["no"], fileName,gc["ext"]);
    renameDictionary[fileName] = string.Format(newFormat, gc["no"], gc["ext"]);
  }
}
foreach (var renamePair in renameDictionary)
{
  Console.WriteLine("{0} will be renamed to {1}.", renamePair.Key, renamePair.Value);
  //stuff for renaming here
}

One problem with this code is that files whose names contain no number also end up in renameDictionary. It would also be helpful if you could point out any other gotchas that I should be careful about.

PS: I am assuming that the only number in each filename is the serial number (nothing like cam7_0001.jpg).

+1  A: 

The simplest solution is probably to use Path.GetFileNameWithoutExtension to get the file name, and then the regex \d+$ to get the number at its end (or \d+ to get the number anywhere in the name). Path.GetExtension gives you the extension separately.
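A minimal sketch of that approach, assuming the number can appear anywhere in the name (so \d+ rather than \d+$). The class and method names here are made up for illustration. Note that Regex.Match never returns null, so the check has to be match.Success rather than match != null; that is likely why files without numbers ended up in your dictionary.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class EpisodeRenamer
{
    // Hypothetical helper: returns the new name, or null when the
    // file name (without its extension) contains no number.
    public static string BuildNewName(string fileName)
    {
        string baseName = Path.GetFileNameWithoutExtension(fileName); // e.g. "episode_02"
        string extension = Path.GetExtension(fileName);               // e.g. ".mp4", dot included

        Match match = Regex.Match(baseName, @"\d+");

        // A failed match is still a non-null Match object, so test Success.
        if (!match.Success)
            return null;

        return "Episode" + match.Value + extension;
    }

    static void Main()
    {
        foreach (string name in new[] { "episode_02.mp4", "episode04.rmvb", "notes.txt" })
        {
            string newName = BuildNewName(name);
            Console.WriteLine("{0} -> {1}", name, newName ?? "(skipped, no number)");
        }
    }
}
```

Because the regex only sees the name without its extension, a digit in the extension itself (the 4 in .mp4) is never mistaken for the serial number.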

You can also achieve this in a single replace:

Regex.Replace(fileName, @".*?(\d+).*(\.[^.]+)$", "Episode$1$2")

This regex is a bit stricter than yours: the extension cannot contain dots, so only the part after the last dot is treated as the extension.
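Here is a small sketch of how the single-replace version behaves on the file names from the question. The class and method names are hypothetical. Regex.Replace returns the input unchanged when the pattern does not match, so a file without a number keeps its name.

```csharp
using System;
using System.Text.RegularExpressions;

static class SingleReplaceDemo
{
    // The pattern from the Regex.Replace call above.
    static readonly Regex Pattern = new Regex(@".*?(\d+).*(\.[^.]+)$");

    // Hypothetical helper wrapping the replace.
    public static string Rename(string fileName)
    {
        // $1 is the first run of digits, $2 is the extension with its dot.
        return Pattern.Replace(fileName, "Episode$1$2");
    }

    static void Main()
    {
        string[] names = { "episode_02.mp4", "episod03.mp4", "episode04.rmvb", "notes.txt" };
        foreach (string name in names)
        {
            string renamed = Rename(name);
            if (renamed == name)
                Console.WriteLine("{0} is left alone.", name);
            else
                Console.WriteLine("{0} will be renamed to {1}.", name, renamed);
        }
    }
}
```

Comparing the result with the input is one simple way to decide whether a file needs renaming at all; it also skips files that already have the target name.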

Kobi