Hello

I have a directory with lots of folders and subfolders, all with files in them. The idea of my project is to recurse through the entire directory, gather the names of all the files, and replace invalid characters (invalid for a SharePoint migration).

However, I'm completely unfamiliar with regular expressions. The characters I need to get rid of in filenames are: ~, #, %, &, *, { }, \, /, :, <>, ?, -, | and "". I want to replace each of these characters with a blank space. I was hoping to use a string.Replace() method to look through all these file names and do the replacement.

So far, the only code I've written is the recursion. I was thinking of having the recursion scan the drive, fetch the names of these files, and put them in a List.

Can anybody help me with how to find and replace those specific invalid characters using RegEx?

+6  A: 
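using System.Text.RegularExpressions;

// "\\\\" in a regular C# string reaches the regex as \\, i.e. one literal backslash.
// (With only "\\", the regex would see \~ and never match a backslash.)
string pattern = "[\\\\~#%&*{}/:<>?|\"-]";
string replacement = " ";

Regex regEx = new Regex(pattern);
string sanitized = Regex.Replace(regEx.Replace(input, replacement), @"\s+", " ");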

This will replace runs of whitespace with a single space as well.
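
For reference, the same two-step idea can be sketched as a self-contained helper. The class name `FileNameSanitizer` and the method name `Sanitize` are made up for illustration. The sketch uses a verbatim string (`@"..."`) for the pattern so that the backslash only has to be escaped once, for the regex engine; in a regular string literal it would need to be written as `\\\\` to match a literal backslash.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FileNameSanitizer
{
    // Verbatim string: inside the character class, \\ is one literal backslash
    // and "" is one literal double quote. The hyphen comes last so it is literal.
    private static readonly Regex InvalidChars =
        new Regex(@"[\\~#%&*{}/:<>?|""-]");

    private static readonly Regex WhitespaceRuns = new Regex(@"\s+");

    public static string Sanitize(string input)
    {
        // Step 1: every invalid character becomes a single space.
        string spaced = InvalidChars.Replace(input, " ");
        // Step 2: runs of whitespace collapse to one space.
        return WhitespaceRuns.Replace(spaced, " ");
    }
}
```

With this sketch, `FileNameSanitizer.Sanitize("Q&A ~ notes.doc")` returns `"Q A notes.doc"`. Leading or trailing spaces are not trimmed here; adding `.Trim()` to the result would be one way to handle that.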

Vivin Paliath
`string pattern = "[\\~#%` is better - less unnecessary escaping.
Tim Pietzcker
@Tim thanks! I will edit my solution. Most of my regex experience is in Perl where I use regex literals. So I'm not entirely sure what needs to be escaped and what doesn't in C# or Java. It's mostly trial-and-error.
Vivin Paliath
I just noticed that yeahumok wanted to replace the invalid characters with a space, not the empty string. I have removed the `+` from my version again, expecting that he wants one space for each invalid character, even if there are several in a row.
Tim Pietzcker
@yeahumok check out my edited solution
Vivin Paliath
I noticed when I ran this, though, that it did NOT rename the files themselves. Perhaps it was my fault for not being clear, but is there any way to change the actual file name itself?
@yeahumok - I would ask a separate question regarding changing filenames via C#.
Vivin Paliath
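
On the renaming point raised in the comments: the regex only produces a new string, so the file on disk has to be moved to the new name separately. A minimal sketch, assuming the standard `System.IO` calls `Directory.GetFiles` and `File.Move`, is below. The names `FileRenamer`, `CleanName` and `RenameAll` are hypothetical, and the collision handling (skip if the target already exists) is just one possible policy.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class FileRenamer
{
    private static readonly Regex InvalidChars = new Regex(@"[\\~#%&*{}/:<>?|""-]");
    private static readonly Regex WhitespaceRuns = new Regex(@"\s+");

    // Hypothetical helper: cleans a single file name (no directory part).
    public static string CleanName(string fileName)
    {
        string spaced = InvalidChars.Replace(fileName, " ");
        return WhitespaceRuns.Replace(spaced, " ").Trim();
    }

    // Renames every file under root whose name changes after cleaning.
    // Returns the number of files actually renamed.
    public static int RenameAll(string root)
    {
        int renamed = 0;
        // GetFiles returns a complete array up front, so renaming
        // inside the loop does not disturb the enumeration.
        string[] files = Directory.GetFiles(root, "*", SearchOption.AllDirectories);
        foreach (string oldPath in files)
        {
            string dir = Path.GetDirectoryName(oldPath);
            string oldName = Path.GetFileName(oldPath);
            string newName = CleanName(oldName);

            // Nothing to do, or the name would become empty: leave the file alone.
            if (newName == oldName || newName.Length == 0) continue;

            string newPath = Path.Combine(dir, newName);
            // Assumed policy: never overwrite an existing file.
            if (File.Exists(newPath)) continue;

            File.Move(oldPath, newPath);
            renamed++;
        }
        return renamed;
    }
}
```

Only the file name is cleaned, never the directory part of the path, so the separators in the path are left alone. Folder names are not renamed by this sketch.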
+1  A: 

is there a way to get rid of extra spaces?

Try something like this:

// "\\\\" is needed so the regex sees \\ and matches a literal backslash.
string pattern = " *[\\\\~#%&*{}/:<>?|\"-]+ *";
string replacement = " ";

Regex regEx = new Regex(pattern);
string sanitized = regEx.Replace(input, replacement);

Consider learning a bit about regular expressions yourself, as they are also very useful in development work itself (e.g. search/replace in Visual Studio).

Michel de Ruiter
Also, is there any way to remove any extraneous '.' (periods) in a filename? For example: 0.0.0.1.doc. How would I handle this without it wiping out the .doc?
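
One possible approach to the periods question, sketched here rather than taken from the thread, is to split off the extension first, strip the periods from the rest, and then put the extension back. It relies on `Path.GetExtension` and `Path.GetFileNameWithoutExtension` from `System.IO`; the class and method names are made up for illustration.

```csharp
using System;
using System.IO;

public static class PeriodCleaner
{
    public static string RemoveExtraPeriods(string fileName)
    {
        // Everything from the last period onward, e.g. ".doc" ("" if there is none).
        string extension = Path.GetExtension(fileName);
        // The part before the extension, e.g. "0.0.0.1".
        string baseName = Path.GetFileNameWithoutExtension(fileName);
        // Drop the periods in the base name only, then reattach the extension.
        return baseName.Replace(".", "") + extension;
    }
}
```

For the example above, `PeriodCleaner.RemoveExtraPeriods("0.0.0.1.doc")` gives `"0001.doc"`. Only the text after the last period counts as the extension, so a name like `archive.tar.gz` would become `archivetar.gz`. It is probably safest to run this after the invalid characters have been replaced, since older .NET Framework versions of the `Path` methods can throw on characters such as `<` or `|`.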