I have a bunch of files that look like this:

A.File.With.Dots.Instead.Of.Spaces.Extension

Which I want to transform via a regex into:

A File With Dots Instead Of Spaces.Extension

It has to be in one regex (because I want to use it with Total Commander's batch rename tool).

Help me, regex gurus, you're my only hope.

Edit

Several people suggested two-step solutions. Two steps really make this problem trivial, and I was really hoping to find a one-step solution that would work in TC. I did, BTW, manage to find a one-step solution that works as long as there's an even number of dots in the file name. So I'm still hoping for a silver bullet expression (or a proof/explanation of why one is strictly impossible).

+1  A: 

Basically:

/\.(?=.*?\.)//

will do it in pure regex terms. This means, replace any period that is followed by a string of characters (non-greedy) and then a period with nothing. This is a positive lookahead.

In PHP this is done as:

$output = preg_replace('/\.(?=.*?\.)/', '', $input);

Other languages vary but the principle is the same.
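For anyone who wants to try the lookahead outside PHP, here is a minimal Python sketch of the same idea. The function name dots_to_spaces is made up for illustration, and the replacement is a space rather than the empty string, on the assumption that the goal is the output shown in the question.

```python
import re

def dots_to_spaces(name: str) -> str:
    """Replace every dot that still has another dot somewhere after it."""
    # \.        a literal dot
    # (?=.*?\.) lookahead: some characters and then another dot must follow
    return re.sub(r"\.(?=.*?\.)", " ", name)

if __name__ == "__main__":
    print(dots_to_spaces("A.File.With.Dots.Instead.Of.Spaces.Extension"))
```

The last dot never matches because no other dot follows it, so the extension separator is left alone.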

cletus
I don't think the ? after the .* is necessary... it will just backtrack anyway. *some* regex engines *might* be able to optimize it a little better, but [^.]* instead would work optimally even for a really unoptimized regex engine.
ʞɔıu
Generally speaking you probably want to avoid backtracking if you can.
cletus
most regex engines will backtrack regardless of the ? being there or not, it's just a matter of how far they will backtrack.
ʞɔıu
A: 

You can do that with lookahead. However, I don't know which kind of regex support you have.

/\.(?=.*\.)//

This roughly translates to: any dot (\.) that has something and then another dot after it. The last dot is the only one that does not qualify. I leave out the "optionality" of something between dots, because the data looks like something will always be in between and the "optionality" has a performance cost.

Check: http://www.regular-expressions.info/lookaround.html
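The greedy form above, the lazy form from the other answer, and the [^.]* form suggested in the comments should all pick out the same dots; they differ only in how much work the engine does. A rough Python sketch to compare them (the VARIANTS table and the rename_all helper are illustrative names, and the space replacement is assumed from the question's desired output):

```python
import re

# Three spellings of "a dot that has another dot somewhere after it".
VARIANTS = {
    "greedy": r"\.(?=.*\.)",
    "lazy": r"\.(?=.*?\.)",
    "negated class": r"\.(?=[^.]*\.)",
}

def rename_all(name: str) -> dict:
    """Apply every variant to the same name and collect the results."""
    return {label: re.sub(pattern, " ", name) for label, pattern in VARIANTS.items()}

if __name__ == "__main__":
    for label, result in rename_all("A.File.With.Dots.Extension").items():
        print(f"{label}: {result}")
```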

eipipuz
+3  A: 

It appears Total Commander's regex library does not support lookaround expressions, so you're probably going to have to replace a number of dots at a time, repeating until only the final dot is left. Replace:

([^.]*)\.([^.]*)\.([^.]*)\.([^.]*)$

with

$1 $2 $3.$4

(Extend the pattern with more groups, and the replacement with more backreferences, to handle more dots per pass. You can go up to $9, which may or may not be enough.)

It doesn't appear there is any way to do it with a single, definitive expression in Total Commander, sorry.
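To see how the repeated passes play out, here is a small Python sketch that stands in for running the rename several times by hand. It is not Total Commander itself: the loop, the rename_by_passes name, and the max_passes cap are assumptions for illustration, and Python spells the backreferences \1 to \4 instead of $1 to $4.

```python
import re

# Same pattern as above: three dots, anchored at the end of the name.
PATTERN = re.compile(r"([^.]*)\.([^.]*)\.([^.]*)\.([^.]*)$")
REPLACEMENT = r"\1 \2 \3.\4"

def rename_by_passes(name: str, max_passes: int = 20) -> str:
    """Apply the replacement repeatedly until a pass changes nothing."""
    for _ in range(max_passes):
        renamed = PATTERN.sub(REPLACEMENT, name)
        if renamed == name:  # fewer than three dots left, nothing to do
            break
        name = renamed
    return name

if __name__ == "__main__":
    print(rename_by_passes("A.File.With.Dots.Instead.Of.Spaces.Extension"))
```

Each pass turns two dots into spaces, so the seven-dot sample name finishes after three passes. A name that ends up with exactly two dots stops early (a.b.c.d.e becomes a.b c d.e), so a shorter two-dot version of the pattern would be needed for those.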

molf
There's no need to escape dots within character classes: ([^.]*)\.([^.]*)\.([^.]*)\.([^.]*)$
Helen
+1  A: 

Here's one based on your almost-solution:

/\.([^.]*(\.[^.]+$)?)/\1/

This is, roughly, "any dot stuff, minus the dot, and maybe plus another dot stuff at the end of the line." I couldn't quite tell if you wanted the dots removed or turned to spaces - if the latter, change the substitution to " \1" (minus the quotes, of course).

[Edited to change the + to a *, as Helen's below.]
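Here is a quick Python sketch of this pattern with the " \1" substitution, for checking it against sample names. The function name is hypothetical, and Python's re.sub replaces every match, which is assumed to correspond to a global replace in whatever tool is used.

```python
import re

# A dot, the non-dot text after it, and optionally the final ".ext" at the end.
PATTERN = re.compile(r"\.([^.]*(\.[^.]+$)?)")

def dots_to_spaces_no_lookahead(name: str) -> str:
    """Swap each matched dot for a space, keeping whatever group 1 captured."""
    return PATTERN.sub(r" \1", name)

if __name__ == "__main__":
    print(dots_to_spaces_no_lookahead("A.File.With.Dots.Instead.Of.Spaces.Extension"))
```

One caveat seen when tracing it: a name with a single dot, such as File.Extension, loses that dot as well, because nothing is left for the optional group to protect. The pattern is safe only when the name contains at least two dots.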

John Hyland