I'm trying to add a space before every capital letter, except the first one.

Here's what I have so far, and the output I'm getting:

>>> import re
>>> tex = "ThisFileName.txt"
>>> re.sub('[A-Z].', ' ', tex)
' his ile ame.txt'

I want: 'This File Name.txt'

(It'd be nice if I could also get rid of .txt, but I can do that in a separate operation.)

+10  A: 

The key concept here is backreferences in regular expressions:

import re
text = "ThisFileName.txt"
print(re.sub('([a-z])([A-Z])', r'\1 \2', text))
# Prints: "This File Name.txt"

To strip the '.txt' reliably, I recommend os.path.splitext():

import os
filename = "ThisFileName.txt"
print(os.path.splitext(filename))
# Prints: ('ThisFileName', '.txt')
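
The two steps above can be combined. This is a minimal sketch, assuming you want to drop the extension first and then insert spaces in the remaining stem; the helper name spaced_name is hypothetical and not part of the original answer.

```python
import os
import re

def spaced_name(filename):
    # Hypothetical helper: split off the extension, then work on the stem only.
    stem, _ext = os.path.splitext(filename)
    # Insert a space between a lowercase letter and the capital that follows it.
    return re.sub(r'([a-z])([A-Z])', r'\1 \2', stem)

print(spaced_name("ThisFileName.txt"))  # This File Name
```

Because the pattern requires a lowercase letter before the capital, the first letter of the name is never preceded by a space.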
Triptych
+2  A: 
re.sub('([a-z])([A-Z])', '\\1 \\2', 'TheFileName.txt')

EDIT: StackOverflow eats some backslashes when text is not in 'code mode'. Because I forgot to add a newline after the code above, it was not interpreted in 'code mode'. Now that I have added this text, I did not have to change anything else, and the code is displayed correctly.

Johannes Weiß
Add another backslash: re.sub('([a-z])([A-Z])', '\\1 \\2', text). A single backslash is interpreted as an escape sequence. From the documentation: "Unless an 'r' or 'R' prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C."
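
The difference the comment describes can be checked directly. This is a small illustrative sketch (the variable names are assumptions) comparing a plain replacement string, a raw string, and doubled backslashes.

```python
import re

text = "ThisFileName.txt"
pattern = '([a-z])([A-Z])'

# Without a raw prefix, '\1' and '\2' are octal escapes: the control
# characters chr(1) and chr(2), not backreferences.
plain = re.sub(pattern, '\1 \2', text)

# A raw string and doubled backslashes both pass a literal backslash to re.
raw = re.sub(pattern, r'\1 \2', text)
doubled = re.sub(pattern, '\\1 \\2', text)

print(repr(plain))
print(raw)
print(doubled)
```

The last two calls produce the same result, which is why either the r prefix or the extra backslash fixes the problem.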
Adam Bernier
downvoting until the error is fixed...
Triptych
It was not really my fault, I think, but I fixed it anyhow. (See my EDIT comment.)
Johannes Weiß
Ahh, and thanks for notifying me. I would not have looked at this post again.
Johannes Weiß
Um, you didn't revenge-downvote me, did you?
Triptych
I didn't downvote you. I didn't upvote or downvote anybody in this thread.
Johannes Weiß
+2  A: 

Another possible regular expression, using a lookbehind:

(?<!^)([A-Z])
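
The answer gives only the pattern. A minimal sketch of how it might be used with re.sub follows; the replacement string r' \1' is an assumption, since the answer does not specify one.

```python
import re

# (?<!^) rejects a match at the very start of the string, so the first
# capital is left alone; every other capital gets a space in front of it.
spaced = re.sub(r'(?<!^)([A-Z])', r' \1', "ThisFileName.txt")
print(spaced)  # This File Name.txt
```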
Gumbo
+1  A: 

It is not clear what you want to do if the filename is Hello123There.txt. If you want a space before every capital letter regardless of what precedes it, you can use:

import re

def add_space_before_caps(text):
    "Add a space before all caps except at start of text"
    return re.sub(r"(?<!^)(?=[A-Z])", " ", text)

>>> add_space_before_caps("Hello123ThereIBM.txt")
'Hello123 There I B M.txt'
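
If splitting runs of capitals such as IBM is not what you want, one possible variant is sketched below. It assumes a space should appear only where a lowercase letter or digit is directly followed by a capital; the function name add_space_before_words is hypothetical.

```python
import re

def add_space_before_words(text):
    """Add a space where a lowercase letter or digit is followed by a capital."""
    # Both parts are zero-width, so nothing is consumed or replaced;
    # a space is simply inserted at each matching position.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)

print(add_space_before_words("Hello123ThereIBM.txt"))  # Hello123 There IBM.txt
```

This keeps a run of capitals together, but it also means a capitalized word that directly follows such a run (for example IBMFile) is not separated from it.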
ΤΖΩΤΖΙΟΥ