Okay, so I have a somewhat weird problem with re.sub.

Take the following code:

import re
str_to_be_subbed = r'somefile.exe -i <INPUT>'
some_str = r'C:\foobar'
s = re.sub(r'\<INPUT\>', some_str, str_to_be_subbed)
print s

I would think it would give me:

somefile.exe -i C:\\foobar

But instead it gives me:

somefile.exe -i C:♀oobar

I know \f is an escape character, but even if I try it this way, which should escape the special characters:

print r'%s' % s

It still gives me this:

somefile.exe -i C:♀oobar

Why does it do this? And what's the best way to avoid it?

Ninja Edit:

If I look at the value of s, it is:

'somefile.exe -i C:\x0coobar'

Why did \f turn into \x0c? Ugh.

Edit:

One more question: if I modify the code to this:

import re
import os
str_to_be_subbed = r'somefile.exe -i <INPUT>'
some_str = os.path.abspath(r'C:\foobar')
some_str
s = re.sub(r'\<INPUT\>', some_str, str_to_be_subbed)
print s

Gives me:

>>> import re
>>> import os
>>> str_to_be_subbed = r'somefile.exe -i <INPUT>'
>>> some_str = os.path.abspath(r'C:\foobar')
>>> some_str
'C:\\foobar'
>>> s = re.sub(r'\<INPUT\>', some_str, str_to_be_subbed)
>>> print s
somefile.exe -i C:♀oobar

Now why is that? os.path.abspath escapes the backslashes, so why does re.sub still mess up?
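A minimal sketch of what is likely going on here, assuming the interpreter echo at `>>> some_str` is showing the string's repr: the doubled backslash is only display notation, and the string itself still holds a single backslash followed by f, which re.sub then processes as an escape. A raw literal stands in for os.path.abspath so the snippet behaves the same on any platform.

```python
# Stand-in for the value returned by os.path.abspath(r'C:\foobar') on Windows.
some_str = r'C:\foobar'

# repr() doubles the backslash for display; the string itself is unchanged.
print(repr(some_str))

# The actual contents: one backslash, nine characters in total.
backslash_count = some_str.count('\\')
length = len(some_str)
print(backslash_count)
print(length)
```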

Also, this is a really small example of a little problem that just cropped up randomly in an oldish application. Because of how it's set up, I can't really change this to string.replace without spending a whole day.

I am just really curious as to why it is doing this.

+3  A: 

\f is the form feed character. Escape it and it works:

some_str = r'C:\\foobar'

Another solution:

s = re.sub(r'<INPUT>', some_str.encode("string_escape"), str_to_be_subbed)
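The "string_escape" codec exists only in Python 2. A minimal sketch of the same idea that should run on both Python 2 and 3, assuming backslashes are the only characters in the replacement that need protecting: double each backslash so that re.sub's escape processing turns every pair back into one.

```python
import re

str_to_be_subbed = r'somefile.exe -i <INPUT>'
some_str = r'C:\foobar'

# Double every backslash; re.sub collapses each pair back to a single one.
escaped = some_str.replace('\\', '\\\\')

s = re.sub(r'<INPUT>', escaped, str_to_be_subbed)
print(s)  # somefile.exe -i C:\foobar
```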
Gumbo
Thank you. Encode works amazingly.
UberJumper
+3  A: 

Don't use regular expressions:

print str_to_be_subbed.replace("<INPUT>",some_str)

As the documentation says:

repl can be a string or a function; if it is a string, any backslash escapes in it are processed.
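The quoted sentence also suggests a way to keep re.sub: when repl is a function, its return value is inserted as-is, with no backslash processing. A minimal sketch under that reading, reusing the names from the question:

```python
import re

str_to_be_subbed = r'somefile.exe -i <INPUT>'
some_str = r'C:\foobar'

# A callable replacement is used literally, so the \f in the path is not
# turned into a form feed.
s = re.sub(r'<INPUT>', lambda match: some_str, str_to_be_subbed)
print(s)  # somefile.exe -i C:\foobar
```

This keeps the call in regex form, which may matter if the surrounding code cannot easily be switched to str.replace.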

Douglas Leeder
I cannot use string replace. I have to use the regex form; otherwise I will be forced to change/modify hundreds of lines of someone else's code.
UberJumper
A: 

Your example doesn't need regexps; use str.replace():

>>> str_to_be_subbed.replace('<INPUT>',some_str)
'somefile.exe -i C:\\foobar'
>>>
gimel