What's the best solution, using regex, to remove special characters from the beginning and end of every word?

"as-df-- as-df- as-df (as-df) 'as-df' asdf-asdf) (asd-f asdf' asd-f' -asdf- %asdf%s asdf& $asdf$ +asdf+ asdf++ asdf''"

The output should be:

"as-df-- as-df- as-df (as-df) as-df asdf-asdf) (asd-f asdf' asd-f' asdf %asdf%s asdf& asdf asdf asdf++ asdf''"

If the special character at the beginning of a word matches the one at the end, remove both.

I am learning about regex, so regex-only answers please.

+1  A: 

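For Perl, how about /(?<!\S)([^\s\w])(\S+?)\1(?!\S)/g, replacing each match with $2? Note that things like lookbehind don't work in all regex languages. (\b is no help here: it only matches between a word character and a non-word character, so it fails before a symbol that starts a word.)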

Oops, as @Nick pointed out, this doesn't work for non-identical pairs, like () [] etc.

Instead you could do:

 s/(?<!\S)([^\s\w()\[\]{}])(\S+?)\1(?!\S)/$2/g
 s/(?<!\S)\((\S+?)\)(?!\S)/$1/g
 s/(?<!\S)\[(\S+?)\](?!\S)/$1/g
 s/(?<!\S)\{(\S+?)\}(?!\S)/$1/g

(untested)
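
For anyone who wants to try the multi-pass idea without Perl, here is a rough Python sketch of it: one pass for a symbol that repeats at both ends, then one pass per bracket pair. The names PASSES and strip_wrappers are made up for this example, and the patterns assume words are delimited by whitespace. Note that the bracket passes also unwrap (as-df), which the question's expected output leaves alone, so drop those passes if that matters.

```python
import re

# Hypothetical names; each entry is (pattern, replacement) for one pass.
PASSES = [
    # Same non-bracket symbol at both ends of a whitespace-delimited word.
    (r"(?<!\S)([^\s\w()\[\]{}])(\S+?)\1(?!\S)", r"\2"),
    # One pass per bracket pair, since opener and closer differ.
    (r"(?<!\S)\((\S+?)\)(?!\S)", r"\1"),
    (r"(?<!\S)\[(\S+?)\](?!\S)", r"\1"),
    (r"(?<!\S)\{(\S+?)\}(?!\S)", r"\1"),
]

def strip_wrappers(text):
    """Apply every pass in order and return the rewritten text."""
    for pattern, repl in PASSES:
        text = re.sub(pattern, repl, text)
    return text

print(strip_wrappers("(as-df) 'as-df' -asdf- %asdf%s [x] {y} asdf-asdf) (asd-f"))
# as-df as-df asdf %asdf%s x y asdf-asdf) (asd-f
```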

LarsH
+1  A: 
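import re

# Sample input; the trailing space after "(asd-f" keeps the two literals from fusing into one word.
a = ("as-df-- as-df- as-df (as-df) 'as-df' asdf-asdf) (asd-f "
     "asdf' asd-f' -asdf- %asdf%s asdf& $asdf$ +asdf+ asdf++ asdf''")
# Strip a special character only when the same character both starts and ends a whitespace-delimited word.
b = re.sub(r"((?<=\s)|\A)(?P<chr>[-()+%&'$])([^\s]*)(?P=chr)((?=\s)|\Z)", r"\3", a)
print(b)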

Gives:

as-df-- as-df- as-df (as-df) as-df asdf-asdf) (asd-f asdf' asd-f' asdf %asdf%s asdf& asdf asdf asdf++ asdf''
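
Since the one-liner is dense, here is the same pattern spelled out with re.VERBOSE so each piece can carry a comment. This is only a readability sketch; PATTERN is a name introduced for the example, and the regex itself is unchanged.

```python
import re

# Same regex as the one-liner, split across lines for commenting.
PATTERN = re.compile(r"""
    ((?<=\s)|\A)            # group 1: word starts after whitespace or at string start
    (?P<chr>[-()+%&'$])     # group 2, named chr: the leading special character
    ([^\s]*)                # group 3: the body of the word, kept in the output
    (?P=chr)                # the very same character again, closing the word
    ((?=\s)|\Z)             # group 4: word ends before whitespace or at string end
""", re.VERBOSE)

for word in ["'as-df'", "-asdf-", "%asdf%s", "(as-df)", "asdf''"]:
    print(word, "->", PATTERN.sub(r"\3", word))
```

Only the first two words change: %asdf%s does not end in %, (as-df) ends in a different character than it starts with, and asdf'' does not start with a special character.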

Getting non-identical pairs such as (), [], {} to work is trickier, because a backreference can only match the same character again.
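
One possible way around that, as a sketch only: match any symbol at each end and let a replacement function decide whether the two belong together. CLOSERS, WORD, unwrap and strip_pairs are hypothetical names for this example, and it assumes whitespace-delimited words. It also goes beyond pure regex, and it unwraps (as-df), which the question's expected output keeps.

```python
import re

# Hypothetical mapping from an opening bracket to the closer it pairs with.
CLOSERS = {"(": ")", "[": "]", "{": "}"}

# A word that starts with a symbol (closing brackets excluded) and ends with a symbol.
WORD = re.compile(r"(?<!\S)([^\w\s)\]}])(\S*)([^\w\s])(?!\S)")

def unwrap(match):
    """Return the bare body if the end symbol pairs with the start symbol."""
    opener, body, closer = match.groups()
    expected = CLOSERS.get(opener, opener)  # symbols like ' or - pair with themselves
    return body if closer == expected else match.group(0)

def strip_pairs(text):
    return WORD.sub(unwrap, text)

print(strip_pairs("(as-df) [x] {y} 'as-df' -asdf- %asdf%s asdf-asdf) (asd-f (x] asdf''"))
# as-df x y as-df asdf %asdf%s asdf-asdf) (asd-f (x] asdf''
```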

Nick T
This doesn't do what the OP wants, i.e. `asdf-asdf)` must not be altered.
M42
@M42 somewhat fixed, with a caveat
Nick T
`(` and `)` are different characters.
Nick T