Hi,

I need to get rid of trailing repeated non-alphanumeric symbols, for example turning

"thanks ! !!!!!!!" into "thanks !"

If the symbols are different from each other, they should be left alone.

I am quite new to regex, so I came up with

regexp_replace('Thanks . . . .  ', '((\\W)\\s*)(\\2\\s*)+', '\\2')

to try it out.

However, I realised that the space after 'Thanks' causes a problem: I get back "Thanks " instead of "Thanks .". This is strange, because I tried the pattern in an online regex tool and the first whitespace was not matched. Can anyone help?

Note: I did insert the double backslashes.
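
For comparison, here is a minimal sketch of the same pattern run through Python's `re` module. Python's engine is not the one PostgreSQL uses, so this only illustrates the behaviour the online tool showed; it does not explain the PostgreSQL result. The names `QUESTION_PATTERN` and `collapse_first_run` are made up for this illustration.

```python
import re

# The pattern from the question, written with single backslashes
# (a Python raw string instead of the SQL literal's doubled backslashes).
QUESTION_PATTERN = r'((\W)\s*)(\2\s*)+'


def collapse_first_run(text: str) -> str:
    # count=1 replaces only the first match, which is what
    # regexp_replace does when no 'g' flag is passed.
    return re.sub(QUESTION_PATTERN, r'\2', text, count=1)


sample = 'Thanks . . . .  '
match = re.search(QUESTION_PATTERN, sample)

# Index 6 is the space after "Thanks"; the match begins at index 7,
# on the first '.', so that space is not part of the match.
print(match.start(), repr(match.group(0)))
print(repr(collapse_first_run(sample)))
```

In this engine the space after "Thanks" cannot start a match, because the backreference `\2` would then have to find a second space immediately after it and finds a '.' instead.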

+2  A: 

Replace

(\W)(\s*\1)+

with

\1

I don't know PostgreSQL, but from your example, I'm guessing:

regexp_replace('Thanks . . . .  ', '(\\W)(\\s*\\1)+', '\\1')

This will also collapse a run of multiple spaces into a single space, because \W matches whitespace as well. If you don't want that (i.e. if you want such spaces to be left alone entirely), then use

([^\s\w])(\s*\1)+   // '([^\\s\\w])(\\s*\\1)+'

instead.
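
The difference between the two patterns can be tried out with a short sketch in Python's `re` module. This assumes the two engines treat these particular constructs alike, which is likely but not something this example checks against PostgreSQL. The names `ANY_NON_WORD`, `NON_WORD_NON_SPACE` and `collapse` are hypothetical.

```python
import re

# First pattern: any non-word character, so runs of spaces are collapsed too.
ANY_NON_WORD = r'(\W)(\s*\1)+'
# Second pattern: non-word AND non-space, so whitespace runs are left alone.
NON_WORD_NON_SPACE = r'([^\s\w])(\s*\1)+'


def collapse(pattern: str, text: str) -> str:
    # re.sub replaces every match, comparable to passing the 'g' flag
    # to regexp_replace.
    return re.sub(pattern, r'\1', text)


print(repr(collapse(ANY_NON_WORD, 'thanks ! !!!!!!!')))   # 'thanks !'
print(repr(collapse(ANY_NON_WORD, 'a   b!!')))            # 'a b!'
print(repr(collapse(NON_WORD_NON_SPACE, 'a   b!!')))      # 'a   b!'
print(repr(collapse(ANY_NON_WORD, 'wow ?!')))             # 'wow ?!' (different symbols, untouched)
```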

Tim Pietzcker
That's how I would do it, too (except I would add another `\s*` to the end to take care of trailing whitespace). But the OP's regex should work--it works fine in RegexBuddy in "Tcl ARE/PostgreSQL" mode.
Alan Moore
@Alan, @Tim: works great, thanks!
goh
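
The trailing `\s*` suggested in the first comment can be sketched the same way, again in Python's `re` rather than PostgreSQL, so treat it as an approximation. `WITH_TRAILING_SPACE` and `collapse_and_trim` are names invented for this example.

```python
import re

# The answer's pattern with an extra \s* at the end, so whitespace
# following the repeated symbols is consumed by the match as well.
WITH_TRAILING_SPACE = r'(\W)(\s*\1)+\s*'


def collapse_and_trim(text: str) -> str:
    # count=1 replaces only the first match, like regexp_replace
    # without the 'g' flag.
    return re.sub(WITH_TRAILING_SPACE, r'\1', text, count=1)


print(repr(collapse_and_trim('Thanks . . . .  ')))   # 'Thanks .'
print(repr(collapse_and_trim('wait... what')))       # 'wait.what'
```

The second call shows the trade-off: the extra `\s*` also swallows whitespace between a repeated run and any text that follows it, so this variant fits best when the symbols really are at the end of the string.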
A: 

Try it like this:

select regexp_replace('Thanks ! !!!!!!!!', '(\\s*)((\\W)\\s*)(\\2\\s*)+', '\\1\\2');

Result:

Thanks !
tinychen