The raw string looks like this:

{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\froman\fcharset0 Times New Roman;}{\f1\fnil\fcharset0 MS Shell Dlg 2;}}
\viewkind4\uc1\pard\sb100\sa100\f0\fs24\u30340?\u27494?\u35013?\u20998?\u23376?\u65292?23\u26085?\u22312?\u33778?\u24459?\u23486?\u21335?\u37096?\u30340?\u39532?\u20140?\par
\pard\f1\fs17\par
by: lena (11/26/09)\par
\par
}

What regex pattern would replace every RTF tag (everything introduced by a backslash) with an empty string "", except the \u<number> escapes? The result should look like:

\u30340?\u27494?\u35013?\u20998?\u23376?\u65292?23\u26085?\u22312?\u33778?\u24459?\u23486?\u21335?\u37096?\u30340?\u39532?\u20140?
by: lena (11/26/09)

I tried "\\\\\\w+|\\{.*?\\}|\\}", which removes everything that follows a backslash as well as all curly braces. The missing piece is an exception for \u, something like \\!(\\\\u).

Thanks.

A: 

Try matching the tags you want to keep first, capturing them, and replacing every match with the captured group.

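# the final [{}] alternative removes braces left over from nested groups

# php (single-quoted, so PCRE sees \\u: a literal backslash followed by "u")
$str = preg_replace('/(\\\u[\d]+)|\\\+[\w\?]+|{.*?}|[{}]/', '$1', $str);

# perl (\u is a case-modifying escape inside a Perl pattern, so use \\u here; braces are escaped)
$str =~ s/(\\u\d+)|\\+[\w?]+|\{.*?\}|[{}]/$1/g;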
Rob
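I meant replacing them with themselves. The first alternative, `(\\\u[\d]+)`, captures the \u tags you want to keep, and the replacement $1 puts that captured text back. The other alternatives capture nothing, so whatever they match is replaced with an empty string.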
Rob
Sorry, I'm a bit confused: I'm coding in C++ and am not very familiar with PHP or Perl.
val
Ah, OK. Thanks Rob, I will try that.
val
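
For C++, the same capture-and-keep idea could be sketched with std::regex (C++11) roughly as follows. This is only a minimal sketch, not code from the thread: the function name strip_rtf_keep_unicode is hypothetical, the pattern adds a final [{}] alternative to drop stray braces, and it assumes each brace group sits on a single line, as in the sample above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original thread): removes RTF control
// words and single-line brace groups, but keeps \uNNNN escapes.
std::string strip_rtf_keep_unicode(const std::string& rtf) {
    // Raw string literal, so backslashes reach the regex engine unchanged.
    // Alternatives are tried left to right:
    //   (\\u\d+)   a \u escape followed by digits, captured as group 1
    //   \\+[\w?]+  any other control word such as \par or \fs24
    //   \{.*?\}    a brace group up to the first closing brace on the line
    //   [{}]       any brace left over from nested groups
    static const std::regex pattern(R"((\\u\d+)|\\+[\w?]+|\{.*?\}|[{}])");

    // "$1" expands to group 1 when the \u alternative matched and to an
    // empty string otherwise, so only the \u escapes survive.
    return std::regex_replace(rtf, pattern, "$1");
}
```

Calling strip_rtf_keep_unicode on the raw string should leave the \u escapes, the "?" placeholders, and the plain text, plus the original line breaks. The raw string literal also avoids the doubled backslashes needed in an ordinary C++ string literal.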