Hello,

I am working with a data set that needs to be scrubbed. I want to replace the question marks (?) with the em-dash code (—). Here is an example string:

"...shut it down?after taking a couple of..."

I can match that instance with this expression: \w\?\w. However, the match includes one character on either side of the question mark, so the replacement swallows those characters too:

"...shut it dow—fter taking a couple of..."

How can I match just the pattern while only replacing the question mark?

Thanks in advance, Jason

+3  A: 

If it is PHP (I'm basing that on other questions you have asked), this should do it:

$str = preg_replace('/(\w)\?(\w)/i', '\\1—\\2', $str);
Sean Bright
Yes, this particular instance I was working in php and that match works perfectly! Thanks!
JasonBartholme
+2  A: 

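It's hard to answer without knowing which technology you are using. If you are writing JavaScript, this will do it (the g flag replaces every occurrence rather than only the first, and the result has to be assigned because strings are immutable):

inputStr = inputStr.replace(/(\w)\?(\w)/g, '$1—$2');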
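To see how the capture-group approach behaves with and without the global flag, here is a minimal JavaScript sketch. The helper names replaceFirst and replaceAll and the sample string are hypothetical, and the replacement text &mdash; is an assumption about the desired output.

```javascript
// Capture the word characters on both sides of "?" and write them back
// around the replacement, so only the question mark is effectively replaced.

// Without the g flag, only the first match is replaced.
function replaceFirst(text) {
  return text.replace(/(\w)\?(\w)/, "$1&mdash;$2");
}

// With the g flag, every non-overlapping match is replaced.
function replaceAll(text) {
  return text.replace(/(\w)\?(\w)/g, "$1&mdash;$2");
}

const captureSample = "shut it down?after a couple?of tries";
console.log(replaceFirst(captureSample));
console.log(replaceAll(captureSample));
```

One caveat of this approach: the captured characters are consumed by each match, so in a string like "a?b?c" the second question mark is left alone because the "b" was already part of the first match.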
RaYell
+3  A: 

If the language you are using supports lookarounds, you could use them to make sure your question mark is surrounded by word characters, but not actually capture them:

/(?<=\w)\?(?=\w)/

The (?<=\w) is a lookbehind (the engine looks "behind", that is, before a potential match) and the (?=\w) is a lookahead (the engine looks ahead). Lookarounds are zero-width assertions: they check the surrounding text but are not part of the match, so in your case only the question mark is matched, and then you can replace it.

In PHP, for example, you could thus do:

$string = "...shut it down?after taking a couple of...";
$result = preg_replace('/(?<=\w)\?(?=\w)/', "&mdash;", $string);
// $result is now: ...shut it down&mdash;after taking a couple of...

Lookarounds are supported by PCRE-based (Perl-compatible) regular expression engines. Support elsewhere varies: Ruby 1.8 doesn't support lookbehinds (Ruby 1.9 and later do), and JavaScript only gained lookbehinds in ES2018.
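For comparison, the same lookaround pattern can be sketched in JavaScript, assuming an engine with lookbehind support (ES2018 or later, such as a recent Node.js). The function name replaceInnerQuestionMarks and the sample strings are hypothetical, chosen only for illustration.

```javascript
// Replace only a "?" that sits directly between two word characters.
// (?<=\w) and (?=\w) assert the context without consuming it, so the
// neighbouring characters stay available for the next match.
function replaceInnerQuestionMarks(text, replacement = "&mdash;") {
  return text.replace(/(?<=\w)\?(?=\w)/g, replacement);
}

const lookaroundSample = "...shut it down?after taking a couple of... Really?";
console.log(replaceInnerQuestionMarks(lookaroundSample));
// The trailing "Really?" is untouched: nothing follows that question mark.

console.log(replaceInnerQuestionMarks("a?b?c"));
// Both question marks are replaced, because "b" is never consumed.
```

Because nothing around the question mark is consumed, this version also handles back-to-back cases such as "a?b?c", which a pattern that captures the neighbouring characters would only partly replace.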

Daniel Vandersluis
This works just as well as Sean Bright's pattern. PHP does support lookarounds, and I will try to use this method in other patterns I need to match soon.
JasonBartholme
+2  A: 

Use: /\b\?\b/

\b matches word boundaries, which seems to be what you're after.
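A quick way to check the word-boundary pattern is a small JavaScript sketch like the one below. The name replaceWithBoundaries and the test strings are hypothetical. Since "?" is not a word character, a \b next to it can only succeed when the character on that side is a word character, which is why the pattern skips ordinary sentence-ending question marks.

```javascript
// \b is zero-width: it matches the position between a word character
// and a non-word character (or the string edge next to a word character).
// Around "?", that means both neighbours must be word characters.
function replaceWithBoundaries(text, replacement = "&mdash;") {
  return text.replace(/\b\?\b/g, replacement);
}

console.log(replaceWithBoundaries("...shut it down?after taking a couple of..."));
console.log(replaceWithBoundaries("Why? Because it works."));
console.log(replaceWithBoundaries("?leading and trailing?"));
```

Only the first string changes. In the second, the question mark is followed by a space, and in the third the question marks sit at the edges of the string, so at least one of the two \b assertions fails in each case.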

Paul Biggar