ansaurus

Question

Remove periods from the end of markdown paragraphs

Answer 1

+1 A:

(?<!\.[a-zA-Z]|etc|\.\.)\.(?=\n{2,}|\Z)

(?<!\.[a-zA-Z]|etc|\.\.) - negative lookbehind: the period must not be preceded by a sequence like .T, etc, or .. (an ellipsis).
\. - the period itself
(?=\n{2,}|\Z) - lookahead for the end of a markdown paragraph (two or more newlines, or end of string)
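
The same breakdown can be written inline with Ruby's extended (`x`) regex mode, which ignores whitespace and allows comments inside the pattern. This is only a sketch: the method name `strip_paragraph_periods` is made up for illustration, and it needs a Ruby with lookbehind support (1.9 or later).

```ruby
# Hypothetical helper; the pattern is the one from this answer, spread out
# with the x flag so each part carries its own comment.
def strip_paragraph_periods(text)
  final_period = /
    (?<! \.[a-zA-Z] | etc | \.\. )  # not preceded by .X, etc, or ..
    \.                              # the period itself
    (?= \n{2,} | \Z )               # followed by a blank line or end of string
  /x
  text.gsub(final_period, "")
end
```

Calling `strip_paragraph_periods(s)` on the test string below should give the same result as the one-line `gsub`.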

Test:

s = """this is a paragraph.

this ends with an ellipsis...

this ends with etc.

this ends with B.I.G.

this ends with e.g.

this should be replaced.

this is end of text."""
puts s.gsub(/(?<!\.[a-zA-Z]|etc|\.\.)\.(?=\n{2,}|\Z)/, "")

Output:

this is a paragraph

this ends with an ellipsis...

this ends with etc.

this ends with B.I.G.

this ends with e.g.

this should be replaced

this is end of text

Amarghosh 2010-07-20 04:27:42

Perfect! (Only my version of Ruby (1.8.7) doesn't support lookbehinds! Argh!)

Horace Loeb 2010-07-20 05:55:20

@Horace 1.9.1p129 does.

Amarghosh 2010-07-20 06:05:39

Is there any way to do this without a lookbehind? Even with more than 1 regular expression (I can't upgrade Ruby right now)?

Horace Loeb 2010-07-20 15:50:31

@Horace I haven't tested this, but you can replace `(\.[a-zA-Z]|etc|\.\.)\.(?=\n{2,}|\Z)` with `"\\1"`

Amarghosh 2010-07-22 12:53:48

Close, but it does the *opposite* of what we want (i.e., removes periods when part of ellipses, acronyms, etc). See http://pastie.org/1056316 (what does `"\\1"` mean?)

Horace Loeb 2010-07-23 00:25:03
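
On the `"\\1"` question in the comments above: in a replacement string, `"\\1"` is a backreference to whatever the first capture group matched. The suggested substitution therefore keeps the protected prefix and drops the period after it, which is why it does the opposite of what was wanted. One lookbehind-free variant that may work is to make that group optional and decide inside a `gsub` block. This is a sketch, not tested on 1.8.7, and the method name `strip_periods_without_lookbehind` is made up for illustration.

```ruby
# Hypothetical helper: the same protected prefixes as the answer's lookbehind,
# but captured in an optional group instead.
def strip_periods_without_lookbehind(text)
  paragraph_end = /(\.[a-zA-Z]|etc|\.\.)?\.(?=\n{2,}|\Z)/
  text.gsub(paragraph_end) do |match|
    # $1 is set only when a protected prefix (.X, etc, ..) was captured.
    # In that case put the whole match back; otherwise drop the lone period.
    $1 ? match : ""
  end
end
```

The block form is used because the decision depends on whether group 1 took part in the match, which a plain replacement string cannot express.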

Answer 2

A:

A Ruby 1.8.7 compatible algorithm:

s = %{this is a paragraph.

this ends with an ellipsis...

this ends with etc.

this ends with B.I.G.

this ends with e.g.

this should be replaced.

this is end of text.}.strip

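a = s.split(/\n{2,}/).each do |paragraph|
  # Skip paragraphs that do not end with a period.
  next unless paragraph =~ /\.\Z/
  # Skip protected endings: acronyms (.X.), "etc." and ellipses (...).
  next if paragraph =~ /(\.[a-zA-Z]|etc|\.\.)\.\Z/
  # Drop the final period in place; each returns the mutated array.
  paragraph.chop!
end.join("\n\n")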

>> puts a
this is a paragraph

this ends with an ellipsis...

this ends with etc.

this ends with B.I.G.

this ends with e.g.

this should be replaced

this is end of text

Horace Loeb 2010-07-27 20:57:50
