I want to remove the dots in acronyms, but not the dots in domain names, in a Python string. For example, I want the string
'a.b.c. [email protected] http://www.test.com'
to become
'abc [email protected] http://www.test.com'
The closest regex I made so far is
re.sub('(?:\s|\A).{1}\.',lambda s: s.group()[0:2], s)
which results in
'ab.c. [email protected] http://www.test.com'
It seems that to make this work, I would need to change the regex to
(?:\s|\A|\G).{1}\.
but Python's re module has no anchor for the end of the previous match (\G).
EDIT: As I mentioned in my comment, the strings have no specific formatting. They contain informal human conversations, so each may contain zero, one, or several acronyms or domain names. A few errors are fine by me if that saves me from coding a "real" parser.
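One direction that might sidestep the need for \G is to match a whole acronym as a single token and strip its dots in a callback, rather than chaining one letter-dot match to the next. The following is only a rough sketch under some assumptions: an acronym is taken to be two or more "single ASCII letter + dot" pairs standing alone between whitespace, the names ACRONYM and strip_acronym_dots are made up for illustration, and user@example.com is a placeholder address.

```python
import re

# Assumed definition of an acronym: at least two "letter + dot" pairs,
# not preceded and not followed by a non-whitespace character.
ACRONYM = re.compile(r'(?<!\S)(?:[A-Za-z]\.){2,}(?!\S)')

def strip_acronym_dots(text):
    """Remove the dots from every token that looks like an acronym."""
    # The callback receives the whole acronym (e.g. 'a.b.c.') and
    # returns it with the dots removed.
    return ACRONYM.sub(lambda m: m.group().replace('.', ''), text)

print(strip_acronym_dots('a.b.c. user@example.com http://www.test.com'))
```

Because the pattern requires whitespace or a string boundary on both sides, tokens such as www.test.com are left alone. The trade-off is that an acronym without a trailing dot (a.b.c) or one followed directly by punctuation would be missed, which may or may not fall within the acceptable error rate.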