ansaurus

Question

Answer 1

A:

I'm afraid Notepad++ regex cannot do that.

Notepad++ uses the Scintilla regex engine, which works line by line, so a multiline search/replace cannot be done.

Note that \r and \n are never matched because in Scintilla, regular expression searches are made line per line (stripped of end-of-line chars).
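
To illustrate why a line-per-line search can never match a line break, here is a minimal Python sketch. Python's `re` module is not the Scintilla engine; the sample text and variable names are made up for illustration only.

```python
import re

# Hypothetical sample: an attribute list split across two CR+LF-terminated lines.
text = '<td class="x"\r\n done1="2">'

# A pattern that needs to see the line break itself.
pattern = re.compile(r'"\r\n done[0-9]+')

# Line-per-line search: each line is stripped of its end-of-line characters
# before matching, so a pattern containing \r or \n can never match.
per_line_hits = [line for line in text.splitlines() if pattern.search(line)]

# Whole-text search: the line break is part of the subject string, so it matches.
whole_text_hit = pattern.search(text) is not None

print(per_line_hits)   # []
print(whole_text_hit)  # True
```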

S.Mark 2010-04-21 08:45:32

Answer 2

A:

I like Notepad++ too but the regexing is really a pain. If you insist on using Notepad++ try this:

First find out which newline characters are being used in your document (View>Show Symbol>Show End Of Line).
Delete those line breaks by replacing them with a single space (use Search and Replace; CR is \r and LF is \n. Be sure to tick the "Extended" search mode).
Regex-replace done[0-9][0-9]*=\"[0-9][0-9]*\" with the empty string (be sure to put a single space before the regex).

Voila! Not very nice and clean, but it works ;o)
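
The same two replacements can be sketched outside Notepad++, for example in Python. This is only an illustration under assumptions: the sample HTML is invented, CR+LF line ends are assumed, and Python's `re` stands in for the Notepad++ regex mode.

```python
import re

# Hypothetical sample with CR+LF line ends; the real document will differ.
html = '<td class="x"\r\ndone12="34"\r\ndone5="6">cell</td>'

# Replace every line break with a single space ("Extended" mode in Notepad++).
flat = html.replace('\r\n', ' ')

# Remove each ' doneN="M"' attribute, including the single leading space.
cleaned = re.sub(r' done[0-9][0-9]*="[0-9][0-9]*"', '', flat)

print(flat)     # <td class="x" done12="34" done5="6">cell</td>
print(cleaned)  # <td class="x">cell</td>
```

Note that this flattens the whole document onto one line, which is why a tidy-up pass afterwards is useful.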

After that if you want it human-readable again you could use the HTMLTidy functions

das_weezul 2010-04-21 09:03:11

Answer 3

A:

You almost had it! Unfortunately, the complete solution in Notepad++ would have to be a 3 step process.

Regex search/replace with the following search: \<done[0-9]+="[0-9]+"[ ]* Of course, leave the replace field empty, so that it will simply delete everything that matches. (In Notepad++'s flavor of regular expressions, \< represents the "beginning of a word".)
Select the portion of text affected by your previous search/replace. You don't want to select the entirety of your document, because we're going to...
Strip newlines. Hit Ctrl-F to bring up the Search/Replace dialog again and this time select "Extended" search mode, instead of "Regular expression". Depending on the format of your document you are going to want to search for either \n or \r\n. The replacement field should, again, be empty. Also, make sure that the "In Selection" checkbox is checked.

Click "Replace All" and you're done!
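
For comparison, the two replacements could look like this in Python. It is a rough sketch, not the Notepad++ behaviour itself: Python's `re` has no `\<`, so `\b` (word boundary) is used as the closest equivalent, the sample HTML is invented, and the "In Selection" restriction is not modelled.

```python
import re

# Hypothetical sample with CR+LF line ends.
html = '<td class="x" done12="34" \r\ndone5="6" undone7="8">'

# Step 1: delete every done attribute plus any trailing spaces.
# \b keeps "undone7" intact because there is no word boundary inside "undone".
without_done = re.sub(r'\bdone[0-9]+="[0-9]+"[ ]*', '', html)

# Step 3: strip the line breaks (here applied to the whole string).
joined = without_done.replace('\r\n', '')

print(repr(without_done))  # '<td class="x" \r\nundone7="8">'
print(joined)              # <td class="x" undone7="8">
```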

kurige 2010-04-21 09:22:06

Answer 4

+3 A:

Extended Replace "\n" with "LINEBREAK "

Thanks a lot to all for these timely replies. Following your advice, here's what I did:

"Notepad++ > View > Show Symbol > Show End Of Line" shows "CR+LF" at each line end.
"Notepad++ > Search > Find", "Search mode" = "Normal", made sure that "Find what" = "LINEBREAK" finds nothing
"Search mode" = "Extended", "Find what" = "\n\r" only finds the double-breaks (CR + LF + a blank line); "\n \r" finds nothing; yet "\n" does find exactly all line breaks, and only them.
Saving my "Towncar.htm" test file as "Towncar_02.htm" (also encoded in ANSI)
Under "Extended", replaced all "\n" with "LINEBREAK " (notice the trailing space)
Under "Regular expression", replaced each occurrence of:
```
 done[0-9]*="[0-9]*"
```

(Be careful to check that there is A LEADING SPACE before "done"
and that there is NO TRAILING SPACE! See below.)

with an empty string

Under "Extended", replaced each occurrence of "LINEBREAK" with "\n" (no trailing space this time after "LINEBREAK"!)
Checked that the resulting "Towncar.htm" file (after a little cosmetic reformatting) looked OK and pretty, and that after refresh, it still rendered the same as the "Towncar_02.htm" backup.
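
The placeholder round trip described in these steps can be sketched in Python as follows. This is an illustration under assumptions, not the exact Notepad++ behaviour: the sample HTML is invented, CR+LF line ends are assumed, and Python's `re` replaces the "Regular expression" mode.

```python
import re

PLACEHOLDER = 'LINEBREAK'

# Hypothetical sample with CR+LF line ends.
html = '<tr>\r\n<td class="x"\r\ndone12="34">cell</td>'

# Make sure the placeholder does not already occur in the document.
if PLACEHOLDER in html:
    raise ValueError('placeholder already present in the document')

# Replace each \n with the placeholder plus a trailing space, so that an
# attribute at the start of a line gains the leading space the regex expects.
marked = html.replace('\n', PLACEHOLDER + ' ')

# Remove every ' doneN="M"' (leading space included, no trailing space).
stripped = re.sub(r' done[0-9]*="[0-9]*"', '', marked)

# Put the line breaks back (no trailing space after the placeholder this time).
restored = stripped.replace(PLACEHOLDER, '\n')

print(repr(restored))  # '<tr>\r\n <td class="x"\r\n>cell</td>'
```

Lines that did not start with a done attribute keep the extra leading space, which is one reason some cosmetic reformatting is needed afterwards.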

Reminders and Notes:

This forum apparently works well in Chrome 4; but with some browsers (e.g. IE6 and other discontinued ones), under some circumstances, it causes some artifacts; so, be careful:
even if the forum doesn't show it in your browser, there is a leading space at the beginning of the regex (the " done..." regular expression above), and it is part of the regex, so that only strings starting with " done" (with the leading space) are replaced; this makes doubly sure NOT to alter any other strings containing "undone" or "methadone" or the like
in the same way, even if the forum shows one in your browser, there is no trailing space at the end of the regex!
in the regex, [0-9] matches 1 and only 1 occurrence of any decimal digit (characters in the 0-9 range); IOW it matches « 0 » or « 1 » or « 9 » etc., but NOT « 01 » or « 835 » or « » (the empty string) or anything else.
* (asterisk) matches the previous character or character class 0 or more times (here, [0-9]* matches the empty string or any string made exclusively of digits)
likewise, + (plus sign) matches the previous character or character class 1 or more times (here, [0-9]+ matches any string, at least 1 character long, made exclusively of digits)
Ref: http://sourceforge.net/apps/mediawiki/notepad-plus/index.php?title=Regular_Expressions#Notepad.2B.2B_regex_syntax
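
These notes can be checked quickly with Python's `re` module. This is a small sketch: the helper name `fully_matches` and the sample string are made up for illustration, and Python's engine is assumed to agree with Notepad++ on these basic constructs.

```python
import re

def fully_matches(pattern: str, s: str) -> bool:
    """Return True if the whole string s matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, s) is not None

# [0-9] matches exactly one digit.
print(fully_matches(r'[0-9]', '7'))     # True
print(fully_matches(r'[0-9]', '01'))    # False
print(fully_matches(r'[0-9]', ''))      # False

# [0-9]* matches zero or more digits, [0-9]+ needs at least one.
print(fully_matches(r'[0-9]*', ''))     # True
print(fully_matches(r'[0-9]*', '835'))  # True
print(fully_matches(r'[0-9]+', ''))     # False
print(fully_matches(r'[0-9]+', '835'))  # True

# The leading space protects "undone" and "methadone".
sample = '<td undone1="2" methadone3="4" done5="6">'
cleaned = re.sub(r' done[0-9]*="[0-9]*"', '', sample)
print(cleaned)  # <td undone1="2" methadone3="4">
```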

Again, many thanks to all 3!

Versailles, Wed 21 Apr 2010 16:20:45 +0200,
edited (correcting small display errors) Fri 23 Apr 19:47

Michel Merlin 2010-04-21 14:20:45

Wow, this is a very well-written and detailed answer. Impressive! I've voted on both question and answer.

S.Mark 2010-04-21 14:27:21
