I have been fighting this problem with the help of a regex cheat sheet, trying to figure out how to do this, but I give up... I have a lengthy file open in Notepad++ and would like to remove all lines that do not start with a digit (0..9). I would use the Find/Replace functionality of N++; I mention this only because I am not sure which regex implementation N++ uses... Thank you

Example. From the following text:

1hello
foo
2world
bar
3!

I would like to extract

1hello
2world
3!

not:

1hello

2world

3!

by doing a find/replace on a regular expression.

+3  A: 

[^0-9] is a regular expression that matches any single character except a digit. If you write ^[^0-9], you "anchor" it to the start of the line in most regular expression systems. If you want to include the rest of the line, use ^[^0-9].* (with .+ instead of .*, a line consisting of a single non-digit character would not match).
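
To see what the anchored pattern selects outside Notepad++, here is a minimal sketch using Python's `re` module. The sample text is taken from the question; the variable names are only illustrative, and Python's engine is not the one N++ uses, so treat this as a way to check the pattern itself.

```python
import re

sample = "1hello\nfoo\n2world\nbar\n3!\n"

# re.MULTILINE lets ^ anchor at the start of every line, not just the string.
non_digit_line = re.compile(r"^[^0-9].*", re.MULTILINE)

# The lines the pattern selects.
matched = non_digit_line.findall(sample)

# Replacing each match with nothing empties the line but keeps its line break.
blanked = non_digit_line.sub("", sample)

print(matched)
print(repr(blanked))
```

The pattern never consumes the line break, so the emptied lines remain as blank lines.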

unwind
also worked but left a whole lot of blank lines. How can I capture the line break too?
Peter Perháč
Have you tried adding `[\r\n]*` at the end of your expression?
PP
It looks like this works only in "extended mode" in np++, but not in regex mode.
moxn
No change to the final result; the blank lines are still there.
Peter Perháč
+2  A: 

^[^\d].* marks a whole line whose first character is not a digit. Check that there is really no whitespace in front of the digits; otherwise you'd have to use a different expression.

UPDATE: You will have to do it in two steps. First empty the lines that do not start with a digit. Then remove the empty lines in extended mode.
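
The two steps can be sketched in Python, assuming the text uses plain `\n` line endings. `keep_digit_lines` is a hypothetical helper name, and this mirrors the idea rather than what N++ does internally.

```python
import re

def keep_digit_lines(text: str) -> str:
    """Hypothetical helper: keep only the lines that start with a digit."""
    # Step 1: empty every line whose first character is not a digit.
    # [^\d\n] stops the first character from matching a line break.
    blanked = re.sub(r"^[^\d\n].*", "", text, flags=re.MULTILINE)
    # Step 2: collapse each run of line breaks into one, dropping empty lines.
    collapsed = re.sub(r"\n{2,}", "\n", blanked)
    # A non-digit first line leaves a leading line break behind; remove it.
    return collapsed.lstrip("\n")

result = keep_digit_lines("1hello\nfoo\n2world\nbar\n3!\n")
print(repr(result))
```

Note that step 2 also removes lines that were already blank in the input, just as the extended-mode replacement would.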

moxn
this worked as far as finding all lines not starting with a digit, but when I did a search/replace, searching for ^[^\d].* replacing it with nothing, I am still left with a lot of blank lines. How would I have your regex capture the line break too?
Peter Perháč
You could first empty all the lines that do not start with a digit. Then you could switch to the "extended mode". I tested it and it works for me to find line breaks here with `\r\n`. Replace them with nothing then.
moxn
yeah, that's a time-tested way to do it, but I was hoping this could all be done in a single step. Maybe if I were using some other editor, but it's npp, and so I'll be content with doing this in two steps :) cheers
Peter Perháč
+1  A: 

I'm not sure what you are asking, but the regular expression for finding the lines with a digit at the beginning would be ^\d.* You can keep all the lines that match the above, or alternatively remove all the lines that match this expression: ^[^\d].*

michelle
I am trying to figure out which part of my question is not clear. I would edit it, but I think I am quite clear in asking how to `remove lines not starting with a digit`?
Peter Perháč
What was not clear to me was how you are trying to remove the lines. Now I understand you are using search and replace. Try searching for ^[^\d].* and replacing with \b (which is the backspace character), or alternatively (and this worked for me in the past) search for ^[^\d].*\R and replace with nothing (the R must be capital!). If the latter is "greedy", as in it deletes all the lines after the first match, then you can try replacing ^[^\d][^\R]*\R with nothing.
michelle
+2  A: 

You can clear up those lines with ^[^0-9].* but it will leave blank lines.

Notepad++ uses Scintilla and also uses its regex engine for these matches.

\r and \n are never matched because in Scintilla, regular expression searches are made line by line (stripped of end-of-line chars).
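
As a rough illustration of why a line-by-line search cannot see the line breaks, here is a small Python simulation. It is not Scintilla's code; `splitlines()` merely stands in for handing each line to the engine without its end-of-line characters.

```python
import re

text = "1hello\r\nfoo\r\n2world\r\n"

# Whole-buffer search: the line breaks are part of the searched text.
whole_buffer_hits = re.findall(r"\r\n", text)

# Line-by-line search: each line reaches the engine without its
# end-of-line characters, so a pattern containing \r or \n cannot match.
lines = text.splitlines()  # splitlines() drops the line endings
per_line_hits = [hit for line in lines for hit in re.findall(r"\r\n", line)]

print(len(whole_buffer_hits), len(per_line_hits))
```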

http://www.scintilla.org/SciTERegEx.html

To clear up those blank lines, the only way is to choose extended mode and replace \n\n with \n. If the file uses Windows line endings, replace \r\n\r\n with \r\n instead.
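
One thing to watch: a single pass of a plain text replace is non-overlapping, so three line breaks in a row shrink to two, not one. If the editor's Replace All behaves the same way (an assumption on my part), the replacement has to be repeated until nothing changes. A sketch in Python, with `collapse_blank_lines` as a hypothetical helper name:

```python
def collapse_blank_lines(text: str, eol: str = "\n") -> str:
    """Hypothetical helper: repeat 'replace eol+eol with eol' until stable."""
    doubled = eol + eol
    # One pass of str.replace is non-overlapping, so "\n\n\n" becomes "\n\n";
    # keep going until no doubled line break is left.
    while doubled in text:
        text = text.replace(doubled, eol)
    return text

unix_result = collapse_blank_lines("1hello\n\n\n2world\n\n3!\n")
windows_result = collapse_blank_lines("1hello\r\n\r\n2world\r\n", eol="\r\n")
print(repr(unix_result), repr(windows_result))
```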

S.Mark
Oh, okay, I think I will be satisfied by this explanation.
Peter Perháč