I am trying to create a batch file that will edit a text file to remove lines that contain a certain string and remove the line directly after that. An example of this file would look like this:

LINE ENTRY KEEP_1 BLA BLA
END
LINE ENTRY REMOVE_1 FOO BAR
END
LINE ENTRY REMOVE_2 HELLO WORLD
END
LINE ENTRY KEEP_2 CAT DOG
END

After running the batch script I require the new file to contain

LINE ENTRY KEEP_1 BLA BLA
END
LINE ENTRY KEEP_2 CAT DOG
END

where any line containing REMOVE_ has been deleted, as well as the corresponding 'END' line.

I have tried using the technique found here to remove the lines containing the string, but it does not appear to be possible to include characters such as \r\n in the search so that the following 'END' line is matched as well. I can't do this as two separate FINDSTR commands, because the 'END' lines must be kept for the other two entries.
Using findstr /v REMOVE_ leaves me with the following:

LINE ENTRY KEEP_1 BLA BLA
END
END
END
LINE ENTRY KEEP_2 CAT DOG
END

and using findstr /v "REMOVE_*\r\nEnd" does not seem to work at all. Just to confirm, each line is definitely terminated with \r\n.

Any help on this issue would be greatly appreciated.

A: 

findstr operates line-wise. You cannot do anything with it that spans more than a single line.

In any case, you're in for a world of pain if you do this with batch files. While you certainly can loop through the file and only output certain lines, this would look kinda like the following:

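set remove=
rem "delims=" keeps the whole line in %%x instead of only its first word
for /f "delims=" %%x in (file.txt) do (
  if not defined remove (
    rem a matching line is dropped and flags the next line for removal
    echo %%x|findstr "REMOVE" >nul 2>&1 && set remove=1
    if not defined remove echo.%%x
  ) else (
    rem this is the line after a match: skip it and reset the flag
    set remove=
  )
)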

(untested, but might work). The problem here is twofold: for /f removes any empty lines from the output so if your file had them before you won't have them afterwards. This may or may not be a problem for your specific case. Another problem is that dealing with special characters can get hairy. I give no guarantee that the above works as it should for things like >, <, &, |, ...

Your best bet in this case, if you need to run it on almost any Windows machine, would probably be a VBScript. The string handling capabilities are much more robust there.

Joey
`grep` can take `-A` or `-B` to include lines after and/or before!
mvds
Fine, `findstr` cannot.
Joey
you started about `grep`. Just trying to help dude. Relax.
mvds
A: 

The following batch script should do what you want:

@echo off
setlocal enabledelayedexpansion

set /A REMOVE_COUNT=1

if "%~2"=="" (
    echo Usage: %~n0 search_str file
    echo remove lines that contain a search_str and remove %REMOVE_COUNT% line^(s^) directly after that
    exit /b 1
)

set "SEARCH_STR=%~1"
set "SRC_FILE=%~2"

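set /A SKIP_COUNT=0
rem find /v /n "" prefixes every line with [n]; skip=2 drops the two header lines find prints before the file contents
for /F "skip=2 delims=[] tokens=1,*" %%I in ('find /v /n "" "%SRC_FILE%"') do (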
    if !SKIP_COUNT! EQU 0 (
        set SRC_LINE=%%J
        if defined SRC_LINE (
            if "!SRC_LINE:%SEARCH_STR%=!" == "!SRC_LINE!" (
                echo.!SRC_LINE!
            ) else (
                set /A SKIP_COUNT=%REMOVE_COUNT%
            )
        ) else (
            rem SRC_LINE is empty
            echo.
        )
    ) else (
        set /A SKIP_COUNT-=1
    )
)

The number of lines to be removed after a matched line can be configured by setting the REMOVE_COUNT variable.

The script also handles files with empty lines correctly by using a trick: The find command is used to prefix all lines with line numbers. That way the for command will not skip empty lines.
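
To see the line-numbering trick on its own, here is a minimal sketch that only prints what the for loop receives. The file name sample.txt is a hypothetical placeholder, and the snippet is untested, so treat it as an illustration rather than a drop-in tool.

```batch
@echo off
rem Sketch only: sample.txt is a hypothetical input file.
rem find /v /n "" prints every line prefixed with [n], so an empty line
rem arrives as "[3]" instead of "" and is no longer skipped by for /F.
rem skip=2 drops the blank line and the "---------- FILE" header from find.
for /F "skip=2 delims=[] tokens=1,*" %%I in ('find /v /n "" "sample.txt"') do (
    rem %%I is the line number, %%J is the original text (empty for blank lines)
    echo line %%I: "%%J"
)
```

One caveat of using [ and ] as delimiters: as far as I can tell, a line whose own text starts with [ or ] will lose those leading characters, because for /F treats them as part of the separator in front of the second token.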

sakra