I have a large text file (over 70 MB) and need to count the number of times a character sequence occurs in it. I can find plenty of scripts that do this, but none of them take into account that a sequence can start on one line and finish on another. For the sake of efficiency (I have many more than one file to process), I cannot preprocess the files to remove newlines.
Example: If I am searching for "thisIsTheSequence", the following file would have 3 matches:
asdasdthisIsTheSequence
asdasdasthisIsT
heSequenceasdasdthisIsTheSequ
encesadasdasda
Thanks for the help.
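
For reference, here is a minimal sketch of the behaviour described above, written in Python (the question does not name a language, so that choice is an assumption). It reads the file in chunks, drops line breaks from each chunk in memory, and carries the last len(sequence) - 1 characters over to the next chunk so that a match split across a chunk or line boundary is still found. The function name count_across_lines and the chunk_size parameter are hypothetical. The sketch counts overlapping occurrences and treats both "\n" and "\r" as line breaks, which are assumptions as well.

```python
import io


def count_across_lines(stream, sequence, chunk_size=1 << 20):
    """Count occurrences of `sequence` in a text stream, ignoring line breaks.

    Hypothetical helper: overlapping occurrences are each counted once.
    """
    if not sequence:
        raise ValueError("sequence must be non-empty")
    keep = len(sequence) - 1  # characters that could begin a split match
    tail = ""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Remove line breaks from this chunk only; the file itself is untouched.
        buf = tail + chunk.replace("\n", "").replace("\r", "")
        pos = buf.find(sequence)
        while pos != -1:
            total += 1
            pos = buf.find(sequence, pos + 1)
        # The tail is shorter than the sequence, so no match is counted twice.
        tail = buf[-keep:] if keep else ""
    return total


# The sample file from the question, held in memory for the demonstration.
sample = (
    "asdasdthisIsTheSequence\n"
    "asdasdasthisIsT\n"
    "heSequenceasdasdthisIsTheSequ\n"
    "encesadasdasda\n"
)
print(count_across_lines(io.StringIO(sample), "thisIsTheSequence"))  # prints 3
```

For a real file, the same function could be called as count_across_lines(open(path), "thisIsTheSequence"), where path is a placeholder. Only one chunk plus a short tail is held in memory at a time, so no newline-free copy of the file is ever written.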