ansaurus

Question

GREP - finding all occurrences of a string

Answer 1

A:

I would use sed, not grep! sed performs basic text transformations on an input stream. Try its s/regexp/replacement/ command.
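A minimal sketch of the s/regexp/replacement/ form, for illustration only. The helper name replace_foo, the pattern, and the sample input are all made up here; none of them come from the question.

```shell
# Hypothetical helper: rewrite every "foo" read from stdin to "bar".
# The trailing g flag applies the substitution to every match on a line,
# not just the first one.
replace_foo() {
    sed -e 's/foo/bar/g'
}

# Made-up sample input; lines without a match pass through unchanged.
printf 'foo and foo again\nno match here\n' | replace_foo
```

Without the g flag, only the first foo on each line would be replaced.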

You can also try awk. Its -F option sets the field separator, so passing ; splits each line of your files into fields at the semicolons.
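A rough sketch of what -F does, assuming semicolon-separated input that I invented for the example (first_field is a hypothetical name, too). Note that -F splits each line into fields; it does not change where lines themselves begin and end.

```shell
# Hypothetical helper: print the first ;-separated field of each input line.
first_field() {
    awk -F';' '{ print $1 }'
}

# Made-up sample input with one or more semicolons per line.
printf 'int a = 1; int b = 2;\nreturn a;\n' | first_field
```

Here $1 is everything before the first semicolon on the line, $2 is the text between the first and second, and so on.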

The best solution, however, would be a simple script in Perl or Python.

psihodelia 2009-11-23 20:48:43

sed is what I ended up using. In fact, it is very easy to use, and once I figured out what regular expression I needed, everything fell into place. I simply daisy-chained my commands together:

sed -e s/regexp/replacement/ -e ... -e ... | grep SOME_PATTERN > occurrences

2009-12-28 13:38:27

Answer 2

+1 A:

To address your concern about missing some occurrences, why not filter progressively:

1. Create a text file with all possible matches as a starting point.
2. Use filter X (grep for '^import', for example) to dump probable false positives into a tmp file.
3. Use filter X again to remove those matches from your working file (a copy of [1]).
4. Do a quick visual pass of the tmp file and add any real matches back in.
5. Repeat [2]-[4] with other filters.
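One pass of the steps above could be sketched roughly like this. The helper name split_matches, the file names, and the sample lines are assumptions made for illustration; only the '^import' filter comes from the answer itself.

```shell
# Hypothetical helper: split_matches WORKING FILTER REJECTS
# Moves the lines of WORKING that match FILTER into REJECTS (for the manual
# review pass) and keeps only the non-matching lines in WORKING.
split_matches() {
    # grep exits with status 1 when it selects no lines; treat that as fine
    # and only let real errors (status 2 or higher) count as failures.
    grep -e "$2" "$1" > "$3" || [ "$?" -eq 1 ]
    grep -v -e "$2" "$1" > "$1.next" || [ "$?" -eq 1 ]
    mv "$1.next" "$1"
}

# Made-up starting point standing in for the "all possible matches" file.
dir=$(mktemp -d)
printf 'import java.util.List;\nList<String> xs;\n// List of things\n' > "$dir/all.txt"

# Work on a copy so the original list of candidates stays intact.
cp "$dir/all.txt" "$dir/working.txt"

# One filter pass: probable false positives go to rejects.txt for review.
split_matches "$dir/working.txt" '^import' "$dir/rejects.txt"

cat "$dir/rejects.txt"
```

After looking through rejects.txt by eye, any real matches can be appended back to working.txt, and the same call can be repeated with the next filter.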

This might take some time, of course, but it doesn't sound like this is something you want to get wrong...

grossvogel 2009-11-23 21:07:01

Sounds like a possible winner. I was hoping to find a regular expression that was the magic/easy button.

2009-11-23 21:20:03

I guess the question is what's more valuable to you: wasting an hour manually looking for possible false positives, or wasting an hour getting ripped a new one by your boss because your über-clever regexp missed some crazy convoluted corner case in the Java Language Specification.

Jörg W Mittag 2009-11-23 23:41:19

I came from a mechanical engineering background, so I am aware that mistakes will occur ... I am trying to choose the path that will yield fewer mistakes and better results that are reproducible. A computer can do repetitive tasks without problem, humans on the other hand ... That is why computers exist. I can always tweak my regular expression, it only takes a minute to run; however, manually evaluating this can take days or weeks for the amount of content I'd have to go through and after a day or a few hours, I'm sure I might skip an occurrence or two here and there.

2009-12-07 14:01:38
