Greetings,

I have one file, more or less a greylist. I need to compare the 40 to 50 values in it against a whitelist file and remove any values from the greylist that also exist in the whitelist.

Right now I'm taking each greylist value and comparing it against each value in the whitelist file (which has 1 to 2 thousand values), removing it from the greylist if I find a match, then looping on to the next greylist value.

This seems horribly inefficient, but I'm not sure where to start to do what I'm looking for.

Any ideas?

Thank you very much.

+1  A: 

Can you sort either file? Doing so would allow you to early-exit on your searches, speeding things up a lot - especially if you can sort both, in which case you'd only have to traverse each file once (since you just move ahead in whichever file is currently at a lower value).
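
If both files can be sorted, the single-pass traversal described above is roughly what the standard comm utility does on two sorted inputs. A minimal sketch, assuming one value per line; the file names and sample values are hypothetical:

```shell
# Hypothetical sample data, one value per line.
workdir=$(mktemp -d)
printf '%s\n' delta alpha charlie bravo > "$workdir/greylist.txt"
printf '%s\n' charlie alpha zulu > "$workdir/whitelist.txt"

# comm needs both inputs sorted in the same collation order it uses itself,
# so LC_ALL=C is set for sort and comm alike.
LC_ALL=C sort "$workdir/greylist.txt" > "$workdir/grey.sorted"
LC_ALL=C sort "$workdir/whitelist.txt" > "$workdir/white.sorted"

# comm walks both sorted files once. -2 suppresses lines found only in the
# whitelist, -3 suppresses lines found in both, leaving greylist-only values.
remaining=$(LC_ALL=C comm -23 "$workdir/grey.sorted" "$workdir/white.sorted")
printf '%s\n' "$remaining"

rm -rf "$workdir"
```

The output comes back in sorted order, so this only fits if the original order of the greylist does not matter.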

Amber
Thank you Amber, that won't help in this case - but there is someplace else I can use it.
Chasester
+3  A: 

You could use grep -f for this.

grep -F -v -f whitelist.txt greylist.txt

The values from greylist.txt that are not in whitelist.txt are then written to stdout; you can redirect that to a file if you need to.

The options of grep do the following:

  • -F: Interpret PATTERN as a list of fixed strings. (i.e. do not use regexes)
  • -v: Invert the sense of matching, to select non-matching lines.
  • -f: Obtain patterns from FILE, one per line.

See man grep for details.
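
As a hedged illustration, the sketch below runs the command on hypothetical sample files and redirects the result to a new file. One thing worth checking: -F on its own matches substrings, so a whitelist entry foo also removes a greylist entry foobar. Adding -x (match whole lines only) is an assumption about what is wanted here, not part of the command above.

```shell
# Hypothetical sample data, one value per line.
workdir=$(mktemp -d)
printf '%s\n' foo foobar bar baz > "$workdir/greylist.txt"
printf '%s\n' foo baz qux > "$workdir/whitelist.txt"

# The command as given: fixed strings, inverted match, patterns from a file.
# "foobar" is dropped as well, because it contains the whitelist entry "foo".
substring_result=$(grep -F -v -f "$workdir/whitelist.txt" "$workdir/greylist.txt")

# With -x, a whitelist entry must equal the whole greylist line to remove it.
grep -F -x -v -f "$workdir/whitelist.txt" "$workdir/greylist.txt" > "$workdir/greylist.new"
exact_result=$(cat "$workdir/greylist.new")

printf 'without -x:\n%s\nwith -x:\n%s\n' "$substring_result" "$exact_result"

rm -rf "$workdir"
```

Redirect to a new file rather than back onto greylist.txt: the shell truncates the output file before grep gets to read it.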

Peter van der Heijden