Hello. I have a file of IP addresses called "IPs". When I parse a new IP from my logs, I'd like to see if the new IP is already in file IPs, before I add it. I know how to add the new IP to the file, but I'm having trouble seeing if the new IP is already in the file.

#!/usr/bin/python
from IPy import IP
IP = IP('192.168.1.2')
#f=open(IP('IPs', 'r'))  #This line doesn't work
f=open('IPs', 'r')    # this one doesn't work 
for line in f:
    if IP == line:
       print "Found " +IP +" before"
f.close()

In the file "IPs", each IP address is on its own line, like this:

222.111.222.111
222.111.222.112

I also tried reading the file IPs into an array, but I'm not having much luck with that either. Any ideas?

Thank you,

Gary

+1  A: 

Why do you need this IP thing? Use simple strings.

#!/usr/bin/env python

ip = "192.168.1.2" + "\n" ### Fixed -- see comments
f = open('IPs', 'r')
for line in f:
    if line.count(ip):
       print "Found " + ip
f.close()

Besides, this looks more like a task for grep and friends.

hudolejev
I did this in grep already and it works. But a co-worker suggested I learn Python, and I like Python. About your code: if you put 192.168.1.22 in file IPs, this code says "Found 192.168.1.2", and that's why you need the "from IPy import IP" thing.
Gary
there is a bug indeed but it's easy to fix: replace `line.count(ip)` with `line==ip` and change the definition above to `ip = "192.168.1.2" + "\n"` (better to do that once outside the loop than do strip() or concatenation multiple times inside)
Nas Banov
Yes, my bad, thanks for fixing. `count()` was used to ignore leading spaces, if any. EnTerr's suggestion with `+ "\n"` is the right way to go if you are sure there will be no leading spaces before IPs. Otherwise, consider using regular expressions.
hudolejev
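To make the pitfall from the comments above concrete, here is a small sketch comparing the substring test with an exact line match. It is written to run on Python 3, unlike the Python 2 snippets in this thread. The sample lines and the helper names `found_by_count` and `found_by_equality` are made up for illustration.

```python
# Hypothetical file contents: note that 192.168.1.2 itself is NOT listed.
lines = ["192.168.1.22\n", "10.0.0.1\n"]

def found_by_count(ip, lines):
    # Substring test: true if ip occurs anywhere inside any line.
    return any(line.count(ip) for line in lines)

def found_by_equality(ip, lines):
    # Exact test: the whole line (minus surrounding whitespace) must equal ip.
    return any(line.strip() == ip for line in lines)

print(found_by_count("192.168.1.2", lines))     # substring test on the bare IP
print(found_by_equality("192.168.1.2", lines))  # exact test on the bare IP
```

Appending "\n" to the search string removes this particular false positive, but a substring test can still match the tail of a longer address, so the equality test is the safer choice.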
+3  A: 

iplist = []

# `with` takes care of all the fun file handling stuff (closing, etc.)
with open('ips.txt', 'r') as f:
    for line in f:
        iplist.append(line.strip())   # Gets rid of the newlines at the end

# For Python versions < 2.6, use this instead of the `with` block above
f = open('ips.txt', 'r')
for line in f:
    iplist.append(line.strip())
f.close()

newip = '192.168.1.2'

if newip not in iplist:
    f = open('ips.txt', 'a') # append mode, please
    f.write(newip+'\n')
    f.close()                # close the file so the new line is flushed to disk

Now you have your IPs in a list (iplist), and you can easily add your newip to it with iplist.append(newip) or do anything else you please.
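Pulling the steps above into one place, here is a minimal sketch written for Python 3 (so it relies on `with`, which the thread notes is unavailable on older versions). The function name `add_ip_if_new` and the temporary demo file are illustrative only, and the `set` is a swap-in for the list, as suggested in the comments.

```python
import os
import tempfile

def add_ip_if_new(path, newip):
    """Append newip to the file at path unless it is already listed.

    Returns True if the IP was added, False if it was already present.
    """
    with open(path, 'r') as f:
        known = set(line.strip() for line in f)   # set gives fast membership tests
    if newip in known:
        return False
    with open(path, 'a') as f:                    # append mode
        f.write(newip + '\n')
    return True

# Demo on a throwaway file so no real data is touched.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path, 'w') as f:
    f.write('222.111.222.111\n222.111.222.112\n')

first = add_ip_if_new(demo_path, '192.168.1.2')    # not listed yet, gets appended
second = add_ip_if_new(demo_path, '192.168.1.2')   # now listed, nothing is written
with open(demo_path) as f:
    contents = f.read()
os.remove(demo_path)
print(first, second)
```

The set only pays off if you check many IPs against the same file contents; for a single check, comparing inside the read loop does the same job.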


Edit:

Some excellent books for learning Python: if you're worried about programming being difficult, there's a book geared towards kids that I honestly found both easy to digest and informative: Snake Wrangling for Kids.

Another great resource for learning Python is How to Think Like a Computer Scientist.

There's also the tutorial on the official Python website. It's a little dry compared to the previous ones.

Alan Gauld, one of the foremost contributors to the Python tutor mailing list, has a tutorial that's really good and is also adapted to Python 3. He also includes some other languages for comparison.

If you want a good dead-tree book, I've heard that Core Python Programming by Wesley Chun is a really good resource. He also contributes to the python tutor list every so often.

The tutor list is another good place to learn about python - reading, replying, and asking your own questions. I actually learned most of my python by trying to answer as many of the questions I could. I'd seriously recommend subscribing to the tutor list if you want to learn Python.

Wayne Werner
Using a set instead of a list would make lookups faster.
tgray
Wayne, your code works the way I needed, so THANK YOU!!! But I think I'm giving up and staying with Bash/grep. The box this Python script is going on is running Python 2.4; it's a production box. My machine is running Python 2.6.5. The admin doesn't have plans to upgrade any time soon. SO SORRY I didn't mention this before, didn't think it mattered but it does.
Gary
Ah, yes. At least if you use `with` - I'll put up the similar code for < 2.6
Wayne Werner
and thanks for the "# Gets rid of the newlines at the end" comment cause i wasn't sure what line.strip was for.
Gary
BINGO! That works!!! THANKS!! I understand everything in your code, but I'd like to read more about "iplist.append(line.strip())". Can you give me any key phrases I can google and learn? Know of any good books for (you might be surprised by this) a noob non-programmer? Thanks, Gary
Gary
Considering how you use it, why do you even try to load those into a list or hash or set? If you're going to read the whole file, do the check in the loop!
Nas Banov
One of the best places to look is in the python docs. Take a look at the strings: http://docs.python.org/library/stdtypes.html#string-methods and lists: http://docs.python.org/tutorial/datastructures.html I'd also recommend ipython as a great tool for learning: http://ipython.scipy.org/moin/ I thought I had posted a list of some good Python books, but I can't seem to find the post. I guess I'll just add them to my answer.
Wayne Werner
@EnTerr - That's how I'd do it if I wasn't worried about anything else: `found = False; if newip == line.strip(): found = True; break; if not found: #write line`, but the OP did mention putting the IPs in a list, presumably to do something else with. But I didn't mention that, so good catch.
Wayne Werner
@Wayne Werner: there is more pythonic way to do it if you use `for .. else`, see my code below. +1 for the plentitude of helpful links
Nas Banov
+2  A: 

It's trivial code, but I think it is short and pretty in Python, so here is how I'd write it:

ip = '192.168.1.2'

lookFor = ip + '\n'
f = open('ips.txt', 'a+')
f.seek(0)   # 'a+' may start positioned at end of file on some platforms; rewind before reading
for line in f:
    if line == lookFor:
        print 'found', ip, 'before.'
        break
else:
    print ip, 'not found, adding to file.'
    print >>f, ip
f.close()

It opens the file in append mode, reads it, and if the IP is not found, appends it. That's what else on a for does: it executes if the loop exited normally and not via break. Ta-da!
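For readers new to `for ... else`, here is a small self-contained sketch of just that control flow, using an in-memory list instead of a file. It is written to run on Python 3, and the function name `check` and the sample data are illustrative.

```python
def check(ip, lines):
    look_for = ip + '\n'
    for line in lines:
        if line == look_for:
            result = 'found'
            break              # leaving via break skips the else clause
    else:
        result = 'not found'   # runs only when the loop finished without break
    return result

sample = ['222.111.222.111\n', '222.111.222.112\n']
print(check('222.111.222.112', sample))   # loop exits via break
print(check('192.168.1.2', sample))       # loop runs to completion
print(check('192.168.1.2', []))           # an empty loop also falls through to else
```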

Now, this will be inefficient when you have a lot of IPs. Here is another hack I thought of; it uses one file per IP as a flag:

import os

ip = '192.168.1.2'

fname = ip + '.ip'
if os.access(fname, os.F_OK):
    print 'found', ip, 'before.'
else:
    print ip, 'not found, registering.'
    open(fname, 'w').close()

Why is this fast? Because most file systems these days (except FAT on Windows, but NTFS is OK) organize the list of files in a directory into a B-tree structure, so checking for a file's existence is a fast O(log N) operation instead of an enumeration of the whole list.

(I am not saying this is practical - depends on amount of IPs you expect to see and your sysadmin benevolence.)
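The same flag-file idea can be wrapped in a function and tried out safely in a scratch directory. This is only a sketch under assumptions: the name `seen_before`, the `flag_dir` parameter, and the use of `os.path.exists` in place of `os.access` are choices made here for illustration, and it is written to run on Python 3.

```python
import os
import shutil
import tempfile

def seen_before(ip, flag_dir):
    """Return True if ip was registered earlier; otherwise register it and return False."""
    fname = os.path.join(flag_dir, ip + '.ip')
    if os.path.exists(fname):
        return True
    open(fname, 'w').close()   # create an empty flag file for this IP
    return False

flag_dir = tempfile.mkdtemp()                    # scratch directory for the demo
first = seen_before('192.168.1.2', flag_dir)     # first sighting, gets registered
second = seen_before('192.168.1.2', flag_dir)    # flag file now exists
other = seen_before('192.168.1.22', flag_dir)    # a different IP has its own flag file
shutil.rmtree(flag_dir)                          # clean up the demo directory
print(first, second, other)
```

Note that the check and the file creation are two separate steps, so two processes registering the same IP at the same moment could both report it as new.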

Nas Banov
Great idea with separate files! Just tested with 65K IPs (192.168.*.*). Log parsing takes 9414 us, whereas the file system check takes 50 us (average of 3 tests, EXT4). Nearly 200 times faster! Fetch an upvote (:
hudolejev
A: 

Everyone, thank you for all your help!!! There are so many great answers and comments here that I'm saving this entire page so I can refer back to it as I try the different suggestions. And I'll be bookmarking all the links. I have to start with Snake Wrangling for Kids, and I'll show it to my 6 month old daughter when she is ready. THANK YOU EVERYONE!! Gary

Gary